andykaylor wrote:

Sorry for the delay in getting back to this. The summary of what I am about to 
say is that I'm in agreement with the plan of moving forward with the current 
implementation, but I have some lingering thoughts about it that I want to 
explore.

> As you noticed, we went higher level for unions (kept the other types around) 
> but didn't for bitfields. I don't see it as inconsistent because in my point 
> of view they are orthogonal (even though you can compose them). Bitfields can 
> get tricky, so we decided to go the "fast path" that would minimize 
> differences between OG and CIRGen and be more like LLVM until we get a good 
> use case for a richer representation (so that the design could incorporate 
> real needs). Note that I'm not against improving it, but we just decided not 
> to spend time there with the resources we had.

For the record, I don't actually have a technical use case for representing 
bitfields differently. My "use case" is me, personally, looking at the CIR 
generated for unions and bitfields and trying to understand what it means.

For a union:
```
union U { int a; short b; };
short foo(U u) { return u.b; }
```
```
!rec_U = !cir.record<union "U" {!s32i, !s16i}>
...
%2 = cir.get_member %0[1] {name = "b"}
```

For a bitfield:
```
struct S { int x; unsigned a : 4; unsigned b : 4; };
short foo(S s) { return s.b; }
```
```
  !rec_S = !cir.record<struct "S" {!s32i, !u8i}>
  ...
  %2 = cir.get_member %0[1] {name = "b"}
  %3 = cir.get_bitfield(#bfi_b, %2 : !cir.ptr<!u8i>)
```

I don't like the fact that every member from the source code is represented in 
the record type for the union (even though they are not distinct in the memory 
layout of the type) but not every member from the source code in represented in 
the record type for the bitfield (even though they are, in a sense, distinct in 
the memory layout of the type). This feels incongruous to me. In particular, 
the 'index' operand to `cir.get_member` means something entirely different for 
a union than it does for a struct with bitfields.

To be clear, this is a matter of idealism. There's no doubt that I can write 
code to handle both types, and all the information I want is available. A 
program processing CIR probably won't have a problem with this. It's more of an 
issue for a person reading the CIR in text form and trying to understand it. I 
guess the most I could say is that the difference between them could lead a 
developer to introduce a bug due to their own misunderstanding.

I also don't like the fact that it takes two CIR instructions to retrieve a 
field from the record in the bitfield case. I understand why it's like that, 
but I don't like it.

Perhaps a more practical issue here is when we are resolving the logical intent 
of the source code to ABI-compliant details. I haven't dug into this very 
deeply, so I don't know all the ABI rules for laying out bitfields, but I 
understand that it's not trivial. Ultimately, we want people performing 
transformations on CIR to be able to reason about the logical intent of the 
code without making any changes that will break the ABI-compliant lowering. I 
don't necessarily know what that may mean, but I think we do need to know the 
memory layout of the record in order to optimize bitfield access or even to 
analyze problems related to bitfield access, so I guess some amount of early 
ABI handling is necessary.

https://github.com/llvm/llvm-project/pull/142041
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to