andykaylor wrote: Sorry for the delay in getting back to this. The summary of what I am about to say is that I'm in agreement with the plan of moving forward with the current implementation, but I have some lingering thoughts about it that I want to explore.
> As you noticed, we went higher level for unions (kept the other types around) > but didn't for bitfields. I don't see it as inconsistent because in my point > of view they are orthogonal (even though you can compose them). Bitfields can > get tricky, so we decided to go the "fast path" that would minimize > differences between OG and CIRGen and be more like LLVM until we get a good > use case for a richer representation (so that the design could incorporate > real needs). Note that I'm not against improving it, but we just decided not > to spend time there with the resources we had. For the record, I don't actually have a technical use case for representing bitfields differently. My "use case" is me, personally, looking at the CIR generated for unions and bitfields and trying to understand what it means. For a union: ``` union U { int a; short b; }; short foo(U u) { return u.b; } ``` ``` !rec_U = !cir.record<union "U" {!s32i, !s16i}> ... %2 = cir.get_member %0[1] {name = "b"} ``` For a bitfield: ``` struct S { int x; unsigned a : 4; unsigned b : 4; }; short foo(S s) { return s.b; } ``` ``` !rec_S = !cir.record<struct "S" {!s32i, !u8i}> ... %2 = cir.get_member %0[1] {name = "b"} %3 = cir.get_bitfield(#bfi_b, %2 : !cir.ptr<!u8i>) ``` I don't like the fact that every member from the source code is represented in the record type for the union (even though they are not distinct in the memory layout of the type) but not every member from the source code in represented in the record type for the bitfield (even though they are, in a sense, distinct in the memory layout of the type). This feels incongruous to me. In particular, the 'index' operand to `cir.get_member` means something entirely different for a union than it does for a struct with bitfields. To be clear, this is a matter of idealism. There's no doubt that I can write code to handle both types, and all the information I want is available. A program processing CIR probably won't have a problem with this. It's more of an issue for a person reading the CIR in text form and trying to understand it. I guess the most I could say is that the difference between them could lead a developer to introduce a bug due to their own misunderstanding. I also don't like the fact that it takes two CIR instructions to retrieve a field from the record in the bitfield case. I understand why it's like that, but I don't like it. Perhaps a more practical issue here is when we are resolving the logical intent of the source code to ABI-compliant details. I haven't dug into this very deeply, so I don't know all the ABI rules for laying out bitfields, but I understand that it's not trivial. Ultimately, we want people performing transformations on CIR to be able to reason about the logical intent of the code without making any changes that will break the ABI-compliant lowering. I don't necessarily know what that may mean, but I think we do need to know the memory layout of the record in order to optimize bitfield access or even to analyze problems related to bitfield access, so I guess some amount of early ABI handling is necessary. https://github.com/llvm/llvm-project/pull/142041 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits