https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71509

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Bitfield extraction on ppc64le goes the

      /* Try loading part of OP0 into a register and extracting the
         bitfield from that.  */
      unsigned HOST_WIDE_INT bitpos;
      rtx xop0 = adjust_bit_field_mem_for_reg (pattern, op0, bitsize, bitnum,
                                               0, 0, tmode, &bitpos);

way which ends up generating the DImode load, using the fact that the struct
alignment adds padding after csum_level.  The store path OTOH ends up
honoring the C++ memory model, which says to access the bitfield in its declared
type (IIRC?), and the bit region via DECL_BIT_FIELD_REPRESENTATIVE is of size 8
(because with C++ inheritance that tail padding can be re-used).

It looks like we didn't adjust the bitfield read paths for the mem model
because in practice it doesn't matter and it may generate larger/slower code
not to do loads in larger types on some archs.

This leads to the observed load-store / store-load issues.
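
To make the width mismatch concrete, here is a minimal sketch of the two access
patterns described above.  It is an illustration only, not GCC code: the Region
type, the helper names, and the placement of the bitfield in the low two bits
of the first byte are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte region; the bitfield is assumed to occupy the low
// two bits of the first byte.
struct Region { unsigned char bytes[8]; };

// Read path: a single 64-bit load covering the whole region, comparable to
// the DImode load chosen by the extraction code.
inline std::uint64_t wide_load(const Region &r) {
    std::uint64_t w;
    std::memcpy(&w, r.bytes, sizeof w);
    return w;
}

// Write path: a byte-sized read-modify-write, comparable to an access
// limited to an 8-bit representative.
inline void narrow_store_low2(Region &r, unsigned v) {
    assert(v < 4u);  // only two bits are available
    r.bytes[0] = static_cast<unsigned char>((r.bytes[0] & ~3u) | v);
}
```

A narrow_store_low2 followed directly by a wide_load of the same region is the
byte-store-then-doubleword-load sequence in question.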

Note that we conservatively compute the extent for DECL_BIT_FIELD_REPRESENTATIVE
by preferring smaller modes.  There's some ??? in finish_bitfield_representative,
and the above remark about tail padding re-use is only implemented via preferring
smaller modes.  Thus when adding a 'long foo' after csum_level the
representative doesn't change to 64-bit width but stays at 8 bits (both are valid
under the C++ memory model).

Note that the proposed simple lowering of bitfield accesses on GIMPLE would
do accesses of DECL_BIT_FIELD_REPRESENTATIVE and thus in this case use byte
accesses.

I suppose we want to be less conservative about DECL_BIT_FIELD_REPRESENTATIVE
and leave it up to the target how to do the actual accesses.

Widening the representative generates

__skb_decr_checksum_unnecessary:
        ld 9,8(3)
        addi 10,9,3
        rldicr 9,9,0,61
        rldicl 10,10,0,62
        or 9,9,10
        std 9,8(3)
        blr
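
For readers less used to the rotate-and-mask mnemonics, the listing above can
be read as a single 64-bit read-modify-write.  The sketch below models only the
arithmetic between the ld and the std; the function name is hypothetical, and
the reading that the field occupies the low two bits of the doubleword is
inferred from the two mask instructions.

```cpp
#include <cstdint>

// Models the value computation between `ld 9,8(3)` and `std 9,8(3)`.
constexpr std::uint64_t decr_low2(std::uint64_t w) {
    std::uint64_t sum  = w + 3;       // addi 10,9,3: +3 is -1 modulo 4
    std::uint64_t rest = w & ~3ull;   // rldicr 9,9,0,61: clear the low two bits
    std::uint64_t fld  = sum & 3ull;  // rldicl 10,10,0,62: keep the low two bits
    return rest | fld;                // or 9,9,10
}

// Low field 1 becomes 0; low field 0 wraps to 3; upper bits are untouched.
static_assert(decr_low2(0xF5) == 0xF4, "decrement without wrap");
static_assert(decr_low2(0xF4) == 0xF7, "decrement wraps modulo 4");
```

The load and the store now both use the same 64-bit width.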
