On Thu, Nov 08, 2018 at 03:35:02PM -0800, Dave Hansen wrote:
> On 11/8/18 2:00 PM, Matthew Wilcox wrote:
> > struct a {
> >     char c;
> >     struct b b;
> > };
> > 
> > we want struct b to start at offset 8, but with __packed, it will start
> > at offset 1.
> 
> You're talking about how we want the struct laid out in memory if we
> have control over the layout.  I'm talking about what happens if
> something *else* tells us the layout, like a hardware specification
> which is what is in play with the XSAVE instruction dictated layout
> that's in question here.
> 
> What I'm concerned about is a structure like this:
> 
> struct foo {
>         u32 i1;
>         u64 i2;
> };
> 
> If we leave that to natural alignment, we end up with a 16-byte
> structure laid out like this:
> 
>       0-3     i1
>       3-8     alignment gap
>       8-15    i2

I know you actually meant:

        0-3     i1
        4-7     pad
        8-15    i2

> Which isn't what we want.  We want a 12-byte structure, laid out like this:
> 
>       0-3     i1
>       4-11    i2
> 
> Which we get with:
> 
> struct foo {
>         u32 i1;
>         u64 i2;
> } __packed;

But we _also_ get pessimised accesses to i1 and i2.  Because gcc can't
rely on struct foo being aligned to a 4 or even 8 byte boundary (it
might be embedded in "struct a" from above).

> Now, looking at Yu-cheng's specific example, it doesn't matter.  We've
> got 64-bit types and natural 64-bit alignment.  Without __packed, we
> need to look out for natural alignment screwing us up.  With __packed,
> it just does what it *looks* like it does.

The question is whether Yu-cheng's struct is ever embedded in another
struct.  And if so, what does the hardware do?

Reply via email to