* Dave Hansen <[email protected]> wrote:

> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > +/*
> > + * This is our most modern FPU state format, as saved by the XSAVE
> > + * and restored by the XRSTOR instructions.
> > + *
> > + * It consists of a legacy fxregs portion, an xstate header and
> > + * subsequent fixed size areas as defined by the xstate header.
> > + * Not all CPUs support all the extensions.
> > + */
> >  struct xregs_state {
> >     struct fxregs_state             i387;
> >     struct xstate_header            header;
> > @@ -150,6 +169,13 @@ struct xregs_state {
> >     /* New processor state extensions will go here. */
> >  } __attribute__ ((packed, aligned (64)));
> 
> Fenghua has a "fix" for this, but I think this misses a pretty big point.
> 
> This structure includes only the "legacy" state, followed by the header.
>  The remainder of the layout here is enumerated in CPUID leaves and can
> not be laid out in a structure because we do not know what it looks like
> until we run CPUID.
> 
> There is logically a variable length array at the end of this 
> sucker.

Yes, exactly, that is where we want to go, and this direction is what 
I tried to cover with this bit of the series:

  struct xregs_state {
        struct fxregs_state             i387;
        struct xstate_header            header;
        u8                              __reserved[XSTATE_RESERVE];
  } __attribute__ ((packed, aligned (64)));

Note how it's now opaque after the xstate header, because there's no 
guarantee of what's in that area.
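
To make the "opaque after the header" point concrete, here is a rough 
user-space sketch of how the size of that area could be derived at 
runtime rather than from a C structure. It assumes the standard 
(non-compacted) XSAVE format, where CPUID leaf 0xD, sub-leaf i, 
reports the size of component i in EAX and its offset in EBX. The 
names struct xcomp_info and xstate_area_size() are made up for 
illustration and are not kernel APIs; the caller is assumed to have 
filled the table from CPUID already:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEGACY_AREA_SIZE   512u  /* FXSAVE image: x87 + SSE state */
#define XSTATE_HEADER_SIZE  64u  /* xstate header following it    */

/*
 * Hypothetical per-component record. In a real implementation these
 * values would be read from CPUID leaf 0xD, sub-leaf i
 * (EAX = size, EBX = offset from the start of the XSAVE area).
 */
struct xcomp_info {
        uint32_t offset;
        uint32_t size;
};

/*
 * Return the number of bytes needed for a standard-format XSAVE area
 * holding every feature set in 'mask'. Bits 0 and 1 (x87 and SSE)
 * live in the legacy area, so only bits 2 and up are looked up.
 */
static size_t xstate_area_size(uint64_t mask,
                               const struct xcomp_info *info,
                               unsigned int nr)
{
        size_t end = LEGACY_AREA_SIZE + XSTATE_HEADER_SIZE;
        unsigned int i;

        assert(nr <= 64);

        for (i = 2; i < nr; i++) {
                size_t comp_end;

                if (!(mask & (1ULL << i)))
                        continue;

                comp_end = (size_t)info[i].offset + info[i].size;
                if (comp_end > end)
                        end = comp_end;
        }

        return end;
}
```

The structure definition only ever needs to describe the first 576 
bytes; everything past that is whatever this kind of calculation says 
it is on the running CPU.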

The only 'fixed' aspect of the xstates is the feature bit enumeration:

enum xfeature_bit {
        XSTATE_BIT_FP,
        XSTATE_BIT_SSE,
        XSTATE_BIT_YMM,
        XSTATE_BIT_BNDREGS,
        XSTATE_BIT_BNDCSR,
        XSTATE_BIT_OPMASK,
        XSTATE_BIT_ZMM_Hi256,
        XSTATE_BIT_Hi16_ZMM,

        XFEATURES_NR_MAX,
};

Plus with point #4 of the announcement I wanted to signal that I think 
we should allocate the variable part dynamically:

   4)

   task->thread.fpu->state got embedded again, as 
   task->thread.fpu.state. This eliminated a lot of awkward late 
   dynamic memory allocation of FPU state and the problematic handling 
   of failures.

   Note that while the allocation is static right now, this is a WIP 
   interim state: we can still do dynamic allocation of FPU state, by 
   moving the FPU state last in task_struct and then allocating 
   task_struct accordingly.

I.e. we can put the variable size state array at the end of 
task_struct, make the size of task_struct a per-boot variable, and 
still have essentially a single allocation per task for all 
fundamental task state.
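
As a rough illustration of that idea (a user-space sketch only, not 
what the series does today), the pattern could look like the 
following. struct demo_task, demo_task_init_size() and 
demo_task_alloc() are hypothetical names; a real implementation would 
also have to guarantee the 64-byte alignment that XSAVE requires, 
which this sketch does not attempt:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct. */
struct demo_task {
        int     pid;
        /* ... other fixed-size task state would go here ... */

        /* Variable-size FPU state: must stay the last member. */
        uint8_t fpu_state[];
};

/* Computed once at "boot", then constant for the lifetime of the system. */
static size_t demo_task_size;

static void demo_task_init_size(size_t fpu_state_size)
{
        demo_task_size = sizeof(struct demo_task) + fpu_state_size;
}

/* One allocation covers both the fixed part and the FPU state. */
static struct demo_task *demo_task_alloc(int pid)
{
        struct demo_task *t;

        /* The size must have been set up before the first allocation. */
        assert(demo_task_size >= sizeof(struct demo_task));

        t = calloc(1, demo_task_size);
        if (t)
                t->pid = pid;

        return t;
}
```

The point is that there is still exactly one allocation per task and 
one failure path, while the size is no longer a compile-time constant.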

But I first wanted to see people test this series - it's ambitious 
enough as-is already!

Thanks,

        Ingo