Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* H. Peter Anvin  wrote:

> On 07/16/2015 12:14 PM, Dave Hansen wrote:
> > The FPU rewrite removed the dynamic allocations of 'struct fpu'.
> > But, this potentially wastes massive amounts of memory (2k per
> > task on systems that do not have AVX-512 for instance).
> > 
> > Instead of having a separate slab, this patch just appends the
> > space that we need to the 'task_struct' which we dynamically
> > allocate already.  This saves from doing an extra slab allocation
> > at fork().  The only real downside here is that we have to stick
> > everything and the end of the task_struct.  But, I think the
> > BUILD_BUG_ON()s I stuck in there should keep that from being too
> > fragile.
> > 
> > This survives a quick build and boot in a VM.  Does anyone see any
> > real downsides to this?
> 
> No.  I have also long advocated for merging task_struct and thread_info into 
> a 
> common structure and get it off the stack; it would improve security and 
> avoid 
> weird corner cases in the irqstack handling.

Note that we have 3 related 'task state' data structures with overlapping 
purpose:

  task_struct
   thread_struct
  thread_info

where thread_struct is embedded in task_struct currently.

So to turn it all into a single structure we'd have to merge thread_info into 
thread_struct. thread_info was put on the kernel stack due to the ESP trick we 
played long ago - but that is moot these days.

So I think what we want is not some common structure, but to actually merge all 
of 
thread_info into thread_struct for arch details and into task_struct for 
generic 
fields, and only have:

  task_struct/* generic fields */
   thread_struct /* arch details */

this can be done gradually, field by field, and in the end thread_info can be 
eliminated altogether.

The only real complication is that it affects every architecture. The good news 
is 
that most of the thread_info layout details are wrapped in various constructs 
like 
test_ti_thread_flag() and task_thread_info().

While at it we might as well rename 'thread_struct' to 'arch_task_struct':

  task_struct  /* generic fields */
  arch_task_struct /* arch details */

to make it really clear and easy to understand at a glance - as the current 
naming 
is has become ambiguous and slightly confusing the moment we introduced 
threading.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* Dave Hansen  wrote:

> >>  #endif /* _ASM_X86_FPU_H */
> >> diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 
> >> arch/x86/kernel/process.c
> >> --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 
> >> 10:50:42.360571875 -0700
> >> +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700
> >> @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist
> >>   */
> >>  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
> >>  {
> >> -   *dst = *src;
> >> +   memcpy(dst, src, arch_task_struct_size());
> > 
> > This is actually vaguely performance-critical, which makes me thing
> > that using some kind of inline or other real way (config macro, ifdef,
> > etc) to detect whether there's an arch override would be better than a
> > weak function.
> 
> Fair enough.  I'll send out another version in a bit if there are no more 
> comments.

Beyond making it a build time switch for other architectures, I'd also suggest 
introducing a __read_mostly task_struct_size variable on x86, so that we can 
write:

memcpy(dst, src, task_struct_size);

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote:
> > +++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700
> > @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
> >  unsigned int xstate_size;
> >  EXPORT_SYMBOL_GPL(xstate_size);
> >  
> > +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
> > +   BUILD_BUG_ON((sizeof(TYPE) -\
> > +   offsetof(TYPE, MEMBER) -\
> > +   sizeof(((TYPE *)0)->MEMBER)) >  \
> > +   0)  \
> > +
> > +/*
> > + * We append the 'struct fpu' to the task_struct.
> > + */
> > +int __weak arch_task_struct_size(void)
> > +{
> > +   int task_size = sizeof(struct task_struct);
> > +
> > +   /*
> > +* Subtract off the static size of the register state.
> > +* It potentially has a bunch of padding.
> > +*/
> > +   task_size -= sizeof(((struct task_struct *)0)->thread.fpu.state);
> > +
> > +   /*
> > +* Add back the dynamically-calculated register state
> > +* size.
> > +*/
> > +   task_size += xstate_size;
> > +
> > +   /*
> > +* We dynamically size 'struct fpu', so we require that
> > +* it be at the end of 'thread_struct' and that
> > +* 'thread_struct' be at the end of 'task_struct'.  If
> > +* you hit a compile error here, check the structure to
> > +* see if something got added to the end.
> > +*/
> > +   CHECK_MEMBER_AT_END_OF(struct fpu, state);
> > +   CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu);
> > +   CHECK_MEMBER_AT_END_OF(struct task_struct, thread);
> > +
> > +   return task_size;
> > +}
> 
> Since you want these invariants true at all times, maybe put the
> BUILD_BUG_ON() in generic code instead of x86 specific? That way people
> poking at other archs are less likely to accidentally break your stuff.

Yeah.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* Dave Hansen  wrote:

> The FPU rewrite removed the dynamic allocations of 'struct fpu'.
> But, this potentially wastes massive amounts of memory (2k per
> task on systems that do not have AVX-512 for instance).
> 
> Instead of having a separate slab, this patch just appends the
> space that we need to the 'task_struct' which we dynamically
> allocate already.  This saves from doing an extra slab allocation
> at fork().  The only real downside here is that we have to stick
> everything and the end of the task_struct.  But, I think the
> BUILD_BUG_ON()s I stuck in there should keep that from being too
> fragile.
> 
> This survives a quick build and boot in a VM.  Does anyone see any
> real downsides to this?

So considering the complexity of the other patch that makes the static 
allocation, 
I'd massively prefer this patch as it solves the real bug.

It should also work on future hardware a lot better.

This was the dynamic approach I suggested in our discussion of the big FPU code 
rework.

> --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu  
> 2015-07-16 10:50:42.355571648 -0700
> +++ b/arch/x86/kernel/fpu/init.c  2015-07-16 12:02:15.284280976 -0700
> @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
>  unsigned int xstate_size;
>  EXPORT_SYMBOL_GPL(xstate_size);
>  
> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \
> + BUILD_BUG_ON((sizeof(TYPE) -\
> + offsetof(TYPE, MEMBER) -\
> + sizeof(((TYPE *)0)->MEMBER)) >  \
> + 0)  \
> +
> +/*
> + * We append the 'struct fpu' to the task_struct.
> + */
> +int __weak arch_task_struct_size(void)

This should not be __weak, otherwise we risk getting the generic version:

> --- a/kernel/fork.c~dynamically-allocate-struct-fpu   2015-07-16 
> 10:50:42.357571739 -0700
> +++ b/kernel/fork.c   2015-07-16 11:25:53.873498188 -0700
> @@ -287,15 +287,21 @@ static void set_max_threads(unsigned int
>   max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS);
>  }
>  
> +int __weak arch_task_struct_size(void)
> +{
> + return sizeof(struct task_struct);
> +}
> +

Your system probably worked due to link order preferring the x86 version but 
I'm 
not sure.

Other than this bug it looks good to me in principle.

Lemme check it on various hardware.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote:
  +++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700
  @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
   unsigned int xstate_size;
   EXPORT_SYMBOL_GPL(xstate_size);
   
  +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
  +   BUILD_BUG_ON((sizeof(TYPE) -\
  +   offsetof(TYPE, MEMBER) -\
  +   sizeof(((TYPE *)0)-MEMBER))   \
  +   0)  \
  +
  +/*
  + * We append the 'struct fpu' to the task_struct.
  + */
  +int __weak arch_task_struct_size(void)
  +{
  +   int task_size = sizeof(struct task_struct);
  +
  +   /*
  +* Subtract off the static size of the register state.
  +* It potentially has a bunch of padding.
  +*/
  +   task_size -= sizeof(((struct task_struct *)0)-thread.fpu.state);
  +
  +   /*
  +* Add back the dynamically-calculated register state
  +* size.
  +*/
  +   task_size += xstate_size;
  +
  +   /*
  +* We dynamically size 'struct fpu', so we require that
  +* it be at the end of 'thread_struct' and that
  +* 'thread_struct' be at the end of 'task_struct'.  If
  +* you hit a compile error here, check the structure to
  +* see if something got added to the end.
  +*/
  +   CHECK_MEMBER_AT_END_OF(struct fpu, state);
  +   CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu);
  +   CHECK_MEMBER_AT_END_OF(struct task_struct, thread);
  +
  +   return task_size;
  +}
 
 Since you want these invariants true at all times, maybe put the
 BUILD_BUG_ON() in generic code instead of x86 specific? That way people
 poking at other archs are less likely to accidentally break your stuff.

Yeah.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* Dave Hansen d...@sr71.net wrote:

   #endif /* _ASM_X86_FPU_H */
  diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 
  arch/x86/kernel/process.c
  --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 
  10:50:42.360571875 -0700
  +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700
  @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist
*/
   int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
   {
  -   *dst = *src;
  +   memcpy(dst, src, arch_task_struct_size());
  
  This is actually vaguely performance-critical, which makes me thing
  that using some kind of inline or other real way (config macro, ifdef,
  etc) to detect whether there's an arch override would be better than a
  weak function.
 
 Fair enough.  I'll send out another version in a bit if there are no more 
 comments.

Beyond making it a build time switch for other architectures, I'd also suggest 
introducing a __read_mostly task_struct_size variable on x86, so that we can 
write:

memcpy(dst, src, task_struct_size);

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* Dave Hansen d...@sr71.net wrote:

 The FPU rewrite removed the dynamic allocations of 'struct fpu'.
 But, this potentially wastes massive amounts of memory (2k per
 task on systems that do not have AVX-512 for instance).
 
 Instead of having a separate slab, this patch just appends the
 space that we need to the 'task_struct' which we dynamically
 allocate already.  This saves from doing an extra slab allocation
 at fork().  The only real downside here is that we have to stick
 everything and the end of the task_struct.  But, I think the
 BUILD_BUG_ON()s I stuck in there should keep that from being too
 fragile.
 
 This survives a quick build and boot in a VM.  Does anyone see any
 real downsides to this?

So considering the complexity of the other patch that makes the static 
allocation, 
I'd massively prefer this patch as it solves the real bug.

It should also work on future hardware a lot better.

This was the dynamic approach I suggested in our discussion of the big FPU code 
rework.

 --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu  
 2015-07-16 10:50:42.355571648 -0700
 +++ b/arch/x86/kernel/fpu/init.c  2015-07-16 12:02:15.284280976 -0700
 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
  unsigned int xstate_size;
  EXPORT_SYMBOL_GPL(xstate_size);
  
 +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \
 + BUILD_BUG_ON((sizeof(TYPE) -\
 + offsetof(TYPE, MEMBER) -\
 + sizeof(((TYPE *)0)-MEMBER))   \
 + 0)  \
 +
 +/*
 + * We append the 'struct fpu' to the task_struct.
 + */
 +int __weak arch_task_struct_size(void)

This should not be __weak, otherwise we risk getting the generic version:

 --- a/kernel/fork.c~dynamically-allocate-struct-fpu   2015-07-16 
 10:50:42.357571739 -0700
 +++ b/kernel/fork.c   2015-07-16 11:25:53.873498188 -0700
 @@ -287,15 +287,21 @@ static void set_max_threads(unsigned int
   max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS);
  }
  
 +int __weak arch_task_struct_size(void)
 +{
 + return sizeof(struct task_struct);
 +}
 +

Your system probably worked due to link order preferring the x86 version but 
I'm 
not sure.

Other than this bug it looks good to me in principle.

Lemme check it on various hardware.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-17 Thread Ingo Molnar

* H. Peter Anvin h...@zytor.com wrote:

 On 07/16/2015 12:14 PM, Dave Hansen wrote:
  The FPU rewrite removed the dynamic allocations of 'struct fpu'.
  But, this potentially wastes massive amounts of memory (2k per
  task on systems that do not have AVX-512 for instance).
  
  Instead of having a separate slab, this patch just appends the
  space that we need to the 'task_struct' which we dynamically
  allocate already.  This saves from doing an extra slab allocation
  at fork().  The only real downside here is that we have to stick
  everything and the end of the task_struct.  But, I think the
  BUILD_BUG_ON()s I stuck in there should keep that from being too
  fragile.
  
  This survives a quick build and boot in a VM.  Does anyone see any
  real downsides to this?
 
 No.  I have also long advocated for merging task_struct and thread_info into 
 a 
 common structure and get it off the stack; it would improve security and 
 avoid 
 weird corner cases in the irqstack handling.

Note that we have 3 related 'task state' data structures with overlapping 
purpose:

  task_struct
   thread_struct
  thread_info

where thread_struct is embedded in task_struct currently.

So to turn it all into a single structure we'd have to merge thread_info into 
thread_struct. thread_info was put on the kernel stack due to the ESP trick we 
played long ago - but that is moot these days.

So I think what we want is not some common structure, but to actually merge all 
of 
thread_info into thread_struct for arch details and into task_struct for 
generic 
fields, and only have:

  task_struct/* generic fields */
   thread_struct /* arch details */

this can be done gradually, field by field, and in the end thread_info can be 
eliminated altogether.

The only real complication is that it affects every architecture. The good news 
is 
that most of the thread_info layout details are wrapped in various constructs 
like 
test_ti_thread_flag() and task_thread_info().

While at it we might as well rename 'thread_struct' to 'arch_task_struct':

  task_struct  /* generic fields */
  arch_task_struct /* arch details */

to make it really clear and easy to understand at a glance - as the current 
naming 
is has become ambiguous and slightly confusing the moment we introduced 
threading.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Andy Lutomirski
On Thu, Jul 16, 2015 at 3:42 PM, H. Peter Anvin  wrote:
> On 07/16/2015 12:14 PM, Dave Hansen wrote:
>> The FPU rewrite removed the dynamic allocations of 'struct fpu'.
>> But, this potentially wastes massive amounts of memory (2k per
>> task on systems that do not have AVX-512 for instance).
>>
>> Instead of having a separate slab, this patch just appends the
>> space that we need to the 'task_struct' which we dynamically
>> allocate already.  This saves from doing an extra slab allocation
>> at fork().  The only real downside here is that we have to stick
>> everything and the end of the task_struct.  But, I think the
>> BUILD_BUG_ON()s I stuck in there should keep that from being too
>> fragile.
>>
>> This survives a quick build and boot in a VM.  Does anyone see any
>> real downsides to this?
>>
>
> No.  I have also long advocated for merging task_struct and thread_info
> into a common structure and get it off the stack; it would improve
> security and avoid weird corner cases in the irqstack handling.
>

In tip/x86/asm, entry_64.S only references thread_info in two places,
both in the syscall code.  I'm hoping that we can move that code into
C soon and that we can do the same thing to the 32-bit code.  If we do
that, then we could just move ti->flags into thread_struct.
Everything else should be easy, and then thread_info would be empty.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread H. Peter Anvin
On 07/16/2015 12:14 PM, Dave Hansen wrote:
> The FPU rewrite removed the dynamic allocations of 'struct fpu'.
> But, this potentially wastes massive amounts of memory (2k per
> task on systems that do not have AVX-512 for instance).
> 
> Instead of having a separate slab, this patch just appends the
> space that we need to the 'task_struct' which we dynamically
> allocate already.  This saves from doing an extra slab allocation
> at fork().  The only real downside here is that we have to stick
> everything and the end of the task_struct.  But, I think the
> BUILD_BUG_ON()s I stuck in there should keep that from being too
> fragile.
> 
> This survives a quick build and boot in a VM.  Does anyone see any
> real downsides to this?
> 

No.  I have also long advocated for merging task_struct and thread_info
into a common structure and get it off the stack; it would improve
security and avoid weird corner cases in the irqstack handling.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Peter Zijlstra
On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote:
> +++ b/arch/x86/kernel/fpu/init.c  2015-07-16 12:02:15.284280976 -0700
> @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
>  unsigned int xstate_size;
>  EXPORT_SYMBOL_GPL(xstate_size);
>  
> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \
> + BUILD_BUG_ON((sizeof(TYPE) -\
> + offsetof(TYPE, MEMBER) -\
> + sizeof(((TYPE *)0)->MEMBER)) >  \
> + 0)  \
> +
> +/*
> + * We append the 'struct fpu' to the task_struct.
> + */
> +int __weak arch_task_struct_size(void)
> +{
> + int task_size = sizeof(struct task_struct);
> +
> + /*
> +  * Subtract off the static size of the register state.
> +  * It potentially has a bunch of padding.
> +  */
> + task_size -= sizeof(((struct task_struct *)0)->thread.fpu.state);
> +
> + /*
> +  * Add back the dynamically-calculated register state
> +  * size.
> +  */
> + task_size += xstate_size;
> +
> + /*
> +  * We dynamically size 'struct fpu', so we require that
> +  * it be at the end of 'thread_struct' and that
> +  * 'thread_struct' be at the end of 'task_struct'.  If
> +  * you hit a compile error here, check the structure to
> +  * see if something got added to the end.
> +  */
> + CHECK_MEMBER_AT_END_OF(struct fpu, state);
> + CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu);
> + CHECK_MEMBER_AT_END_OF(struct task_struct, thread);
> +
> + return task_size;
> +}

Since you want these invariants true at all times, maybe put the
BUILD_BUG_ON() in generic code instead of x86 specific? That way people
poking at other archs are less likely to accidentally break your stuff.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Dave Hansen
On 07/16/2015 12:25 PM, Andy Lutomirski wrote:
> On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen  wrote:
>> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
>> +   BUILD_BUG_ON((sizeof(TYPE) -\
>> +   offsetof(TYPE, MEMBER) -\
>> +   sizeof(((TYPE *)0)->MEMBER)) >  \
>> +   0)  \
>> +
> 
> You could save a bit of typing by using offsetofend here.  Something
> along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE,
> MEMBER));

Good point.

>>  #endif /* _ASM_X86_FPU_H */
>> diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 
>> arch/x86/kernel/process.c
>> --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 
>> 10:50:42.360571875 -0700
>> +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700
>> @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist
>>   */
>>  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
>>  {
>> -   *dst = *src;
>> +   memcpy(dst, src, arch_task_struct_size());
> 
> This is actually vaguely performance-critical, which makes me thing
> that using some kind of inline or other real way (config macro, ifdef,
> etc) to detect whether there's an arch override would be better than a
> weak function.

Fair enough.  I'll send out another version in a bit if there are no
more comments.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Andy Lutomirski
On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen  wrote:
>
> The FPU rewrite removed the dynamic allocations of 'struct fpu'.
> But, this potentially wastes massive amounts of memory (2k per
> task on systems that do not have AVX-512 for instance).
>
> Instead of having a separate slab, this patch just appends the
> space that we need to the 'task_struct' which we dynamically
> allocate already.  This saves from doing an extra slab allocation
> at fork().  The only real downside here is that we have to stick
> everything and the end of the task_struct.  But, I think the
> BUILD_BUG_ON()s I stuck in there should keep that from being too
> fragile.
>
> This survives a quick build and boot in a VM.  Does anyone see any
> real downsides to this?

Looks generally sensible.  Minor nitpicking below.

> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
> +   BUILD_BUG_ON((sizeof(TYPE) -\
> +   offsetof(TYPE, MEMBER) -\
> +   sizeof(((TYPE *)0)->MEMBER)) >  \
> +   0)  \
> +

You could save a bit of typing by using offsetofend here.  Something
along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE,
MEMBER));

>  #endif /* _ASM_X86_FPU_H */
> diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 
> arch/x86/kernel/process.c
> --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 
> 10:50:42.360571875 -0700
> +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700
> @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist
>   */
>  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
>  {
> -   *dst = *src;
> +   memcpy(dst, src, arch_task_struct_size());

This is actually vaguely performance-critical, which makes me thing
that using some kind of inline or other real way (config macro, ifdef,
etc) to detect whether there's an arch override would be better than a
weak function.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Dave Hansen

The FPU rewrite removed the dynamic allocations of 'struct fpu'.
But, this potentially wastes massive amounts of memory (2k per
task on systems that do not have AVX-512 for instance).

Instead of having a separate slab, this patch just appends the
space that we need to the 'task_struct' which we dynamically
allocate already.  This saves from doing an extra slab allocation
at fork().  The only real downside here is that we have to stick
everything and the end of the task_struct.  But, I think the
BUILD_BUG_ON()s I stuck in there should keep that from being too
fragile.

This survives a quick build and boot in a VM.  Does anyone see any
real downsides to this?

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: Peter Zijlstra 
Cc: Borislav Petkov 
Cc: Andy Lutomirski 
Cc: Linus Torvalds 
Cc: linux-kernel@vger.kernel.org

---

 b/arch/x86/include/asm/fpu/types.h |   72 +++--
 b/arch/x86/include/asm/processor.h |   10 +++--
 b/arch/x86/kernel/fpu/init.c   |   39 
 b/arch/x86/kernel/process.c|2 -
 b/fs/proc/kcore.c  |4 +-
 b/include/linux/sched.h|   12 +-
 b/kernel/fork.c|8 +++-
 7 files changed, 104 insertions(+), 43 deletions(-)

diff -puN arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu 
arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu  
2015-07-16 10:50:42.340570967 -0700
+++ b/arch/x86/include/asm/processor.h  2015-07-16 11:37:19.182539322 -0700
@@ -390,9 +390,6 @@ struct thread_struct {
 #endif
unsigned long   gs;
 
-   /* Floating point and extended processor state */
-   struct fpu  fpu;
-
/* Save middle states of ptrace breakpoints */
struct perf_event   *ptrace_bps[HBP_NUM];
/* Debug status used for traps, single steps, etc... */
@@ -418,6 +415,13 @@ struct thread_struct {
unsigned long   iopl;
/* Max allowed port in the bitmap, in bytes: */
unsignedio_bitmap_max;
+
+   /* Floating point and extended processor state */
+   struct fpu  fpu;
+   /*
+* WARNING: 'fpu' is dynamically-sized.  It *MUST* be at
+* the end.
+*/
 };
 
 /*
diff -puN include/linux/sched.h~dynamically-allocate-struct-fpu 
include/linux/sched.h
--- a/include/linux/sched.h~dynamically-allocate-struct-fpu 2015-07-16 
10:50:42.342571058 -0700
+++ b/include/linux/sched.h 2015-07-16 11:44:10.775174149 -0700
@@ -1522,8 +1522,6 @@ struct task_struct {
 /* hung task detection */
unsigned long last_switch_count;
 #endif
-/* CPU-specific state of this task */
-   struct thread_struct thread;
 /* filesystem information */
struct fs_struct *fs;
 /* open file information */
@@ -1778,8 +1776,18 @@ struct task_struct {
unsigned long   task_state_change;
 #endif
int pagefault_disabled;
+/* CPU-specific state of this task */
+   struct thread_struct thread;
+/*
+ * WARNING: on x86, 'thread_struct' contains a variable-sized
+ * structure.  It *MUST* be at the end of 'task_struct'.
+ *
+ * Do not put anything below here!
+ */
 };
 
+extern int arch_task_struct_size(void);
+
 /* Future-safe accessor for struct task_struct's cpus_allowed. */
 #define tsk_cpus_allowed(tsk) (&(tsk)->cpus_allowed)
 
diff -puN arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu 
arch/x86/kernel/fpu/init.c
--- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu
2015-07-16 10:50:42.355571648 -0700
+++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700
@@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
 unsigned int xstate_size;
 EXPORT_SYMBOL_GPL(xstate_size);
 
+#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
+   BUILD_BUG_ON((sizeof(TYPE) -\
+   offsetof(TYPE, MEMBER) -\
+   sizeof(((TYPE *)0)->MEMBER)) >  \
+   0)  \
+
+/*
+ * We append the 'struct fpu' to the task_struct.
+ */
+int __weak arch_task_struct_size(void)
+{
+   int task_size = sizeof(struct task_struct);
+
+   /*
+* Subtract off the static size of the register state.
+* It potentially has a bunch of padding.
+*/
+   task_size -= sizeof(((struct task_struct *)0)->thread.fpu.state);
+
+   /*
+* Add back the dynamically-calculated register state
+* size.
+*/
+   task_size += xstate_size;
+
+   /*
+* We dynamically size 'struct fpu', so we require that
+* it be at the end of 'thread_struct' and that
+* 'thread_struct' be at the end of 'task_struct'.  If
+* you hit a compile error here, check the structure to
+* see if something got added to the end.
+*/
+   

Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Andy Lutomirski
On Thu, Jul 16, 2015 at 3:42 PM, H. Peter Anvin h...@zytor.com wrote:
 On 07/16/2015 12:14 PM, Dave Hansen wrote:
 The FPU rewrite removed the dynamic allocations of 'struct fpu'.
 But, this potentially wastes massive amounts of memory (2k per
 task on systems that do not have AVX-512 for instance).

 Instead of having a separate slab, this patch just appends the
 space that we need to the 'task_struct' which we dynamically
 allocate already.  This saves from doing an extra slab allocation
 at fork().  The only real downside here is that we have to stick
 everything and the end of the task_struct.  But, I think the
 BUILD_BUG_ON()s I stuck in there should keep that from being too
 fragile.

 This survives a quick build and boot in a VM.  Does anyone see any
 real downsides to this?


 No.  I have also long advocated for merging task_struct and thread_info
 into a common structure and get it off the stack; it would improve
 security and avoid weird corner cases in the irqstack handling.


In tip/x86/asm, entry_64.S only references thread_info in two places,
both in the syscall code.  I'm hoping that we can move that code into
C soon and that we can do the same thing to the 32-bit code.  If we do
that, then we could just move ti-flags into thread_struct.
Everything else should be easy, and then thread_info would be empty.

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Dave Hansen
On 07/16/2015 12:25 PM, Andy Lutomirski wrote:
 On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen d...@sr71.net wrote:
 +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
 +   BUILD_BUG_ON((sizeof(TYPE) -\
 +   offsetof(TYPE, MEMBER) -\
 +   sizeof(((TYPE *)0)-MEMBER))   \
 +   0)  \
 +
 
 You could save a bit of typing by using offsetofend here.  Something
 along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE,
 MEMBER));

Good point.

  #endif /* _ASM_X86_FPU_H */
 diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 
 arch/x86/kernel/process.c
 --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 
 10:50:42.360571875 -0700
 +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700
 @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist
   */
  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
  {
 -   *dst = *src;
 +   memcpy(dst, src, arch_task_struct_size());
 
 This is actually vaguely performance-critical, which makes me thing
 that using some kind of inline or other real way (config macro, ifdef,
 etc) to detect whether there's an arch override would be better than a
 weak function.

Fair enough.  I'll send out another version in a bit if there are no
more comments.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Dave Hansen

The FPU rewrite removed the dynamic allocations of 'struct fpu'.
But, this potentially wastes massive amounts of memory (2k per
task on systems that do not have AVX-512 for instance).

Instead of having a separate slab, this patch just appends the
space that we need to the 'task_struct' which we dynamically
allocate already.  This saves from doing an extra slab allocation
at fork().  The only real downside here is that we have to stick
everything and the end of the task_struct.  But, I think the
BUILD_BUG_ON()s I stuck in there should keep that from being too
fragile.

This survives a quick build and boot in a VM.  Does anyone see any
real downsides to this?

Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
Cc: x...@kernel.org
Cc: Peter Zijlstra pet...@infradead.org
Cc: Borislav Petkov b...@alien8.de
Cc: Andy Lutomirski l...@amacapital.net
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: linux-kernel@vger.kernel.org

---

 b/arch/x86/include/asm/fpu/types.h |   72 +++--
 b/arch/x86/include/asm/processor.h |   10 +++--
 b/arch/x86/kernel/fpu/init.c   |   39 
 b/arch/x86/kernel/process.c|2 -
 b/fs/proc/kcore.c  |4 +-
 b/include/linux/sched.h|   12 +-
 b/kernel/fork.c|8 +++-
 7 files changed, 104 insertions(+), 43 deletions(-)

diff -puN arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu 
arch/x86/include/asm/processor.h
--- a/arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu  
2015-07-16 10:50:42.340570967 -0700
+++ b/arch/x86/include/asm/processor.h  2015-07-16 11:37:19.182539322 -0700
@@ -390,9 +390,6 @@ struct thread_struct {
 #endif
unsigned long   gs;
 
-   /* Floating point and extended processor state */
-   struct fpu  fpu;
-
/* Save middle states of ptrace breakpoints */
struct perf_event   *ptrace_bps[HBP_NUM];
/* Debug status used for traps, single steps, etc... */
@@ -418,6 +415,13 @@ struct thread_struct {
unsigned long   iopl;
/* Max allowed port in the bitmap, in bytes: */
unsignedio_bitmap_max;
+
+   /* Floating point and extended processor state */
+   struct fpu  fpu;
+   /*
+* WARNING: 'fpu' is dynamically-sized.  It *MUST* be at
+* the end.
+*/
 };
 
 /*
diff -puN include/linux/sched.h~dynamically-allocate-struct-fpu 
include/linux/sched.h
--- a/include/linux/sched.h~dynamically-allocate-struct-fpu 2015-07-16 
10:50:42.342571058 -0700
+++ b/include/linux/sched.h 2015-07-16 11:44:10.775174149 -0700
@@ -1522,8 +1522,6 @@ struct task_struct {
 /* hung task detection */
unsigned long last_switch_count;
 #endif
-/* CPU-specific state of this task */
-   struct thread_struct thread;
 /* filesystem information */
struct fs_struct *fs;
 /* open file information */
@@ -1778,8 +1776,18 @@ struct task_struct {
unsigned long   task_state_change;
 #endif
int pagefault_disabled;
+/* CPU-specific state of this task */
+   struct thread_struct thread;
+/*
+ * WARNING: on x86, 'thread_struct' contains a variable-sized
+ * structure.  It *MUST* be at the end of 'task_struct'.
+ *
+ * Do not put anything below here!
+ */
 };
 
+extern int arch_task_struct_size(void);
+
 /* Future-safe accessor for struct task_struct's cpus_allowed. */
 #define tsk_cpus_allowed(tsk) ((tsk)-cpus_allowed)
 
diff -puN arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu 
arch/x86/kernel/fpu/init.c
--- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu
2015-07-16 10:50:42.355571648 -0700
+++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700
@@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
 unsigned int xstate_size;
 EXPORT_SYMBOL_GPL(xstate_size);
 
+#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
+   BUILD_BUG_ON((sizeof(TYPE) -\
+   offsetof(TYPE, MEMBER) -\
+   sizeof(((TYPE *)0)-MEMBER))   \
+   0)  \
+
+/*
+ * We append the 'struct fpu' to the task_struct.
+ */
+int __weak arch_task_struct_size(void)
+{
+   int task_size = sizeof(struct task_struct);
+
+   /*
+* Subtract off the static size of the register state.
+* It potentially has a bunch of padding.
+*/
+   task_size -= sizeof(((struct task_struct *)0)-thread.fpu.state);
+
+   /*
+* Add back the dynamically-calculated register state
+* size.
+*/
+   task_size += xstate_size;
+
+   /*
+* We dynamically size 'struct fpu', so we require that
+* it be at the end of 'thread_struct' and that
+* 'thread_struct' be at the end of 'task_struct'.  If
+* you hit a compile error 

Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Andy Lutomirski
On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen d...@sr71.net wrote:

 The FPU rewrite removed the dynamic allocations of 'struct fpu'.
 But, this potentially wastes massive amounts of memory (2k per
 task on systems that do not have AVX-512 for instance).

 Instead of having a separate slab, this patch just appends the
 space that we need to the 'task_struct' which we dynamically
 allocate already.  This saves from doing an extra slab allocation
 at fork().  The only real downside here is that we have to stick
 everything and the end of the task_struct.  But, I think the
 BUILD_BUG_ON()s I stuck in there should keep that from being too
 fragile.

 This survives a quick build and boot in a VM.  Does anyone see any
 real downsides to this?

Looks generally sensible.  Minor nitpicking below.

 +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)   \
 +   BUILD_BUG_ON((sizeof(TYPE) -\
 +   offsetof(TYPE, MEMBER) -\
 +   sizeof(((TYPE *)0)-MEMBER))   \
 +   0)  \
 +

You could save a bit of typing by using offsetofend here.  Something
along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE,
MEMBER));

  #endif /* _ASM_X86_FPU_H */
 diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 
 arch/x86/kernel/process.c
 --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 
 10:50:42.360571875 -0700
 +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700
 @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist
   */
  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
  {
 -   *dst = *src;
 +   memcpy(dst, src, arch_task_struct_size());

This is actually vaguely performance-critical, which makes me thing
that using some kind of inline or other real way (config macro, ifdef,
etc) to detect whether there's an arch override would be better than a
weak function.

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread Peter Zijlstra
On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote:
 +++ b/arch/x86/kernel/fpu/init.c  2015-07-16 12:02:15.284280976 -0700
 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
  unsigned int xstate_size;
  EXPORT_SYMBOL_GPL(xstate_size);
  
 +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \
 + BUILD_BUG_ON((sizeof(TYPE) -\
 + offsetof(TYPE, MEMBER) -\
 + sizeof(((TYPE *)0)-MEMBER))   \
 + 0)  \
 +
 +/*
 + * We append the 'struct fpu' to the task_struct.
 + */
 +int __weak arch_task_struct_size(void)
 +{
 + int task_size = sizeof(struct task_struct);
 +
 + /*
 +  * Subtract off the static size of the register state.
 +  * It potentially has a bunch of padding.
 +  */
 + task_size -= sizeof(((struct task_struct *)0)-thread.fpu.state);
 +
 + /*
 +  * Add back the dynamically-calculated register state
 +  * size.
 +  */
 + task_size += xstate_size;
 +
 + /*
 +  * We dynamically size 'struct fpu', so we require that
 +  * it be at the end of 'thread_struct' and that
 +  * 'thread_struct' be at the end of 'task_struct'.  If
 +  * you hit a compile error here, check the structure to
 +  * see if something got added to the end.
 +  */
 + CHECK_MEMBER_AT_END_OF(struct fpu, state);
 + CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu);
 + CHECK_MEMBER_AT_END_OF(struct task_struct, thread);
 +
 + return task_size;
 +}

Since you want these invariants true at all times, maybe put the
BUILD_BUG_ON() in generic code instead of x86 specific? That way people
poking at other archs are less likely to accidentally break your stuff.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'

2015-07-16 Thread H. Peter Anvin
On 07/16/2015 12:14 PM, Dave Hansen wrote:
 The FPU rewrite removed the dynamic allocations of 'struct fpu'.
 But, this potentially wastes massive amounts of memory (2k per
 task on systems that do not have AVX-512 for instance).
 
 Instead of having a separate slab, this patch just appends the
 space that we need to the 'task_struct' which we dynamically
 allocate already.  This saves from doing an extra slab allocation
 at fork().  The only real downside here is that we have to stick
 everything and the end of the task_struct.  But, I think the
 BUILD_BUG_ON()s I stuck in there should keep that from being too
 fragile.
 
 This survives a quick build and boot in a VM.  Does anyone see any
 real downsides to this?
 

No.  I have also long advocated for merging task_struct and thread_info
into a common structure and get it off the stack; it would improve
security and avoid weird corner cases in the irqstack handling.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/