Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* H. Peter Anvin wrote: > On 07/16/2015 12:14 PM, Dave Hansen wrote: > > The FPU rewrite removed the dynamic allocations of 'struct fpu'. > > But, this potentially wastes massive amounts of memory (2k per > > task on systems that do not have AVX-512 for instance). > > > > Instead of having a separate slab, this patch just appends the > > space that we need to the 'task_struct' which we dynamically > > allocate already. This saves from doing an extra slab allocation > > at fork(). The only real downside here is that we have to stick > > everything and the end of the task_struct. But, I think the > > BUILD_BUG_ON()s I stuck in there should keep that from being too > > fragile. > > > > This survives a quick build and boot in a VM. Does anyone see any > > real downsides to this? > > No. I have also long advocated for merging task_struct and thread_info into > a > common structure and get it off the stack; it would improve security and > avoid > weird corner cases in the irqstack handling. Note that we have 3 related 'task state' data structures with overlapping purpose: task_struct thread_struct thread_info where thread_struct is embedded in task_struct currently. So to turn it all into a single structure we'd have to merge thread_info into thread_struct. thread_info was put on the kernel stack due to the ESP trick we played long ago - but that is moot these days. So I think what we want is not some common structure, but to actually merge all of thread_info into thread_struct for arch details and into task_struct for generic fields, and only have: task_struct/* generic fields */ thread_struct /* arch details */ this can be done gradually, field by field, and in the end thread_info can be eliminated altogether. The only real complication is that it affects every architecture. The good news is that most of the thread_info layout details are wrapped in various constructs like test_ti_thread_flag() and task_thread_info(). While at it we might as well rename 'thread_struct' to 'arch_task_struct': task_struct /* generic fields */ arch_task_struct /* arch details */ to make it really clear and easy to understand at a glance - as the current naming is has become ambiguous and slightly confusing the moment we introduced threading. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Dave Hansen wrote: > >> #endif /* _ASM_X86_FPU_H */ > >> diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu > >> arch/x86/kernel/process.c > >> --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 > >> 10:50:42.360571875 -0700 > >> +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700 > >> @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist > >> */ > >> int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > >> { > >> - *dst = *src; > >> + memcpy(dst, src, arch_task_struct_size()); > > > > This is actually vaguely performance-critical, which makes me thing > > that using some kind of inline or other real way (config macro, ifdef, > > etc) to detect whether there's an arch override would be better than a > > weak function. > > Fair enough. I'll send out another version in a bit if there are no more > comments. Beyond making it a build time switch for other architectures, I'd also suggest introducing a __read_mostly task_struct_size variable on x86, so that we can write: memcpy(dst, src, task_struct_size); Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Peter Zijlstra wrote: > On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote: > > +++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700 > > @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene > > unsigned int xstate_size; > > EXPORT_SYMBOL_GPL(xstate_size); > > > > +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ > > + BUILD_BUG_ON((sizeof(TYPE) -\ > > + offsetof(TYPE, MEMBER) -\ > > + sizeof(((TYPE *)0)->MEMBER)) > \ > > + 0) \ > > + > > +/* > > + * We append the 'struct fpu' to the task_struct. > > + */ > > +int __weak arch_task_struct_size(void) > > +{ > > + int task_size = sizeof(struct task_struct); > > + > > + /* > > +* Subtract off the static size of the register state. > > +* It potentially has a bunch of padding. > > +*/ > > + task_size -= sizeof(((struct task_struct *)0)->thread.fpu.state); > > + > > + /* > > +* Add back the dynamically-calculated register state > > +* size. > > +*/ > > + task_size += xstate_size; > > + > > + /* > > +* We dynamically size 'struct fpu', so we require that > > +* it be at the end of 'thread_struct' and that > > +* 'thread_struct' be at the end of 'task_struct'. If > > +* you hit a compile error here, check the structure to > > +* see if something got added to the end. > > +*/ > > + CHECK_MEMBER_AT_END_OF(struct fpu, state); > > + CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu); > > + CHECK_MEMBER_AT_END_OF(struct task_struct, thread); > > + > > + return task_size; > > +} > > Since you want these invariants true at all times, maybe put the > BUILD_BUG_ON() in generic code instead of x86 specific? That way people > poking at other archs are less likely to accidentally break your stuff. Yeah. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Dave Hansen wrote: > The FPU rewrite removed the dynamic allocations of 'struct fpu'. > But, this potentially wastes massive amounts of memory (2k per > task on systems that do not have AVX-512 for instance). > > Instead of having a separate slab, this patch just appends the > space that we need to the 'task_struct' which we dynamically > allocate already. This saves from doing an extra slab allocation > at fork(). The only real downside here is that we have to stick > everything and the end of the task_struct. But, I think the > BUILD_BUG_ON()s I stuck in there should keep that from being too > fragile. > > This survives a quick build and boot in a VM. Does anyone see any > real downsides to this? So considering the complexity of the other patch that makes the static allocation, I'd massively prefer this patch as it solves the real bug. It should also work on future hardware a lot better. This was the dynamic approach I suggested in our discussion of the big FPU code rework. > --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu > 2015-07-16 10:50:42.355571648 -0700 > +++ b/arch/x86/kernel/fpu/init.c 2015-07-16 12:02:15.284280976 -0700 > @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene > unsigned int xstate_size; > EXPORT_SYMBOL_GPL(xstate_size); > > +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ > + BUILD_BUG_ON((sizeof(TYPE) -\ > + offsetof(TYPE, MEMBER) -\ > + sizeof(((TYPE *)0)->MEMBER)) > \ > + 0) \ > + > +/* > + * We append the 'struct fpu' to the task_struct. > + */ > +int __weak arch_task_struct_size(void) This should not be __weak, otherwise we risk getting the generic version: > --- a/kernel/fork.c~dynamically-allocate-struct-fpu 2015-07-16 > 10:50:42.357571739 -0700 > +++ b/kernel/fork.c 2015-07-16 11:25:53.873498188 -0700 > @@ -287,15 +287,21 @@ static void set_max_threads(unsigned int > max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS); > } > > +int __weak arch_task_struct_size(void) > +{ > + return sizeof(struct task_struct); > +} > + Your system probably worked due to link order preferring the x86 version but I'm not sure. Other than this bug it looks good to me in principle. Lemme check it on various hardware. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Peter Zijlstra pet...@infradead.org wrote: On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote: +++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene unsigned int xstate_size; EXPORT_SYMBOL_GPL(xstate_size); +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)-MEMBER)) \ + 0) \ + +/* + * We append the 'struct fpu' to the task_struct. + */ +int __weak arch_task_struct_size(void) +{ + int task_size = sizeof(struct task_struct); + + /* +* Subtract off the static size of the register state. +* It potentially has a bunch of padding. +*/ + task_size -= sizeof(((struct task_struct *)0)-thread.fpu.state); + + /* +* Add back the dynamically-calculated register state +* size. +*/ + task_size += xstate_size; + + /* +* We dynamically size 'struct fpu', so we require that +* it be at the end of 'thread_struct' and that +* 'thread_struct' be at the end of 'task_struct'. If +* you hit a compile error here, check the structure to +* see if something got added to the end. +*/ + CHECK_MEMBER_AT_END_OF(struct fpu, state); + CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu); + CHECK_MEMBER_AT_END_OF(struct task_struct, thread); + + return task_size; +} Since you want these invariants true at all times, maybe put the BUILD_BUG_ON() in generic code instead of x86 specific? That way people poking at other archs are less likely to accidentally break your stuff. Yeah. Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Dave Hansen d...@sr71.net wrote: #endif /* _ASM_X86_FPU_H */ diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu arch/x86/kernel/process.c --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.360571875 -0700 +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700 @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist */ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) { - *dst = *src; + memcpy(dst, src, arch_task_struct_size()); This is actually vaguely performance-critical, which makes me thing that using some kind of inline or other real way (config macro, ifdef, etc) to detect whether there's an arch override would be better than a weak function. Fair enough. I'll send out another version in a bit if there are no more comments. Beyond making it a build time switch for other architectures, I'd also suggest introducing a __read_mostly task_struct_size variable on x86, so that we can write: memcpy(dst, src, task_struct_size); Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* Dave Hansen d...@sr71.net wrote: The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? So considering the complexity of the other patch that makes the static allocation, I'd massively prefer this patch as it solves the real bug. It should also work on future hardware a lot better. This was the dynamic approach I suggested in our discussion of the big FPU code rework. --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.355571648 -0700 +++ b/arch/x86/kernel/fpu/init.c 2015-07-16 12:02:15.284280976 -0700 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene unsigned int xstate_size; EXPORT_SYMBOL_GPL(xstate_size); +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)-MEMBER)) \ + 0) \ + +/* + * We append the 'struct fpu' to the task_struct. + */ +int __weak arch_task_struct_size(void) This should not be __weak, otherwise we risk getting the generic version: --- a/kernel/fork.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.357571739 -0700 +++ b/kernel/fork.c 2015-07-16 11:25:53.873498188 -0700 @@ -287,15 +287,21 @@ static void set_max_threads(unsigned int max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS); } +int __weak arch_task_struct_size(void) +{ + return sizeof(struct task_struct); +} + Your system probably worked due to link order preferring the x86 version but I'm not sure. Other than this bug it looks good to me in principle. Lemme check it on various hardware. Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
* H. Peter Anvin h...@zytor.com wrote: On 07/16/2015 12:14 PM, Dave Hansen wrote: The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? No. I have also long advocated for merging task_struct and thread_info into a common structure and get it off the stack; it would improve security and avoid weird corner cases in the irqstack handling. Note that we have 3 related 'task state' data structures with overlapping purpose: task_struct thread_struct thread_info where thread_struct is embedded in task_struct currently. So to turn it all into a single structure we'd have to merge thread_info into thread_struct. thread_info was put on the kernel stack due to the ESP trick we played long ago - but that is moot these days. So I think what we want is not some common structure, but to actually merge all of thread_info into thread_struct for arch details and into task_struct for generic fields, and only have: task_struct/* generic fields */ thread_struct /* arch details */ this can be done gradually, field by field, and in the end thread_info can be eliminated altogether. The only real complication is that it affects every architecture. The good news is that most of the thread_info layout details are wrapped in various constructs like test_ti_thread_flag() and task_thread_info(). While at it we might as well rename 'thread_struct' to 'arch_task_struct': task_struct /* generic fields */ arch_task_struct /* arch details */ to make it really clear and easy to understand at a glance - as the current naming is has become ambiguous and slightly confusing the moment we introduced threading. Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On Thu, Jul 16, 2015 at 3:42 PM, H. Peter Anvin wrote: > On 07/16/2015 12:14 PM, Dave Hansen wrote: >> The FPU rewrite removed the dynamic allocations of 'struct fpu'. >> But, this potentially wastes massive amounts of memory (2k per >> task on systems that do not have AVX-512 for instance). >> >> Instead of having a separate slab, this patch just appends the >> space that we need to the 'task_struct' which we dynamically >> allocate already. This saves from doing an extra slab allocation >> at fork(). The only real downside here is that we have to stick >> everything and the end of the task_struct. But, I think the >> BUILD_BUG_ON()s I stuck in there should keep that from being too >> fragile. >> >> This survives a quick build and boot in a VM. Does anyone see any >> real downsides to this? >> > > No. I have also long advocated for merging task_struct and thread_info > into a common structure and get it off the stack; it would improve > security and avoid weird corner cases in the irqstack handling. > In tip/x86/asm, entry_64.S only references thread_info in two places, both in the syscall code. I'm hoping that we can move that code into C soon and that we can do the same thing to the 32-bit code. If we do that, then we could just move ti->flags into thread_struct. Everything else should be easy, and then thread_info would be empty. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On 07/16/2015 12:14 PM, Dave Hansen wrote: > The FPU rewrite removed the dynamic allocations of 'struct fpu'. > But, this potentially wastes massive amounts of memory (2k per > task on systems that do not have AVX-512 for instance). > > Instead of having a separate slab, this patch just appends the > space that we need to the 'task_struct' which we dynamically > allocate already. This saves from doing an extra slab allocation > at fork(). The only real downside here is that we have to stick > everything and the end of the task_struct. But, I think the > BUILD_BUG_ON()s I stuck in there should keep that from being too > fragile. > > This survives a quick build and boot in a VM. Does anyone see any > real downsides to this? > No. I have also long advocated for merging task_struct and thread_info into a common structure and get it off the stack; it would improve security and avoid weird corner cases in the irqstack handling. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote: > +++ b/arch/x86/kernel/fpu/init.c 2015-07-16 12:02:15.284280976 -0700 > @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene > unsigned int xstate_size; > EXPORT_SYMBOL_GPL(xstate_size); > > +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ > + BUILD_BUG_ON((sizeof(TYPE) -\ > + offsetof(TYPE, MEMBER) -\ > + sizeof(((TYPE *)0)->MEMBER)) > \ > + 0) \ > + > +/* > + * We append the 'struct fpu' to the task_struct. > + */ > +int __weak arch_task_struct_size(void) > +{ > + int task_size = sizeof(struct task_struct); > + > + /* > + * Subtract off the static size of the register state. > + * It potentially has a bunch of padding. > + */ > + task_size -= sizeof(((struct task_struct *)0)->thread.fpu.state); > + > + /* > + * Add back the dynamically-calculated register state > + * size. > + */ > + task_size += xstate_size; > + > + /* > + * We dynamically size 'struct fpu', so we require that > + * it be at the end of 'thread_struct' and that > + * 'thread_struct' be at the end of 'task_struct'. If > + * you hit a compile error here, check the structure to > + * see if something got added to the end. > + */ > + CHECK_MEMBER_AT_END_OF(struct fpu, state); > + CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu); > + CHECK_MEMBER_AT_END_OF(struct task_struct, thread); > + > + return task_size; > +} Since you want these invariants true at all times, maybe put the BUILD_BUG_ON() in generic code instead of x86 specific? That way people poking at other archs are less likely to accidentally break your stuff. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On 07/16/2015 12:25 PM, Andy Lutomirski wrote: > On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen wrote: >> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ >> + BUILD_BUG_ON((sizeof(TYPE) -\ >> + offsetof(TYPE, MEMBER) -\ >> + sizeof(((TYPE *)0)->MEMBER)) > \ >> + 0) \ >> + > > You could save a bit of typing by using offsetofend here. Something > along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE, > MEMBER)); Good point. >> #endif /* _ASM_X86_FPU_H */ >> diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu >> arch/x86/kernel/process.c >> --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 >> 10:50:42.360571875 -0700 >> +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700 >> @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist >> */ >> int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) >> { >> - *dst = *src; >> + memcpy(dst, src, arch_task_struct_size()); > > This is actually vaguely performance-critical, which makes me thing > that using some kind of inline or other real way (config macro, ifdef, > etc) to detect whether there's an arch override would be better than a > weak function. Fair enough. I'll send out another version in a bit if there are no more comments. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen wrote: > > The FPU rewrite removed the dynamic allocations of 'struct fpu'. > But, this potentially wastes massive amounts of memory (2k per > task on systems that do not have AVX-512 for instance). > > Instead of having a separate slab, this patch just appends the > space that we need to the 'task_struct' which we dynamically > allocate already. This saves from doing an extra slab allocation > at fork(). The only real downside here is that we have to stick > everything and the end of the task_struct. But, I think the > BUILD_BUG_ON()s I stuck in there should keep that from being too > fragile. > > This survives a quick build and boot in a VM. Does anyone see any > real downsides to this? Looks generally sensible. Minor nitpicking below. > +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ > + BUILD_BUG_ON((sizeof(TYPE) -\ > + offsetof(TYPE, MEMBER) -\ > + sizeof(((TYPE *)0)->MEMBER)) > \ > + 0) \ > + You could save a bit of typing by using offsetofend here. Something along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE, MEMBER)); > #endif /* _ASM_X86_FPU_H */ > diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu > arch/x86/kernel/process.c > --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 > 10:50:42.360571875 -0700 > +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700 > @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist > */ > int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) > { > - *dst = *src; > + memcpy(dst, src, arch_task_struct_size()); This is actually vaguely performance-critical, which makes me thing that using some kind of inline or other real way (config macro, ifdef, etc) to detect whether there's an arch override would be better than a weak function. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: x...@kernel.org Cc: Peter Zijlstra Cc: Borislav Petkov Cc: Andy Lutomirski Cc: Linus Torvalds Cc: linux-kernel@vger.kernel.org --- b/arch/x86/include/asm/fpu/types.h | 72 +++-- b/arch/x86/include/asm/processor.h | 10 +++-- b/arch/x86/kernel/fpu/init.c | 39 b/arch/x86/kernel/process.c|2 - b/fs/proc/kcore.c |4 +- b/include/linux/sched.h| 12 +- b/kernel/fork.c|8 +++- 7 files changed, 104 insertions(+), 43 deletions(-) diff -puN arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu arch/x86/include/asm/processor.h --- a/arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.340570967 -0700 +++ b/arch/x86/include/asm/processor.h 2015-07-16 11:37:19.182539322 -0700 @@ -390,9 +390,6 @@ struct thread_struct { #endif unsigned long gs; - /* Floating point and extended processor state */ - struct fpu fpu; - /* Save middle states of ptrace breakpoints */ struct perf_event *ptrace_bps[HBP_NUM]; /* Debug status used for traps, single steps, etc... */ @@ -418,6 +415,13 @@ struct thread_struct { unsigned long iopl; /* Max allowed port in the bitmap, in bytes: */ unsignedio_bitmap_max; + + /* Floating point and extended processor state */ + struct fpu fpu; + /* +* WARNING: 'fpu' is dynamically-sized. It *MUST* be at +* the end. +*/ }; /* diff -puN include/linux/sched.h~dynamically-allocate-struct-fpu include/linux/sched.h --- a/include/linux/sched.h~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.342571058 -0700 +++ b/include/linux/sched.h 2015-07-16 11:44:10.775174149 -0700 @@ -1522,8 +1522,6 @@ struct task_struct { /* hung task detection */ unsigned long last_switch_count; #endif -/* CPU-specific state of this task */ - struct thread_struct thread; /* filesystem information */ struct fs_struct *fs; /* open file information */ @@ -1778,8 +1776,18 @@ struct task_struct { unsigned long task_state_change; #endif int pagefault_disabled; +/* CPU-specific state of this task */ + struct thread_struct thread; +/* + * WARNING: on x86, 'thread_struct' contains a variable-sized + * structure. It *MUST* be at the end of 'task_struct'. + * + * Do not put anything below here! + */ }; +extern int arch_task_struct_size(void); + /* Future-safe accessor for struct task_struct's cpus_allowed. */ #define tsk_cpus_allowed(tsk) (&(tsk)->cpus_allowed) diff -puN arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu arch/x86/kernel/fpu/init.c --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.355571648 -0700 +++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene unsigned int xstate_size; EXPORT_SYMBOL_GPL(xstate_size); +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)->MEMBER)) > \ + 0) \ + +/* + * We append the 'struct fpu' to the task_struct. + */ +int __weak arch_task_struct_size(void) +{ + int task_size = sizeof(struct task_struct); + + /* +* Subtract off the static size of the register state. +* It potentially has a bunch of padding. +*/ + task_size -= sizeof(((struct task_struct *)0)->thread.fpu.state); + + /* +* Add back the dynamically-calculated register state +* size. +*/ + task_size += xstate_size; + + /* +* We dynamically size 'struct fpu', so we require that +* it be at the end of 'thread_struct' and that +* 'thread_struct' be at the end of 'task_struct'. If +* you hit a compile error here, check the structure to +* see if something got added to the end. +*/ +
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On Thu, Jul 16, 2015 at 3:42 PM, H. Peter Anvin h...@zytor.com wrote: On 07/16/2015 12:14 PM, Dave Hansen wrote: The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? No. I have also long advocated for merging task_struct and thread_info into a common structure and get it off the stack; it would improve security and avoid weird corner cases in the irqstack handling. In tip/x86/asm, entry_64.S only references thread_info in two places, both in the syscall code. I'm hoping that we can move that code into C soon and that we can do the same thing to the 32-bit code. If we do that, then we could just move ti-flags into thread_struct. Everything else should be easy, and then thread_info would be empty. --Andy -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On 07/16/2015 12:25 PM, Andy Lutomirski wrote: On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen d...@sr71.net wrote: +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)-MEMBER)) \ + 0) \ + You could save a bit of typing by using offsetofend here. Something along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE, MEMBER)); Good point. #endif /* _ASM_X86_FPU_H */ diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu arch/x86/kernel/process.c --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.360571875 -0700 +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700 @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist */ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) { - *dst = *src; + memcpy(dst, src, arch_task_struct_size()); This is actually vaguely performance-critical, which makes me thing that using some kind of inline or other real way (config macro, ifdef, etc) to detect whether there's an arch override would be better than a weak function. Fair enough. I'll send out another version in a bit if there are no more comments. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Cc: x...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Borislav Petkov b...@alien8.de Cc: Andy Lutomirski l...@amacapital.net Cc: Linus Torvalds torva...@linux-foundation.org Cc: linux-kernel@vger.kernel.org --- b/arch/x86/include/asm/fpu/types.h | 72 +++-- b/arch/x86/include/asm/processor.h | 10 +++-- b/arch/x86/kernel/fpu/init.c | 39 b/arch/x86/kernel/process.c|2 - b/fs/proc/kcore.c |4 +- b/include/linux/sched.h| 12 +- b/kernel/fork.c|8 +++- 7 files changed, 104 insertions(+), 43 deletions(-) diff -puN arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu arch/x86/include/asm/processor.h --- a/arch/x86/include/asm/processor.h~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.340570967 -0700 +++ b/arch/x86/include/asm/processor.h 2015-07-16 11:37:19.182539322 -0700 @@ -390,9 +390,6 @@ struct thread_struct { #endif unsigned long gs; - /* Floating point and extended processor state */ - struct fpu fpu; - /* Save middle states of ptrace breakpoints */ struct perf_event *ptrace_bps[HBP_NUM]; /* Debug status used for traps, single steps, etc... */ @@ -418,6 +415,13 @@ struct thread_struct { unsigned long iopl; /* Max allowed port in the bitmap, in bytes: */ unsignedio_bitmap_max; + + /* Floating point and extended processor state */ + struct fpu fpu; + /* +* WARNING: 'fpu' is dynamically-sized. It *MUST* be at +* the end. +*/ }; /* diff -puN include/linux/sched.h~dynamically-allocate-struct-fpu include/linux/sched.h --- a/include/linux/sched.h~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.342571058 -0700 +++ b/include/linux/sched.h 2015-07-16 11:44:10.775174149 -0700 @@ -1522,8 +1522,6 @@ struct task_struct { /* hung task detection */ unsigned long last_switch_count; #endif -/* CPU-specific state of this task */ - struct thread_struct thread; /* filesystem information */ struct fs_struct *fs; /* open file information */ @@ -1778,8 +1776,18 @@ struct task_struct { unsigned long task_state_change; #endif int pagefault_disabled; +/* CPU-specific state of this task */ + struct thread_struct thread; +/* + * WARNING: on x86, 'thread_struct' contains a variable-sized + * structure. It *MUST* be at the end of 'task_struct'. + * + * Do not put anything below here! + */ }; +extern int arch_task_struct_size(void); + /* Future-safe accessor for struct task_struct's cpus_allowed. */ #define tsk_cpus_allowed(tsk) ((tsk)-cpus_allowed) diff -puN arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu arch/x86/kernel/fpu/init.c --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.355571648 -0700 +++ b/arch/x86/kernel/fpu/init.c2015-07-16 12:02:15.284280976 -0700 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene unsigned int xstate_size; EXPORT_SYMBOL_GPL(xstate_size); +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)-MEMBER)) \ + 0) \ + +/* + * We append the 'struct fpu' to the task_struct. + */ +int __weak arch_task_struct_size(void) +{ + int task_size = sizeof(struct task_struct); + + /* +* Subtract off the static size of the register state. +* It potentially has a bunch of padding. +*/ + task_size -= sizeof(((struct task_struct *)0)-thread.fpu.state); + + /* +* Add back the dynamically-calculated register state +* size. +*/ + task_size += xstate_size; + + /* +* We dynamically size 'struct fpu', so we require that +* it be at the end of 'thread_struct' and that +* 'thread_struct' be at the end of 'task_struct'. If +* you hit a compile error
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On Thu, Jul 16, 2015 at 12:14 PM, Dave Hansen d...@sr71.net wrote: The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? Looks generally sensible. Minor nitpicking below. +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)-MEMBER)) \ + 0) \ + You could save a bit of typing by using offsetofend here. Something along the lines of BUILD_BUG_ON(sizeof(TYPE) != offsetofend(TYPE, MEMBER)); #endif /* _ASM_X86_FPU_H */ diff -puN arch/x86/kernel/process.c~dynamically-allocate-struct-fpu arch/x86/kernel/process.c --- a/arch/x86/kernel/process.c~dynamically-allocate-struct-fpu 2015-07-16 10:50:42.360571875 -0700 +++ b/arch/x86/kernel/process.c 2015-07-16 12:00:59.204808551 -0700 @@ -81,7 +81,7 @@ EXPORT_SYMBOL_GPL(idle_notifier_unregist */ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) { - *dst = *src; + memcpy(dst, src, arch_task_struct_size()); This is actually vaguely performance-critical, which makes me thing that using some kind of inline or other real way (config macro, ifdef, etc) to detect whether there's an arch override would be better than a weak function. --Andy -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On Thu, Jul 16, 2015 at 12:14:37PM -0700, Dave Hansen wrote: +++ b/arch/x86/kernel/fpu/init.c 2015-07-16 12:02:15.284280976 -0700 @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene unsigned int xstate_size; EXPORT_SYMBOL_GPL(xstate_size); +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \ + BUILD_BUG_ON((sizeof(TYPE) -\ + offsetof(TYPE, MEMBER) -\ + sizeof(((TYPE *)0)-MEMBER)) \ + 0) \ + +/* + * We append the 'struct fpu' to the task_struct. + */ +int __weak arch_task_struct_size(void) +{ + int task_size = sizeof(struct task_struct); + + /* + * Subtract off the static size of the register state. + * It potentially has a bunch of padding. + */ + task_size -= sizeof(((struct task_struct *)0)-thread.fpu.state); + + /* + * Add back the dynamically-calculated register state + * size. + */ + task_size += xstate_size; + + /* + * We dynamically size 'struct fpu', so we require that + * it be at the end of 'thread_struct' and that + * 'thread_struct' be at the end of 'task_struct'. If + * you hit a compile error here, check the structure to + * see if something got added to the end. + */ + CHECK_MEMBER_AT_END_OF(struct fpu, state); + CHECK_MEMBER_AT_END_OF(struct thread_struct, fpu); + CHECK_MEMBER_AT_END_OF(struct task_struct, thread); + + return task_size; +} Since you want these invariants true at all times, maybe put the BUILD_BUG_ON() in generic code instead of x86 specific? That way people poking at other archs are less likely to accidentally break your stuff. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
On 07/16/2015 12:14 PM, Dave Hansen wrote: The FPU rewrite removed the dynamic allocations of 'struct fpu'. But, this potentially wastes massive amounts of memory (2k per task on systems that do not have AVX-512 for instance). Instead of having a separate slab, this patch just appends the space that we need to the 'task_struct' which we dynamically allocate already. This saves from doing an extra slab allocation at fork(). The only real downside here is that we have to stick everything and the end of the task_struct. But, I think the BUILD_BUG_ON()s I stuck in there should keep that from being too fragile. This survives a quick build and boot in a VM. Does anyone see any real downsides to this? No. I have also long advocated for merging task_struct and thread_info into a common structure and get it off the stack; it would improve security and avoid weird corner cases in the irqstack handling. -hpa -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/