On 2018-09-12 08:47:19 [-0700], Andy Lutomirski wrote:
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -101,14 +101,14 @@ void __kernel_fpu_begin(void)
> >
> > kernel_fpu_disable();
> >
> > - if (fpu->initialized) {
> > + __cpu_invalidate_fpregs_state();
> > +
> > + if (!test_and_set_thread_flag(TIF_LOAD_FPU)) {
>
> Since the already-TIF_LOAD_FPU path is supposed to be fast here, use
> test_thread_flag() instead. test_and_set operations do unconditional RMW
> operations and are always full barriers, so they’re slow.

Okay.

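For reference, the check-before-set pattern could look roughly like the
sketch below. It is a userspace illustration with C11 atomics, not the
kernel code: DEMO_LOAD_FPU and demo_mark_load_fpu() are made-up names
standing in for TIF_LOAD_FPU and the thread-flag helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for TIF_LOAD_FPU; not a kernel identifier. */
#define DEMO_LOAD_FPU (1UL << 0)

/*
 * Returns true if this call had to set the flag (slow path),
 * false if the flag was already set (fast path).
 */
static bool demo_mark_load_fpu(atomic_ulong *flags)
{
	/* Fast path: a plain load, no read-modify-write, no full barrier. */
	if (atomic_load_explicit(flags, memory_order_relaxed) & DEMO_LOAD_FPU)
		return false;

	/* Slow path: only now pay for the atomic RMW that sets the bit. */
	atomic_fetch_or_explicit(flags, DEMO_LOAD_FPU, memory_order_relaxed);
	return true;
}
```

When the flag is already set, only the load runs. The check and the set
are not atomic as a pair, so this assumes that nothing else sets the flag
concurrently for the same thread.
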
> Also, on top of this patch, there should be lots of cleanups available. In
> particular, all the fpu state accessors could probably be reworked to take
> TIF_LOAD_FPU into account, which would simplify the callers and maybe even
> the mess of variables tracking whether the state is in regs.

Are you referring to the fpu->initialized check, or to something else?

Sebastian