Re: [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks

2019-10-17 Thread Dave Martin
On Thu, Oct 17, 2019 at 01:42:37PM +0100, Suzuki K Poulose wrote:
> Hi Dave
> 
> Thanks for the comments.
> 
> On 11/10/2019 12:26, Dave Martin wrote:
> >On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote:
> >>We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
> >>we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
> >>could be inherited by early userspace applications (e.g, modprobe triggered
> >>from initramfs). This could end up in the applications stuck in
> >>do_nofity_resume() as we never clear the TIF flag, once we now know that
> >>we don't support FP.
> >>
> >>Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
> >>for tasks which may have them set, as we would have done in the normal
> >>case, but avoiding touching the hardware state (since we don't support any).
> >>
> >>Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>Cc: Will Deacon 
> >>Cc: Mark Rutland 
> >>Cc: Catalin Marinas 
> >>Signed-off-by: Suzuki K Poulose 
> >>---
> >>  arch/arm64/kernel/fpsimd.c | 26 --
> >>  1 file changed, 16 insertions(+), 10 deletions(-)
> >>
> >>diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> >>index 37d3912cfe06..dfcdd077aeca 100644
> >>--- a/arch/arm64/kernel/fpsimd.c
> >>+++ b/arch/arm64/kernel/fpsimd.c
> >>@@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct 
> >>user_fpsimd_state *st, void *sve_state,
> >>   */
> >>  void fpsimd_restore_current_state(void)
> >>  {
> >>-   if (!system_supports_fpsimd())
> >>-   return;
> >>-
> >>get_cpu_fpsimd_context();
> >>-
> >>-   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
> >>+   /*
> >>+* For the tasks that were created before we detected the absence of
> >>+* FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via 
> >>fpsimd_thread_switch()
> >>+* and/or could be inherited from the parent(init_task has this set). 
> >>Even
> >>+* though userspace has not run yet, this could be inherited by the
> >>+* processes forked from one of those tasks (e.g, modprobe from 
> >>initramfs).
> >>+* If the system doesn't support FP/SIMD, we must clear the flag for the
> >>+* tasks mentioned above, to indicate that the FPSTATE is clean (as we
> >>+* can't have one) to avoid looping for ever to clear the flag.
> >>+*/
> >>+   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
> >>+   system_supports_fpsimd()) {
> >
> >I'm not too keen on this approach: elsewhere we just stub out all the
> >FPSIMD handling logic if !system_supports_fpsimd() -- I think we should
> >be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE.
> 
> We used to do this. But the flag is not cleared anymore once we detect
> the absence of FP/SIMD.
> 
> >Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task
> >and this task is current() and the system supports FPSIMD at all, this
> >task's FPSIMD state is not loaded in the cpu".
> 
> Yes, that is  correct. However, we ran some tasks, even before we detected
> that the FPSIMD is missing. So, we need to clear the flag for those tasks
> to make sure the flag state is consistent, as explained in the comment.

I think there's a misunderstanding here somewhere.

What I'm saying it that we shouldn't even look at TIF_FOREIGN_FPSTATE if
!system_supports_fpsimd() -- i.e., when checking whether there is
any FPSIMD context handling work to do, !system_supports_fpsimd()
should take precedence.

Firstly, this replaces the runtime TIF_FOREIGN_FPSTATE check with a
static key check in the !system_supprts_fpsimd() case, and second, the
"work to do" condition is never wrong, even temporarily.

The "work to do" condition is now

system_supports_fpsimd() && test_thread_flag(TIF_FOREIGN_FPSTATE)

instead of

test_thread_flag(TIF_FOREIGN_FPSTATE).

Code that _only writes_ the TIF_FORGIEN_FPSTATE flag can continue to do
so harmlessly if we do things this way.

In do_notify_resume() this doesn't quite work, but it's acceptable to
fall spuriously into fpsimd_restore_current_state() provided that we
check for !system_supports_fpsimd() in there.  Which we already do.
In this one case, we should clear TIF_FOREIGN_FPSTATE so this backwards
checking doesn't send do_notify_resume() into a spin waiting for the
flag to go clear.

Another option is to clear TIF_FOREIGN_FPSTATE from _TIF_WORK_MASK
when checking for pending work in do_notify_resume(), so that we
effectively ignore TIF_FOREIGN_FPSTATE here too.  This would be a static
key based check again, so should be fast.

I think this is a cleaner approach than trying to clean up
TIF_FOREIGN_FPSTATE for each task lazily in some random places --
otherwise we may need to do the cleanup anywhere that the flag is
accessed, and that happens all over the place.

Does that make sense?  More below.


> Another option is to clear this flag in fpsimd_thread_switch(), something 
> like,
> rather 

Re: [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks

2019-10-17 Thread Suzuki K Poulose

Hi Dave

Thanks for the comments.

On 11/10/2019 12:26, Dave Martin wrote:

On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote:

We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
could be inherited by early userspace applications (e.g, modprobe triggered
from initramfs). This could end up in the applications stuck in
do_nofity_resume() as we never clear the TIF flag, once we now know that
we don't support FP.

Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
for tasks which may have them set, as we would have done in the normal
case, but avoiding touching the hardware state (since we don't support any).

Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
Cc: Will Deacon 
Cc: Mark Rutland 
Cc: Catalin Marinas 
Signed-off-by: Suzuki K Poulose 
---
  arch/arm64/kernel/fpsimd.c | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 37d3912cfe06..dfcdd077aeca 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state 
*st, void *sve_state,
   */
  void fpsimd_restore_current_state(void)
  {
-   if (!system_supports_fpsimd())
-   return;
-
get_cpu_fpsimd_context();
-
-   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
+   /*
+* For the tasks that were created before we detected the absence of
+* FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via 
fpsimd_thread_switch()
+* and/or could be inherited from the parent(init_task has this set). 
Even
+* though userspace has not run yet, this could be inherited by the
+* processes forked from one of those tasks (e.g, modprobe from 
initramfs).
+* If the system doesn't support FP/SIMD, we must clear the flag for the
+* tasks mentioned above, to indicate that the FPSTATE is clean (as we
+* can't have one) to avoid looping for ever to clear the flag.
+*/
+   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
+   system_supports_fpsimd()) {


I'm not too keen on this approach: elsewhere we just stub out all the
FPSIMD handling logic if !system_supports_fpsimd() -- I think we should
be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE.


We used to do this. But the flag is not cleared anymore once we detect
the absence of FP/SIMD.


Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task
and this task is current() and the system supports FPSIMD at all, this
task's FPSIMD state is not loaded in the cpu".


Yes, that is  correct. However, we ran some tasks, even before we detected
that the FPSIMD is missing. So, we need to clear the flag for those tasks
to make sure the flag state is consistent, as explained in the comment.

Another option is to clear this flag in fpsimd_thread_switch(), something like,
rather than sprinkling the "flag fixup" everywhere:

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index dfcdd077aeca..2d8091b6ebfb 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -982,9 +982,14 @@ void fpsimd_thread_switch(struct task_struct *next)
 {
bool wrong_task, wrong_cpu;

-   if (!system_supports_fpsimd())
+   if (!system_supports_fpsimd()) {
+   /*
+* Clear any TIF flags which may have been set, before we
+* detected the absense of FPSIMD.
+*/
+   clear_task_thread_flag(next, TIF_FOREIGN_FPSTATE);
return;
-
+   }
__get_cpu_fpsimd_context();


And also at :

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index a47462def04b..cd8e94d5dc8d 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -374,7 +374,10 @@ int copy_thread(unsigned long clone_flags, unsigned long 
stack_start,

 * Otherwise we could erroneously skip reloading the FPSIMD
 * registers for p.
 */
-   fpsimd_flush_task_state(p);
+   if (system_supports_fpsimd())
+   fpsimd_flush_task_state(p);
+   else
+   clear_tsk_thread_flag(p, TIF_FOREIGN_FPSTATE);

if (likely(!(p->flags & PF_KTHREAD))) {
*childregs = *current_pt_regs();


That way we make sure the flag doesn't violate our assumption and we can
bail out calling into the stubs with the existing checks.



I think we should ensure that any check on TIF_FOREIGN_FPSTATE is
shadowed by a check on system_supports_fpsimd() somewhere.  This already
exists in many places -- we just need to fill in the missing ones.

fpsimd_save() is a backend function that should only be called if
system_supports_fpsimd(), so that should not need any check internally,
but we should make 

Re: [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks

2019-10-11 Thread Dave Martin
On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote:
> We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
> we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
> could be inherited by early userspace applications (e.g, modprobe triggered
> from initramfs). This could end up in the applications stuck in
> do_nofity_resume() as we never clear the TIF flag, once we now know that
> we don't support FP.
> 
> Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
> for tasks which may have them set, as we would have done in the normal
> case, but avoiding touching the hardware state (since we don't support any).
> 
> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> Cc: Will Deacon 
> Cc: Mark Rutland 
> Cc: Catalin Marinas 
> Signed-off-by: Suzuki K Poulose 
> ---
>  arch/arm64/kernel/fpsimd.c | 26 --
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 37d3912cfe06..dfcdd077aeca 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct 
> user_fpsimd_state *st, void *sve_state,
>   */
>  void fpsimd_restore_current_state(void)
>  {
> - if (!system_supports_fpsimd())
> - return;
> -
>   get_cpu_fpsimd_context();
> -
> - if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
> + /*
> +  * For the tasks that were created before we detected the absence of
> +  * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via 
> fpsimd_thread_switch()
> +  * and/or could be inherited from the parent(init_task has this set). 
> Even
> +  * though userspace has not run yet, this could be inherited by the
> +  * processes forked from one of those tasks (e.g, modprobe from 
> initramfs).
> +  * If the system doesn't support FP/SIMD, we must clear the flag for the
> +  * tasks mentioned above, to indicate that the FPSTATE is clean (as we
> +  * can't have one) to avoid looping for ever to clear the flag.
> +  */
> + if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
> + system_supports_fpsimd()) {

I'm not too keen on this approach: elsewhere we just stub out all the
FPSIMD handling logic if !system_supports_fpsimd() -- I think we should
be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE.

Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task
and this task is current() and the system supports FPSIMD at all, this
task's FPSIMD state is not loaded in the cpu".

I think we should ensure that any check on TIF_FOREIGN_FPSTATE is
shadowed by a check on system_supports_fpsimd() somewhere.  This already
exists in many places -- we just need to fill in the missing ones.

fpsimd_save() is a backend function that should only be called if
system_supports_fpsimd(), so that should not need any check internally,
but we should make sure that calls to this function are appropriately
protected with in if (system_supports_fpsimd()).

For other maintenance functions intended for outside callers:

 * fpsimd_bind_task_to_cpu()
 * fpsimd_bind_state_to_cpu()
 * fpsimd_flush_task_state()
 * fpsimd_save_and_flush_cpu_state()

the situation is less clear.  Does is make sense to call these at all
if !system_supports_fpsimd()?  I'm not currently sure.  We could at
least drop some WARN_ON() into these to check, after revieweing their
callsites.

>   task_fpsimd_load();
>   fpsimd_bind_task_to_cpu();
>   }
> @@ -1148,17 +1155,16 @@ void fpsimd_restore_current_state(void)
>   */
>  void fpsimd_update_current_state(struct user_fpsimd_state const *state)
>  {
> - if (!system_supports_fpsimd())
> - return;
> -
>   get_cpu_fpsimd_context();
>  
>   current->thread.uw.fpsimd_state = *state;
>   if (system_supports_sve() && test_thread_flag(TIF_SVE))
>   fpsimd_to_sve(current);

Why should we do this stuff on a system that doesn't support FP?

> - task_fpsimd_load();
> - fpsimd_bind_task_to_cpu();
> + if (system_supports_fpsimd()) {
> + task_fpsimd_load();
> + fpsimd_bind_task_to_cpu();
> + }
>  
>   clear_thread_flag(TIF_FOREIGN_FPSTATE);

[...]

Not in scope for a stable fix, but:

It would be interesting to try to strip out TIF_FOREIGN_FPSTATE
entirely and do some benchmarks and irq latency measurements:

TIF_FOREIGN_FPSTATE is just a cached copy of the wrong_task || wrong_cpu
condition defined in fpsimd_thread_switch() --

That means we have to do maintenance on it all over the place to keep
it in sync with the condition it represents -- this has proven to be
a source of complexity and subtle bugs, as well as making the code
fragile to maintain.

The only point of all this is so that there is a thread flag for
do_notify_resume() to check.  Now that do_notify_resume() 

[PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks

2019-10-10 Thread Suzuki K Poulose
We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
could be inherited by early userspace applications (e.g, modprobe triggered
from initramfs). This could end up in the applications stuck in
do_nofity_resume() as we never clear the TIF flag, once we now know that
we don't support FP.

Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
for tasks which may have them set, as we would have done in the normal
case, but avoiding touching the hardware state (since we don't support any).

Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
Cc: Will Deacon 
Cc: Mark Rutland 
Cc: Catalin Marinas 
Signed-off-by: Suzuki K Poulose 
---
 arch/arm64/kernel/fpsimd.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 37d3912cfe06..dfcdd077aeca 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state 
*st, void *sve_state,
  */
 void fpsimd_restore_current_state(void)
 {
-   if (!system_supports_fpsimd())
-   return;
-
get_cpu_fpsimd_context();
-
-   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
+   /*
+* For the tasks that were created before we detected the absence of
+* FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via 
fpsimd_thread_switch()
+* and/or could be inherited from the parent(init_task has this set). 
Even
+* though userspace has not run yet, this could be inherited by the
+* processes forked from one of those tasks (e.g, modprobe from 
initramfs).
+* If the system doesn't support FP/SIMD, we must clear the flag for the
+* tasks mentioned above, to indicate that the FPSTATE is clean (as we
+* can't have one) to avoid looping for ever to clear the flag.
+*/
+   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
+   system_supports_fpsimd()) {
task_fpsimd_load();
fpsimd_bind_task_to_cpu();
}
@@ -1148,17 +1155,16 @@ void fpsimd_restore_current_state(void)
  */
 void fpsimd_update_current_state(struct user_fpsimd_state const *state)
 {
-   if (!system_supports_fpsimd())
-   return;
-
get_cpu_fpsimd_context();
 
current->thread.uw.fpsimd_state = *state;
if (system_supports_sve() && test_thread_flag(TIF_SVE))
fpsimd_to_sve(current);
 
-   task_fpsimd_load();
-   fpsimd_bind_task_to_cpu();
+   if (system_supports_fpsimd()) {
+   task_fpsimd_load();
+   fpsimd_bind_task_to_cpu();
+   }
 
clear_thread_flag(TIF_FOREIGN_FPSTATE);
 
-- 
2.21.0