Zhouyi Zhou <zhouzho...@gmail.com> writes:
> I think we should avoid torture offline the cpu who do tick timer
> when nohz full is running.

Can you tell us what the bug you're fixing is?

Did you see a crash/oops/hang etc? Or are you just proposing this as
something that would be a good idea?

> Tested on PPC VM of Open Source Lab of Oregon State University.
> The test results show that after the fix, the success rate of
> rcutorture is improved.
> After:
> Successes: 40 Failures: 9
> Before:
> Successes: 38 Failures: 11
>
> I examined the console.log and Make.out files one by one, no new
> compile error or test error is introduced by above fix.
>
> Signed-off-by: Zhouyi Zhou <zhouzho...@gmail.com>
> ---
> Dear PPC developers
>
> I found this bug when trying to do rcutorture tests in ppc VM of
> Open Source Lab of Oregon State University:
>
> ubuntu@ubuntu:~/linux-next/tools/testing/selftests/rcutorture/res/2022.09.30-01.06.22-torture$
>  find . -name "console.log.diags"|xargs grep HOTPLUG
> ./results-scftorture/NOPREEMPT/console.log.diags:WARNING: HOTPLUG FAILURES 
> NOPREEMPT
> ./results-rcutorture/TASKS03/console.log.diags:WARNING: HOTPLUG FAILURES 
> TASKS03
> ./results-rcutorture/TREE04/console.log.diags:WARNING: HOTPLUG FAILURES TREE04
> ./results-scftorture-kasan/NOPREEMPT/console.log.diags:WARNING: HOTPLUG 
> FAILURES NOPREEMPT
> ./results-rcutorture-kasan/TASKS03/console.log.diags:WARNING: HOTPLUG 
> FAILURES TASKS03
> ./results-rcutorture-kasan/TREE04/console.log.diags:WARNING: HOTPLUG FAILURES 
> TREE04
>
> I tried to fix this bug.
>
> Thanks for your patience and guidance ;-)
>
> Thanks 
> Zhouyi
> --
>  arch/powerpc/kernel/sysfs.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index ef9a61718940..be9c0e45337e 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -4,6 +4,7 @@
>  #include <linux/smp.h>
>  #include <linux/percpu.h>
>  #include <linux/init.h>
> +#include <linux/tick.h>
>  #include <linux/sched.h>
>  #include <linux/export.h>
>  #include <linux/nodemask.h>
> @@ -21,6 +22,7 @@
>  #include <asm/firmware.h>
>  #include <asm/idle.h>
>  #include <asm/svm.h>
> +#include "../../../kernel/time/tick-internal.h"
  
Needing to include this internal header is a sign that we are using the
wrong API or otherwise using time keeping internals we shouldn't be.

>  #include "cacheinfo.h"
>  #include "setup.h"
> @@ -1151,7 +1153,11 @@ static int __init topology_init(void)
>                * CPU.  For instance, the boot cpu might never be valid
>                * for hotplugging.
>                */
> -             if (smp_ops && smp_ops->cpu_offline_self)
> +             if (smp_ops && smp_ops->cpu_offline_self
> +#ifdef CONFIG_NO_HZ_FULL
> +                 && !(tick_nohz_full_running && tick_do_timer_cpu == cpu)
> +#endif
> +                 )

I can't see any other arches doing anything like this. I don't think
it's the arches responsibility.

If the time keeping core needs a CPU to stay online to run the timer
then it needs to organise that itself IMHO :)

cheers

>                       c->hotpluggable = 1;
>  #endif
>  
> -- 
> 2.25.1

Reply via email to