Re: [PATCH V3 0/2] Fix perf bench numa to work with machines having #CPUs > 1K

2022-04-14 Thread Arnaldo Carvalho de Melo
Em Tue, Apr 12, 2022 at 10:10:57PM +0530, Athira Rajeev escreveu:
> The perf benchmark collection "numa" fails on system configurations
> with more than 1024 CPUs. These benchmarks use "sched_getaffinity"
> and "sched_setaffinity" to work with affinity.

Thanks, applied.

- Arnaldo

 
> Example snippet from numa benchmark:
> <<>>
> perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
> Aborted (core dumped)
> <<>>
> 
> The bind_to_node function uses "sched_getaffinity" to save the cpumask.
> This fails with EINVAL because the default mask size in glibc is 1024.
> 
> To overcome this 1024-CPU mask size limitation of cpu_set_t,
> change the mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to
> allocate the cpumask, CPU_ALLOC_SIZE for its size, and CPU_SET_S to
> set a mask bit.
> 
> Fix all the relevant places in the code to use a mask size that is large
> enough to represent the number of possible CPUs in the system.
> 
> This patchset also fixes the parse_setup_cpu_list function in
> numa bench to check whether an input CPU is online before binding a
> task to that CPU. This addresses failures where a CPU number is within
> the maximum CPU number but the CPU is offline. In that case,
> sched_setaffinity fails when the cpumask has that CPU's bit set.
> 
> Patch 1 fixes parse_setup_cpu_list to check whether the CPU used to bind
> a task is online. Patch 2 fixes bench numa to work with machines
> having #CPUs > 1K.
> 
> Athira Rajeev (2):
>   tools/perf: Fix perf bench numa testcase to check if CPU used to bind
> task is online
>   perf bench: Fix numa bench to fix usage of affinity for machines with
> #CPUs > 1K
> 
> Changelog:
> v2 -> v3
> Link to the v2 version :
> https://lore.kernel.org/all/20220406175113.87881-1-atraj...@linux.vnet.ibm.com/
>  - From the v2 version, patch 1 and patch 2 are now part of upstream.
>  - This v3 version separates patch 3 and patch 4 to address review
>    comments from Arnaldo, which include using sysfs__read_str for reading
>    the sysfs file and fixing the compilation issues observed on Debian.
> 
>  tools/perf/bench/numa.c  | 136 +--
>  tools/perf/util/header.c |  51 +++
>  tools/perf/util/header.h |   1 +
>  3 files changed, 153 insertions(+), 35 deletions(-)
> 
> -- 
> 2.35.1

-- 

- Arnaldo


Re: [PATCH V3 0/2] Fix perf bench numa to work with machines having #CPUs > 1K

2022-04-13 Thread Disha Goel


-----Original Message-----
From: Athira Rajeev 
To: a...@kernel.org, jo...@kernel.org, disg...@linux.vnet.ibm.com
Cc: m...@ellerman.id.au, linux-perf-us...@vger.kernel.org, 
linuxppc-dev@lists.ozlabs.org, ma...@linux.vnet.ibm.com, 
rnsas...@linux.ibm.com, kj...@linux.ibm.com, 
linux-ker...@vger.kernel.org, sri...@linux.vnet.ibm.com, 
irog...@google.com
Subject: [PATCH V3 0/2] Fix perf bench numa to work with machines
having #CPUs > 1K
Date: Tue, 12 Apr 2022 22:10:57 +0530

The perf benchmark collection "numa" fails on system configurations
with more than 1024 CPUs. These benchmarks use "sched_getaffinity"
and "sched_setaffinity" to work with affinity.

Example snippet from numa benchmark:
<<>>
perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
Aborted (core dumped)
<<>>

The bind_to_node function uses "sched_getaffinity" to save the cpumask.
This fails with EINVAL because the default mask size in glibc is 1024.

To overcome this 1024-CPU mask size limitation of cpu_set_t,
change the mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to
allocate the cpumask, CPU_ALLOC_SIZE for its size, and CPU_SET_S to
set a mask bit.

Fix all the relevant places in the code to use a mask size that is large
enough to represent the number of possible CPUs in the system.

This patchset also fixes the parse_setup_cpu_list function in
numa bench to check whether an input CPU is online before binding a
task to that CPU. This addresses failures where a CPU number is within
the maximum CPU number but the CPU is offline. In that case,
sched_setaffinity fails when the cpumask has that CPU's bit set.

Patch 1 fixes parse_setup_cpu_list to check whether the CPU used to bind
a task is online. Patch 2 fixes bench numa to work with machines
having #CPUs > 1K.

Athira Rajeev (2):
  tools/perf: Fix perf bench numa testcase to check if CPU used to bind
task is online
  perf bench: Fix numa bench to fix usage of affinity for machines with
#CPUs > 1K

Changelog:
v2 -> v3
Link to the v2 version :
https://lore.kernel.org/all/20220406175113.87881-1-atraj...@linux.vnet.ibm.com/
 - From the v2 version, patch 1 and patch 2 are now part of upstream.
 - This v3 version separates patch 3 and patch 4 to address review
   comments from Arnaldo, which include using sysfs__read_str for reading
   the sysfs file and fixing the compilation issues observed on Debian.

Tested the patches on powerpc with more than 1K CPUs and on other
configurations as well, and verified perf bench numa with the patch set
applied.

Tested-by: Disha Goel

 tools/perf/bench/numa.c  | 136 +--
 tools/perf/util/header.c |  51 +++
 tools/perf/util/header.h |   1 +
 3 files changed, 153 insertions(+), 35 deletions(-)



[PATCH V3 0/2] Fix perf bench numa to work with machines having #CPUs > 1K

2022-04-12 Thread Athira Rajeev
The perf benchmark collection "numa" fails on system configurations
with more than 1024 CPUs. These benchmarks use "sched_getaffinity"
and "sched_setaffinity" to work with affinity.

Example snippet from numa benchmark:
<<>>
perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
Aborted (core dumped)
<<>>

The bind_to_node function uses "sched_getaffinity" to save the cpumask.
This fails with EINVAL because the default mask size in glibc is 1024.

To overcome this 1024-CPU mask size limitation of cpu_set_t,
change the mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to
allocate the cpumask, CPU_ALLOC_SIZE for its size, and CPU_SET_S to
set a mask bit.
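
For illustration only, the CPU_*_S approach can be sketched roughly as
below. This is not the code from the patch: the helper names
save_affinity, restore_affinity and bind_to_cpu, and the nr_cpus
parameter (the number of possible CPUs, supplied by the caller), are
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the CPU_*_S macros and sched_*affinity() */
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Save the calling thread's affinity in a dynamically sized mask.
 * nr_cpus is an assumed, caller-supplied upper bound on CPU numbers.
 * Returns a mask to be released with CPU_FREE(), or NULL on failure.
 */
static cpu_set_t *save_affinity(int nr_cpus, size_t *size)
{
	cpu_set_t *mask;

	assert(nr_cpus > 0 && size != NULL);
	mask = CPU_ALLOC(nr_cpus);
	if (!mask)
		return NULL;
	*size = CPU_ALLOC_SIZE(nr_cpus);
	CPU_ZERO_S(*size, mask);
	/* Pass the real allocation size, not sizeof(cpu_set_t). */
	if (sched_getaffinity(0, *size, mask) != 0) {
		CPU_FREE(mask);
		return NULL;
	}
	return mask;
}

/* Restore a mask previously returned by save_affinity(). */
static int restore_affinity(const cpu_set_t *mask, size_t size)
{
	return sched_setaffinity(0, size, mask);
}

/* Bind the calling thread to one CPU using a mask sized for nr_cpus. */
static int bind_to_cpu(int cpu, int nr_cpus)
{
	cpu_set_t *mask;
	size_t size;
	int ret;

	if (cpu < 0 || cpu >= nr_cpus)
		return -1;
	mask = CPU_ALLOC(nr_cpus);
	if (!mask)
		return -1;
	size = CPU_ALLOC_SIZE(nr_cpus);
	CPU_ZERO_S(size, mask);
	CPU_SET_S(cpu, size, mask);
	ret = sched_setaffinity(0, size, mask);
	CPU_FREE(mask);
	return ret;
}
```

The point of the sketch is that the same size value is passed to every
call that touches the mask, so nothing depends on the fixed 1024-bit
cpu_set_t.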

Fix all the relevant places in the code to use a mask size that is large
enough to represent the number of possible CPUs in the system.

This patchset also fixes the parse_setup_cpu_list function in
numa bench to check whether an input CPU is online before binding a
task to that CPU. This addresses failures where a CPU number is within
the maximum CPU number but the CPU is offline. In that case,
sched_setaffinity fails when the cpumask has that CPU's bit set.
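
As a rough sketch of such an online check (again not the patch itself,
which per the changelog below reads sysfs through sysfs__read_str), the
example below parses a kernel-style CPU list such as "0-3,8,10-11" and
tests membership. The helper names cpu_in_list and cpu_is_online are
hypothetical, and reading /sys/devices/system/cpu/online with plain
stdio is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Return true if cpu appears in a list like "0-3,8,10-11".
 * Parsing stops at the first character that is not part of the list
 * (for example the trailing newline in a sysfs file).
 */
static bool cpu_in_list(const char *list, int cpu)
{
	const char *p = list;

	assert(list != NULL);
	while (*p) {
		char *end;
		long lo, hi;

		lo = strtol(p, &end, 10);
		if (end == p)
			break;		/* no number found */
		hi = lo;
		p = end;
		if (*p == '-') {	/* range "lo-hi" */
			hi = strtol(p + 1, &end, 10);
			if (end == p + 1)
				break;	/* malformed range */
			p = end;
		}
		if (cpu >= lo && cpu <= hi)
			return true;
		if (*p != ',')
			break;		/* end of list */
		p++;
	}
	return false;
}

/* Check one CPU against the kernel's list of online CPUs. */
static bool cpu_is_online(int cpu)
{
	char buf[4096];
	bool online = false;
	FILE *f = fopen("/sys/devices/system/cpu/online", "r");

	if (!f)
		return false;
	if (fgets(buf, sizeof(buf), f))
		online = cpu_in_list(buf, cpu);
	fclose(f);
	return online;
}
```

A caller parsing a user-supplied CPU list could call cpu_is_online() for
each CPU and reject the input before the CPU's bit is ever set in a mask
passed to sched_setaffinity.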

Patch 1 fixes parse_setup_cpu_list to check whether the CPU used to bind
a task is online. Patch 2 fixes bench numa to work with machines
having #CPUs > 1K.

Athira Rajeev (2):
  tools/perf: Fix perf bench numa testcase to check if CPU used to bind
task is online
  perf bench: Fix numa bench to fix usage of affinity for machines with
#CPUs > 1K

Changelog:
v2 -> v3
Link to the v2 version :
https://lore.kernel.org/all/20220406175113.87881-1-atraj...@linux.vnet.ibm.com/
 - From the v2 version, patch 1 and patch 2 are now part of upstream.
 - This v3 version separates patch 3 and patch 4 to address review
   comments from Arnaldo, which include using sysfs__read_str for reading
   the sysfs file and fixing the compilation issues observed on Debian.

 tools/perf/bench/numa.c  | 136 +--
 tools/perf/util/header.c |  51 +++
 tools/perf/util/header.h |   1 +
 3 files changed, 153 insertions(+), 35 deletions(-)

-- 
2.35.1