[PATCH V2 2/2] powerpc/perf: Fix the power10 event alternatives array to have correct sort order

2022-04-18 Thread Athira Rajeev
e alternative event array to be sorted by column 0 for power10-pmu.c Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support") Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan --- Changelog: Added Fixes tag and reworded commit message Added Review

[PATCH V2 1/2] powerpc/perf: Fix the power9 event alternatives array to have correct sort order

2022-04-18 Thread Athira Rajeev
e alternative event array to be sorted by column 0 for power9-pmu.c Fixes: 91e0bd1e6251 ("powerpc/perf: Add PM_LD_MISS_L1 and PM_BR_2PATH to power9 event list") Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan --- Changelog: Added Fixes tag and reworded commit message. Added

[PATCH V3 2/2] perf bench: Fix numa bench to fix usage of affinity for machines with #CPUs > 1K

2022-04-12 Thread Athira Rajeev
c. Also fix "sched_setaffinity" to use mask size which is large enough to represent number of possible CPU's in the system. Fixed all places where "bind_cpumask" which is part of "struct thread_data" is used such that bind_cpumask works in all configuration. Reported-

[PATCH V3 1/2] tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online

2022-04-12 Thread Athira Rajeev
e online before proceeding further and skip the test. For this, include new helper function "is_cpu_online" in "tools/perf/util/header.c". Since "BIT(x)" definition will get included from header.h, remove that from bench/numa.c Tested-by: Disha Goel Signed-off-by: Athira
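
The online check described in this snippet can be sketched roughly as follows. This is a minimal illustration, not the `is_cpu_online` helper from the patch: the function name `cpu_online_sketch`, its return convention, and the choice to treat a present CPU without an `online` file as online are all assumptions made for the example.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the perf implementation): query sysfs to see
 * whether a CPU can be used for binding.
 * Returns 1 if online, 0 if offline, -1 if the CPU is not present.
 */
static int cpu_online_sketch(unsigned int cpu)
{
	char dir[64];
	char path[80];
	FILE *fp;
	int c;

	snprintf(dir, sizeof(dir), "/sys/devices/system/cpu/cpu%u", cpu);
	if (access(dir, F_OK) != 0)
		return -1;	/* no sysfs directory: CPU not present */

	snprintf(path, sizeof(path), "%s/online", dir);
	fp = fopen(path, "r");
	if (!fp)
		return 1;	/* assumption: no hotplug control file means online */

	c = fgetc(fp);		/* the file holds "1" (online) or "0" (offline) */
	fclose(fp);
	return c == '1';
}
```

A caller would skip the test when this returns anything other than 1 for a CPU it intends to bind to, instead of letting `sched_setaffinity` fail later.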

[PATCH V3 0/2] Fix perf bench numa to work with machines having #CPUs > 1K

2022-04-12 Thread Athira Rajeev
ing that cpu bit set in the mask. Patch 1 address fix for parse_setup_cpu_list to check if CPU used to bind task is online. Patch 2 has fix for bench numa to work with machines having #CPUs > 1K Athira Rajeev (2): tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is onl

Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-12 Thread Athira Rajeev
> On 09-Apr-2022, at 10:48 PM, Arnaldo Carvalho de Melo wrote: > > Em Sat, Apr 09, 2022 at 12:28:01PM -0300, Arnaldo Carvalho de Melo escreveu: >> Em Wed, Apr 06, 2022 at 11:21:09PM +0530, Athira Rajeev escreveu: >>> The perf benchmark for collections: numa, futex an

Re: [PATCH v2 4/4] tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online

2022-04-11 Thread Athira Rajeev
> On 09-Apr-2022, at 8:50 PM, Arnaldo Carvalho de Melo wrote: > > Em Wed, Apr 06, 2022 at 11:21:13PM +0530, Athira Rajeev escreveu: >> Perf numa bench test fails with error: >> >> Testcase: >> ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZ

Re: [PATCH v2 4/4] tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online

2022-04-09 Thread Athira Rajeev
> On 08-Apr-2022, at 5:56 PM, Srikar Dronamraju > wrote: > > * Athira Rajeev [2022-04-06 23:21:13]: > >> Perf numa bench test fails with error: >> >> Testcase: >> ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq >> --thp

Re: [PATCH V3] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

2022-04-09 Thread Athira Rajeev
> On 09-Apr-2022, at 12:00 AM, Shuah Khan wrote: > > On 4/8/22 1:24 AM, Athira Rajeev wrote: >> The selftest "mqueue/mq_perf_tests.c" use CPU_ALLOC to allocate >> CPU set. This cpu set is used further in pthread_attr_setaffinity_np >> and by pthread_cr

Re: [PATCH V2] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

2022-04-08 Thread Athira Rajeev
> On 08-Apr-2022, at 12:31 AM, Shuah Khan wrote: > > On 4/7/22 12:40 PM, Athira Rajeev wrote: >> The selftest "mqueue/mq_perf_tests.c" use CPU_ALLOC to allocate >> CPU set. This cpu set is used further in pthread_attr_setaffinity_np >> and by pthread_cr

[PATCH V3] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

2022-04-08 Thread Athira Rajeev
tion which is called in most of the error/exit path for the cleanup. There are few error paths which exit without using shutdown. Add a common goto error path with CPU_FREE for these cases. Fixes: 7820b0715b6f ("tools/selftests: add mq_perf_tests") Signed-off-by: Athira Rajeev
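
The "common goto error path" mentioned here might look like the sketch below. It is not the selftest code: the function name `setup_with_cleanup` and the `fail_early` parameter are hypothetical and only simulate an error that occurs after the CPU set has been allocated.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: allocate a CPU set, do some work, and free the set
 * on every exit path through one shared label.
 * Returns 0 on success, -1 on failure.
 */
static int setup_with_cleanup(int nr_cpus, int fail_early)
{
	cpu_set_t *set;
	int ret = -1;

	set = CPU_ALLOC(nr_cpus);
	if (!set)
		return -1;	/* nothing allocated yet, nothing to free */

	CPU_ZERO_S(CPU_ALLOC_SIZE(nr_cpus), set);

	if (fail_early)
		goto err_free;	/* simulated error after allocation */

	ret = 0;

err_free:
	CPU_FREE(set);		/* reached by both the success and error paths */
	return ret;
}
```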

[PATCH V2] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

2022-04-07 Thread Athira Rajeev
tion which is called in most of the error/exit path for the cleanup. Also add CPU_FREE in some of the error paths where shutdown is not called. Fixes: 7820b0715b6f ("tools/selftests: add mq_perf_tests") Signed-off-by: Athira Rajeev --- Changelog: From v1 -> v2: Addressed r

Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-06 Thread Athira Rajeev
> On 07-Apr-2022, at 6:05 AM, Ian Rogers wrote: > > On Wed, Apr 6, 2022 at 10:51 AM Athira Rajeev > wrote: >> >> The perf benchmark for collections: numa, futex and epoll >> hits failure in system configuration with CPU's more than 1024. >> Thes

Re: [PATCH] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

2022-04-06 Thread Athira Rajeev
> On 07-Apr-2022, at 1:35 AM, Shuah Khan wrote: > > On 4/6/22 11:57 AM, Athira Rajeev wrote: >> The selftest "mqueue/mq_perf_tests.c" use CPU_ALLOC to allocate >> CPU set. This cpu set is used further in pthread_attr_setaffinity_np >> and by pthread_cr

[PATCH] testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set

2022-04-06 Thread Athira Rajeev
The selftest "mqueue/mq_perf_tests.c" use CPU_ALLOC to allocate CPU set. This cpu set is used further in pthread_attr_setaffinity_np and by pthread_create in the code. But in current code, allocated cpu set is not freed. Fix this by adding CPU_FREE after its usage is done. Signed-off-

[PATCH v2 4/4] tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online

2022-04-06 Thread Athira Rajeev
e online before proceeding further and skip the test. For this, include new helper function "is_cpu_online" in "tools/perf/util/header.c". Since "BIT(x)" definition will get included from header.h, remove that from bench/numa.c Tested-by: Disha Goel Signed-off-by: At

[PATCH v2 3/4] tools/perf: Fix perf numa bench to fix usage of affinity for machines with #CPUs > 1K

2022-04-06 Thread Athira Rajeev
o fix "sched_setaffinity" to use mask size which is large enough to represent number of possible CPU's in the system. Fixed all places where "bind_cpumask" which is part of "struct thread_data" is used such that bind_cpumask works in all configuration. Tested-by: Disha

[PATCH v2 2/4] tools/perf: Fix perf bench epoll to correct usage of affinity for machines with #CPUs > 1K

2022-04-06 Thread Athira Rajeev
size in glibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros. Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the mask. Tested-by: Disha

[PATCH v2 1/4] tools/perf: Fix perf bench futex to correct usage of affinity for machines with #CPUs > 1K

2022-04-06 Thread Athira Rajeev
ibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros. Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the mask. Tested-by: Disha Goel Sign
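
The dynamically sized mask described in this snippet can be sketched as below, using the glibc `CPU_ALLOC`, `CPU_ALLOC_SIZE`, `CPU_ZERO_S` and `CPU_SET_S` macros. The helper name `alloc_single_cpu_mask` is hypothetical and is not taken from the futex benchmark code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build a mask with exactly one CPU set, sized for
 * nr_cpus CPUs rather than the fixed 1024 bits of a plain cpu_set_t.
 * On success returns the mask (release it with CPU_FREE) and stores its
 * size in bytes in *size. Returns NULL on bad input or allocation failure.
 */
static cpu_set_t *alloc_single_cpu_mask(int cpu, int nr_cpus, size_t *size)
{
	cpu_set_t *mask;

	assert(size != NULL);
	if (nr_cpus <= 0 || cpu < 0 || cpu >= nr_cpus)
		return NULL;

	mask = CPU_ALLOC(nr_cpus);
	if (!mask)
		return NULL;

	*size = CPU_ALLOC_SIZE(nr_cpus);
	CPU_ZERO_S(*size, mask);	/* the mask is not zeroed by CPU_ALLOC */
	CPU_SET_S(cpu, *size, mask);
	return mask;
}
```

The returned mask and size can then be passed to `sched_setaffinity(0, size, mask)` or `pthread_attr_setaffinity_np`, which both take the size in bytes, so CPU numbers above 1023 fit.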

[PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-06 Thread Athira Rajeev
ix failures where, though CPU number is within max CPU, it could happen that CPU is offline. Here, sched_setaffinity will result in failure when using cpumask having that cpu bit set in the mask. Patch 1 and Patch 2 address fix for perf bench futex and perf bench epoll benchmark. Patch 3 and Patch 4 a

[PATCH] powerpc/perf: Fix the event alternatives array to have correct sort order

2022-04-06 Thread Athira Rajeev
ork with existing logic, fix the alternative event array to be sorted by column 0 for power9-pmu.c and power10-pmu.c Signed-off-by: Athira Rajeev --- arch/powerpc/perf/power10-pmu.c | 2 +- arch/powerpc/perf/power9-pmu.c | 8 2 files changed, 5 insertions(+), 5 deletions(-) diff --
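
Why the sort order matters can be illustrated with a lookup that stops scanning as soon as column 0 exceeds the event being searched for. The sketch below is an assumption about the general shape of such a lookup, not a copy of the kernel code; `find_alt_row`, `MAX_ALT` as used here, and the event codes in the usage note are made up for the example.

```c
#include <stddef.h>

#define MAX_ALT 2	/* alternatives per row in this sketch */

/*
 * Return the index of the row that contains event, or -1 if none is found.
 * The early break is only correct when rows are sorted ascending by
 * column 0.
 */
static int find_alt_row(const unsigned int tbl[][MAX_ALT], size_t rows,
			unsigned int event)
{
	size_t i, j;

	for (i = 0; i < rows; i++) {
		if (event < tbl[i][0])
			break;	/* later rows are assumed to start even higher */
		for (j = 0; j < MAX_ALT && tbl[i][j]; j++)
			if (tbl[i][j] == event)
				return (int)i;
	}
	return -1;
}
```

With rows `{0x10, 0x500}, {0x20, 0x600}, {0x30, 0x700}` the lookup finds `0x20` in row 1. If the same rows are stored as `{0x30, 0x700}, {0x10, 0x500}, {0x20, 0x600}`, a search for `0x10` breaks out at the first row and misses the entry.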

Re: [PATCH 2/4] tools/perf: Fix perf bench epoll to correct usage of affinity for machines with #CPUs > 1K

2022-04-06 Thread Athira Rajeev
> On 05-Apr-2022, at 11:26 PM, Ian Rogers wrote: > > On Fri, Apr 1, 2022 at 12:00 PM Athira Rajeev > wrote: >> >> perf bench epoll testcase fails on systems with CPU's >> more than 1K. >> >> Testcase: perf bench epoll all >> Result snippet:

Re: [PATCH 3/4] tools/perf: Fix perf numa bench to fix usage of affinity for machines with #CPUs > 1K

2022-04-06 Thread Athira Rajeev
> On 05-Apr-2022, at 11:22 PM, Ian Rogers wrote: > > On Fri, Apr 1, 2022 at 11:59 AM Athira Rajeev > wrote: >> >> perf bench numa testcase fails on systems with CPU's >> more than 1K. >> >> Testcase: perf bench numa mem -p 1 -t 3 -P 512

[PATCH 4/4] tools/perf: Fix perf bench numa testcase to check if CPU used to bind task is online

2022-04-01 Thread Athira Rajeev
e online before proceeding further and skip the test. For this, include new helper function "is_cpu_online" in "tools/perf/util/header.c". Since "BIT(x)" definition will get included from header.h, remove that from bench/numa.c Reported-by: Nageswara R

[PATCH 3/4] tools/perf: Fix perf numa bench to fix usage of affinity for machines with #CPUs > 1K

2022-04-01 Thread Athira Rajeev
o fix "sched_setaffinity" to use mask size which is large enough to represent number of possible CPU's in the system. Fixed all places where "bind_cpumask" which is part of "struct thread_data" is used such that bind_cpumask works in all configurati

[PATCH 2/4] tools/perf: Fix perf bench epoll to correct usage of affinity for machines with #CPUs > 1K

2022-04-01 Thread Athira Rajeev
size in glibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros. Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the mask. Reported-by: Disha

[PATCH 1/4] tools/perf: Fix perf bench futex to correct usage of affinity for machines with #CPUs > 1K

2022-04-01 Thread Athira Rajeev
ibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros. Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the mask. Reported-by: Disha

[PATCH 0/4] tools/perf: Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-01 Thread Athira Rajeev
ix failures where, though CPU number is within max CPU, it could happen that CPU is offline. Here, sched_setaffinity will result in failure when using cpumask having that cpu bit set in the mask. Patch 1 and Patch 2 address fix for perf bench futex and perf bench epoll benchmark. Patch 3 and Patch 4 a

[PATCH V2] powerpc/perf: Fix task context setting for trace imc

2022-02-01 Thread Athira Rajeev
sw_context in order to be able to do application level monitoring. Hence change the task_ctx_nr to use perf_sw_context. Fixes: 012ae244845f ("powerpc/perf: Trace imc PMU functions") Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan --- Changelog: v1 -> v2: Added comment i

[PATCH V2] powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending

2022-01-21 Thread Athira Rajeev
overflown while code is in power_pmu_disable callback function. Hence add a check to see if PMI pending bit is set in Paca before clearing it via clear_pmi_pending. Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") Signed-o

Re: [PATCH] powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending

2022-01-18 Thread Athira Rajeev
lear >> PMI pending bit in Paca when disabling the PMU. It could happen >> that PMC gets overflown while code is in power_pmu_disable >> callback function. Hence add a check to see if PMI pending bit >> is set in Paca before clearing it via clear_pmi_pending. >> >

[PATCH] powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending

2022-01-14 Thread Athira Rajeev
overflown while code is in power_pmu_disable callback function. Hence add a check to see if PMI pending bit is set in Paca before clearing it via clear_pmi_pending. Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") Signed-o

[PATCH] powerpc/perf: Fix power_pmu_wants_prompt_pmi to be defined only for CONFIG_PPC64

2022-01-13 Thread Athira Rajeev
Is to be soft-NMI") Signed-off-by: Athira Rajeev Reviewed-by: Nicholas Piggin Reported-by: kernel test robot --- Note: Address compilation warning reported here: https://lore.kernel.org/lkml/202112220137.x16y07dp-...@intel.com/ Patch is based on powerpc/merge arch/powerpc/perf/core-book3s.c | 2

Re: [powerpc/merge] PMU: Kernel warning while running pmu/ebb selftests

2022-01-11 Thread Athira Rajeev
c661c87c467ba2dc2fdb895f5e8dc5018e Mon Sep 17 00:00:00 2001 From: Athira Rajeev Date: Wed, 12 Jan 2022 08:34:40 +0530 Subject: [PATCH] powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending Running selftest with CONFIG_PPC_IRQ_SOFT_MASK_DEBUG enabled in kernel trig

Re: [PATCH V2 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key

2022-01-06 Thread Athira Rajeev
> On 08-Dec-2021, at 9:21 AM, Nageswara Sastry wrote: > > > > On 07/12/21 8:22 pm, Arnaldo Carvalho de Melo wrote: >> Em Fri, Dec 03, 2021 at 07:50:37AM +0530, Athira Rajeev escreveu: >>> Sort key p_stage_cyc is used to present the latency >>> cycles

[PATCH V2 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key

2021-12-02 Thread Athira Rajeev
is to list of dynamic sort keys and made the "dynamic_headers" and "arch_specific_sort_keys" as static. Signed-off-by: Athira Rajeev Reported-by: Namhyung Kim --- Changelog: v1 -> v2: Addressed review comments from Jiri by making the "dynamic_headers" and "arch_s

[PATCH V2 2/2] tools/perf: Update global/local variants for p_stage_cyc in powerpc

2021-12-02 Thread Athira Rajeev
Update the arch_support_sort_key() function in powerpc to enable presenting local and global variants of sort key: p_stage_cyc. Update the "se_header" strings for these in arch_perf_header_entry() function along with instruction latency. Signed-off-by: Athira Rajeev Reported-by: Na

Re: [PATCH 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key

2021-11-30 Thread Athira Rajeev
> On 29-Nov-2021, at 10:41 PM, Jiri Olsa wrote: > > On Thu, Nov 25, 2021 at 08:18:50AM +0530, Athira Rajeev wrote: >> Sort key p_stage_cyc is used to present the latency >> cycles spent in pipeline stages. perf tool has local >> p_stage_cyc sort key to display this

Re: [PATCH 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key

2021-11-29 Thread Athira Rajeev
> On 28-Nov-2021, at 10:04 PM, Jiri Olsa wrote: > > On Thu, Nov 25, 2021 at 08:18:50AM +0530, Athira Rajeev wrote: >> Sort key p_stage_cyc is used to present the latency >> cycles spent in pipeline stages. perf tool has local >> p_stage_cyc sort key to display this

[PATCH 2/2] tools/perf: Update global/local variants for p_stage_cyc in powerpc

2021-11-24 Thread Athira Rajeev
Update the arch_support_sort_key() function in powerpc to enable presenting local and global variants of sort key: p_stage_cyc. Update the "se_header" strings for these in arch_perf_header_entry() function along with instruction latency. Signed-off-by: Athira Rajeev Reported-by: Na

[PATCH 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key

2021-11-24 Thread Athira Rajeev
is to list of dynamic sort keys. Signed-off-by: Athira Rajeev Reported-by: Namhyung Kim --- tools/perf/util/hist.c | 4 +++- tools/perf/util/hist.h | 3 ++- tools/perf/util/sort.c | 34 +- tools/perf/util/sort.h | 3 ++- 4 files changed, 32 insertions(+), 12 deletion

[PATCH] powerpc/perf: Fix task context setting for trace imc

2021-11-23 Thread Athira Rajeev
sw_context in order to be able to do application level monitoring. Hence change the task_ctx_nr to use perf_sw_context. Fixes: 012ae244845f ("powerpc/perf: Trace imc PMU functions") Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan --- arch/powerpc/perf/imc-pmu.c | 2 +- 1 fi

Re: [PATCH V4 0/1] powerpc/perf: Clear pending PMI in ppmu callbacks

2021-11-19 Thread Athira Rajeev
> On 21-Jul-2021, at 11:18 AM, Athira Rajeev > wrote: > > Running perf fuzzer testsuite popped up below messages > in the dmesg logs: > > "Can't find PMC that caused IRQ" > > This means a PMU exception happened, but none of the PMC's (Perf

Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-11-09 Thread Athira Rajeev
On 04-Nov-2021, at 11:25 AM, Michael Ellerman wrote: Nathan Lynch writes: Nicholas Piggin writes: Excerpts from Michael Ellerman's message of October 29, 2021 11:15 pm: Nicholas Piggin writes: Excerpts from Athira

Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-11-02 Thread Athira Rajeev
problem >>> during updation of event->count since we always accumulate >>> (event->hw.prev_count - PMC value) in event->count. If >>> event->hw.prev_count is greater PMC value, event->count becomes >>> negative. To fix this, 'prev_count' also needs

Re: [V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-11-02 Thread Athira Rajeev
is causes problem >> during updation of event->count since we always accumulate >> (event->hw.prev_count - PMC value) in event->count. If >> event->hw.prev_count is greater PMC value, event->count becomes >> negative. To fix this, 'prev_count' also need

[V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-10-28 Thread Athira Rajeev
the events. Hence read the existing events and clear the PMC index (stored in event->hw.idx) for all events in mobility_pmu_disable. This way, event count settings will get re-initialised correctly in power_pmu_enable. Signed-off-by: Athira Rajeev [ Fixed compilation error reported by kerne

Re: [PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-10-25 Thread Athira Rajeev
> On 21-Oct-2021, at 11:03 PM, Nathan Lynch wrote: > > Nicholas Piggin <npig...@gmail.com> writes: >> Excerpts from Athira Rajeev's message of July 11, 2021 10:25 pm: >>> During Live Partition Migration (LPM), it is observed that perf >>> counter values reports zero post migration

Re: [PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-10-25 Thread Athira Rajeev
> On 21-Oct-2021, at 10:47 PM, Nathan Lynch wrote: > > Athira Rajeev <atraj...@linux.vnet.ibm.com> writes: >> During Live Partition Migration (LPM), it is observed that perf >> counter values reports zero post migration completion. However >>

Re: [PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-10-25 Thread Athira Rajeev
omes >> negative. Fix this by re-initialising 'prev_count' also for all >> events while enabling back the events. A new variable 'migrate' is >> introduced in 'struct cpu_hw_event' to achieve this for LPM cases >> in power_pmu_enable. Use the 'migrate' value to clear the PMC

Re: [V4 0/2] tools/perf: Add instruction and data address registers to extended regs in powerpc

2021-10-20 Thread Athira Rajeev
> On 19-Oct-2021, at 10:00 PM, Arnaldo Carvalho de Melo wrote: > > Em Mon, Oct 18, 2021 at 05:19:46PM +0530, Athira Rajeev escreveu: >> Patch set adds PMU registers namely Sampled Instruction Address Register >> (SIAR) and Sampled Data Address Register (SDAR) as

[V4 1/2] tools/perf: Refactor the code definition of perf reg extended mask in tools side header file

2021-10-18 Thread Athira Rajeev
the actual register value constants. Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev Reviewed-by: Kajol Jain --- .../arch/powerpc/include/uapi/asm/perf_regs.h | 21 --- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/tools/arch/powerpc/include/uapi/asm

[V4 2/2] tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs

2021-10-18 Thread Athira Rajeev
Patch enables presenting of Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers for perf tool. Add these SPR's to sample_reg_mask in the tool side (to use with -I? option). Signed-off-by: Athira Rajeev Reviewed-by: Kajol Jain

[V4 0/2] tools/perf: Add instruction and data address registers to extended regs in powerpc

2021-10-18 Thread Athira Rajeev
d mask value macros for PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 to make it more readable. Also moved PERF_REG_EXTENDED_MAX along with enum definition similar to PERF_REG_POWERPC_MAX. Athira Rajeev (2): tools/perf: Refactor the code definition of perf reg extended mask in too

[V2] powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10

2021-10-07 Thread Athira Rajeev
From: Athira Rajeev On power9 and earlier platforms, the default events used for cycles and instructions are PM_CYC (0x0001e) and PM_INST_CMPL (0x2) respectively. These events use two programmable PMCs and by default will count irrespective of the run latch state. But since it is using

[V3 4/4] tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs

2021-10-07 Thread Athira Rajeev
Patch enables presenting of Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers for perf tool. Add these SPR's to sample_reg_mask in the tool side (to use with -I? option). Signed-off-by: Athira Rajeev --- tools/arch/powerpc

[V3 3/4] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-10-07 Thread Athira Rajeev
Patch adds support to include Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers. Update the definition of PERF_REG_PMU_MASK_300/31 and PERF_REG_EXTENDED_MAX to include these SPR's. Signed-off-by: Athira Rajeev Reviewed

[V3 2/4] tools/perf: Refactor the code definition of perf reg extended mask in tools side header file

2021-10-07 Thread Athira Rajeev
the actual register value constants. Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev --- .../arch/powerpc/include/uapi/asm/perf_regs.h | 21 --- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch

[V3 1/4] powerpc/perf: Refactor the code definition of perf reg extended mask

2021-10-07 Thread Athira Rajeev
constants. Also include PERF_REG_EXTENDED_MAX as part of enum definition. Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev --- arch/powerpc/include/uapi/asm/perf_regs.h | 21 + 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/uapi/asm

[V3 0/4] powerpc/perf: Add instruction and data address registers to extended regs

2021-10-07 Thread Athira Rajeev
Michael Ellerman - Refactored the perf reg extended mask value macros for PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 to make it more readable. Also moved PERF_REG_EXTENDED_MAX along with enum definition similar to PERF_REG_POWERPC_MAX. Athira Rajeev (4): powerpc/perf: Refactor the code de

Re: [V2 2/4] tools/perf: Refactor the code definition of perf reg extended mask in tools side header file

2021-10-01 Thread Athira Rajeev
> On 01-Oct-2021, at 11:50 AM, Daniel Axtens wrote: > > Hi Athira, > >> PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 defines the mask >> value for extended registers. Current definition of these mask values >> uses hex constant and does not use registers by name, making it less >>

[V2 3/4] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-09-30 Thread Athira Rajeev
Patch adds support to include Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers. Update the definition of PERF_REG_PMU_MASK_300/31 and PERF_REG_EXTENDED_MAX to include these SPR's. Signed-off-by: Athira Rajeev --- arch

[V2 2/4] tools/perf: Refactor the code definition of perf reg extended mask in tools side header file

2021-09-30 Thread Athira Rajeev
the actual register value constants. Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev --- .../arch/powerpc/include/uapi/asm/perf_regs.h | 21 --- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch

[V2 4/4] tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs

2021-09-30 Thread Athira Rajeev
Patch enables presenting of Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers for perf tool. Add these SPR's to sample_reg_mask in the tool side (to use with -I? option). Signed-off-by: Athira Rajeev --- tools/arch/powerpc

[V2 0/4] powerpc/perf: Add instruction and data address registers to extended regs

2021-09-30 Thread Athira Rajeev
-> v2: Addressed review comments from Michael Ellerman - Refactored the perf reg extended mask value macros for PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 to make it more readable. Also moved PERF_REG_EXTENDED_MAX along with enum definition similar to PERF_REG_POWERPC_MAX. Athira Raj

[V2 1/4] powerpc/perf: Refactor the code definition of perf reg extended mask

2021-09-30 Thread Athira Rajeev
constants. Also include PERF_REG_EXTENDED_MAX as part of enum definition. Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev --- arch/powerpc/include/uapi/asm/perf_regs.h | 21 + 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/uapi/asm

Re: [PATCH 1/2] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-09-20 Thread Athira Rajeev
> On 20-Sep-2021, at 12:43 PM, Michael Ellerman wrote: > > Athira Rajeev writes: >>> On 08-Sep-2021, at 10:47 AM, Michael Ellerman wrote: >>> >>> Athira Rajeev writes: >>>> Patch adds support to include Sampled Instruction Address Regist

Re: [PATCH 1/2] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-09-08 Thread Athira Rajeev
> On 08-Sep-2021, at 10:47 AM, Michael Ellerman wrote: > > Athira Rajeev writes: >> Patch adds support to include Sampled Instruction Address Register >> (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended >> registers. Update the definition

Re: [PATCH 0/2] powerpc/perf: Add instruction and data address registers to extended regs

2021-09-05 Thread Athira Rajeev
> On 02-Sep-2021, at 1:04 PM, kajoljain wrote: > > > > On 6/20/21 8:15 PM, Athira Rajeev wrote: >> Patch set adds PMU registers namely Sampled Instruction Address Register >> (SIAR) and Sampled Data Address Register (SDAR) as part of extended regs >> in

Re: [PATCH v1 2/4] powerpc/64s/perf: add power_pmu_running to query whether perf is being used

2021-08-18 Thread Athira Rajeev
> know whether it should enable MSR[EE] to improve PMI coverage. >>> >>> Cc: Madhavan Srinivasan >>> Cc: Athira Rajeev >>> Signed-off-by: Nicholas Piggin >>> --- >>> arch/powerpc/include/asm/hw_irq.h | 2 ++ >>> arch/powe

Re: [PATCH v2 00/60] KVM: PPC: Book3S HV P9: entry/exit optimisations

2021-08-16 Thread Athira Rajeev
demand faulting bug causing nested guest TM tests to TM Bad > Thing the host in rare cases. > - Re-name new "pmu=" command line option to "pmu_override=" and update > documentation wording. Hi Nick, For the PMU related changes, Reviewed-by: Athira Rajeev Thanks Athi

Re: [PATCH v1 17/55] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C

2021-08-14 Thread Athira Rajeev
lear our PMU register contents. > >> Also can’t we unconditionally do the MMCR0/MMCRA/ freeze settings in here ? >> do we need the if conditions for FC/PMCCEXT/BHRB ? > > I think it's possible, but pretty minimal advantage. I would prefer to > set them the way perf does for now. Sure Nick, Other changes look good to me. Reviewed-by: Athira Rajeev Thanks Athira > If we can move this code into perf/ > it should become easier for you to tweak things. > > Thanks, > Nick

Re: [PATCH v1 16/55] powerpc/64s: Implement PMU override command line option

2021-08-11 Thread Athira Rajeev
For advanced users, the option to pass value for MMCR1 is fine. But other >> cases, it could result in >> invalid event getting used. Do we need to restrict this boot time option for >> only PMC5/6 ? > > Depends what would be useful. We don't have to prevent the admin shooting > themselves in the foot with options like this, but if we can make it > safer without making it less useful then that's always a good option. Hi Nick I checked back on my comment and it will be difficult to add/maintain validity check for MMCR1 considering different platforms that we have. We can go ahead with present approach you have in this patch. Changes looks good to me. Reviewed-by: Athira Rajeev > > Thanks, > Nick

Re: [PATCH v1 17/55] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C

2021-08-08 Thread Athira Rajeev
> On 26-Jul-2021, at 9:19 AM, Nicholas Piggin wrote: > > Implement the P9 path PMU save/restore code in C, and remove the > POWER9/10 code from the P7/8 path assembly. > > -449 cycles (8533) POWER9 virt-mode NULL hcall > > Signed-off-by: Nicholas Piggin > --- >

Re: [PATCH v1 16/55] powerpc/64s: Implement PMU override command line option

2021-08-06 Thread Athira Rajeev
> On 26-Jul-2021, at 9:19 AM, Nicholas Piggin wrote: > > It can be useful in simulators (with very constrained environments) > to allow some PMCs to run from boot so they can be sampled directly > by a test harness, rather than having to run perf. > > A previous change freezes counters at

[PATCH V4 0/1] powerpc/perf: Clear pending PMI in ppmu callbacks

2021-07-20 Thread Athira Rajeev
and clearing function to arch/powerpc/include/asm/hw_irq.h and renamed function to "get_clear_pmi_irq_pending" - Along with checking for pending PMI bit in Paca, look for PMAO bit in MMCR0 register to decide on pending PMI interrupt. Athira Rajeev (1): powerpc

[PATCH V4 1/1] powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC

2021-07-20 Thread Athira Rajeev
ook3s on a race condition which can trigger these PMC messages during idle path in PowerNV. Fixes: f442d004806e ("powerpc/64s: Add support to mask perf interrupts and replay them") Reported-by: Nageswara R Sastry Suggested-by: Nicholas Piggin Suggested-by: Madhavan Sriniva

Re: [PATCH] powerpc/64s/perf: Always use SIAR for kernel interrupts

2021-07-20 Thread Athira Rajeev
ay 3.90% scvonly [kernel.vmlinux][k] replay_soft_interrupts Samples were present around interrupt replay code. After the fix, perf report didn’t have samples pointing to replay code. Tested-by: Athira Rajeev Thanks Athira > else if (!(ppmu->flags & PPMU_NO_SIPR) && regs_sipr(regs)) > use_siar = 0; > else > -- > 2.23.0 >

Re: [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C

2021-07-12 Thread Athira Rajeev
> On 12-Jul-2021, at 8:19 AM, Nicholas Piggin wrote: > > Excerpts from Athira Rajeev's message of July 10, 2021 12:47 pm: >> >> >>> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin wrote: >>> >>> Implement the P9 path PMU save/restore code in C, and remove the >>> POWER9/10 code from the P7/8

Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use

2021-07-12 Thread Athira Rajeev
On 12-Jul-2021, at 8:11 AM, Nicholas Piggin wrote: Excerpts from Athira Rajeev's message of July 10, 2021 12:50 pm: On 22-Jun-2021, at 4:27 PM, Nicholas Piggin wrote: KVM PMU management code looks for particular frozen/disabled bits in the PMU registers so it knows whether it must clear them when

Re: [PATCH] powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10

2021-07-12 Thread Athira Rajeev
> On 08-Jul-2021, at 9:13 PM, Paul A. Clarke wrote: > > On Thu, Jul 08, 2021 at 10:56:57PM +1000, Nicholas Piggin wrote: >> Excerpts from Athira Rajeev's message of July 7, 2021 4:39 pm: >>> From: Athira Rajeev >>> >>> Power10 performance mon

Re: [PATCH] powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10

2021-07-12 Thread Athira Rajeev
> On 08-Jul-2021, at 6:26 PM, Nicholas Piggin wrote: > > Excerpts from Athira Rajeev's message of July 7, 2021 4:39 pm: >> From: Athira Rajeev >> >> Power10 performance monitoring unit (PMU) driver uses performance >> monitor counter 5 (PMC5) and pe

Re: [PATCH V3 1/1] powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC

2021-07-12 Thread Athira Rajeev
in since PMAO bit is still set. But fails >> to find valid overflow since PMC get cleared in power_pmu_del. Patch >> fixes this by disabling PMXE along with disabling of other MMCR0 bits >> in power_pmu_disable. >> >> We can't just replay PMI any time. Hence t

[PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-07-11 Thread Athira Rajeev
g back the events. A new variable 'migrate' is introduced in 'struct cpu_hw_event' to achieve this for LPM cases in power_pmu_enable. Use the 'migrate' value to clear the PMC index (stored in event->hw.idx) for all events so that event count settings will get re-initialised correctly. Signed-off
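The 'migrate' handling described in this snippet can be pictured with a small sketch. This is not the code from the patch: the struct and function names below are hypothetical stand-ins, and treating index 0 as "no PMC assigned" and clearing the flag after use are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_EVENTS 8  /* hypothetical limit for this sketch */

/* Simplified stand-in for a perf event; only the PMC index is modelled. */
struct sketch_event {
    int idx;  /* PMC index; 0 is assumed to mean "not yet assigned" */
};

/* Simplified stand-in for the per-CPU PMU state mentioned in the snippet. */
struct sketch_cpu_hw_events {
    struct sketch_event *event[SKETCH_MAX_EVENTS];
    size_t n_events;
    bool migrate;  /* set when a partition migration has happened */
};

/*
 * On enable, if a migration was flagged, forget every event's PMC index so
 * that the normal enable path would program the counters again from scratch.
 */
static void sketch_pmu_enable(struct sketch_cpu_hw_events *cpuhw)
{
    assert(cpuhw->n_events <= SKETCH_MAX_EVENTS);

    if (!cpuhw->migrate)
        return;

    for (size_t i = 0; i < cpuhw->n_events; i++)
        cpuhw->event[i]->idx = 0;

    /* Assumption: the flag is one-shot and is cleared once acted upon. */
    cpuhw->migrate = false;
}
```

When the flag is not set, the sketch leaves every index untouched, so events keep their counters across ordinary disable/enable cycles.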

[PATCH] powerpc/perf: Enable PMU counters post partition migration if PMU is active

2021-07-11 Thread Athira Rajeev
pu_hw_event' to achieve this for LPM cases in power_pmu_enable. Use the 'migrate' value to clear the PMC index (stored in event->hw.idx) for all events so that event count settings will get re-initialised correctly. Signed-off-by: Athira Rajeev --- arch/powerpc/include/asm/rtas.h | 4 +++ a

[PATCH V3 0/1] powerpc/perf: Clear pending PMI in ppmu callbacks

2021-07-10 Thread Athira Rajeev
" - Along with checking for pending PMI bit in Paca, look for PMAO bit in MMCR0 register to decide on pending PMI interrupt. Athira Rajeev (1): powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC arch/powerpc/include/asm/hw_irq.h | 31

[PATCH V3 1/1] powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC

2021-07-10 Thread Athira Rajeev
/64s: Add support to mask perf interrupts and replay them") Reported-by: Nageswara R Sastry Suggested-by: Nicholas Piggin Suggested-by: Madhavan Srinivasan Signed-off-by: Athira Rajeev --- arch/powerpc/include/asm/hw_irq.h | 31 + arch/powerpc/perf/core-book3s.c | 49

Re: [RFC PATCH 10/43] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use

2021-07-09 Thread Athira Rajeev
> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin wrote: > > KVM PMU management code looks for particular frozen/disabled bits in > the PMU registers so it knows whether it must clear them when coming > out of a guest or not. Setting this up helps KVM make these optimisations > without getting

Re: [RFC PATCH 11/43] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C

2021-07-09 Thread Athira Rajeev
> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin wrote: > > Implement the P9 path PMU save/restore code in C, and remove the > POWER9/10 code from the P7/8 path assembly. > > -449 cycles (8533) POWER9 virt-mode NULL hcall > > Signed-off-by: Nicholas Piggin > --- >

Re: [RFC PATCH 27/43] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in

2021-07-07 Thread Athira Rajeev
> On 22-Jun-2021, at 4:27 PM, Nicholas Piggin wrote: > > Move the P9 guest/host register switching functions to the built-in > P9 entry code, and export it for nested to use as well. > > This allows more flexibility in scheduling these supervisor privileged > SPR accesses with the HV

[PATCH] powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10

2021-07-07 Thread Athira Rajeev
From: Athira Rajeev Power10 performance monitoring unit (PMU) driver uses performance monitor counter 5 (PMC5) and performance monitor counter 6 (PMC6) for counting instructions and cycles. Event used for cycles is PM_RUN_CYC and instructions is PM_RUN_INST_CMPL. But counting of these events

[PATCH 0/2] powerpc/perf: Add instruction and data address registers to extended regs

2021-06-20 Thread Athira Rajeev
and SDAR as part of the extended regs mask. Patch 2/2 includes perf tools side changes to add the SPRs to sample_reg_mask to use with -I? option. Athira Rajeev (2): powerpc/perf: Expose instruction and data address registers as part of extended regs tools/perf: Add perf tools support to expose

[PATCH 1/2] powerpc/perf: Expose instruction and data address registers as part of extended regs

2021-06-20 Thread Athira Rajeev
Patch adds support to include Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers. Update the definition of PERF_REG_PMU_MASK_300/31 and PERF_REG_EXTENDED_MAX to include these SPR's. Signed-off-by: Athira Rajeev --- arch
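As a rough illustration of what extending an extended-register mask and its upper bound involves, the sketch below builds a bitmask covering a contiguous range of register indices. The enum values and helper names are hypothetical and do not reproduce the kernel's PERF_REG_* definitions; the point is only that appending SIAR and SDAR before the "max" marker widens the mask automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices for this sketch (not the kernel's values). */
enum sketch_reg {
    SKETCH_REG_GPR_LAST = 3,   /* last ordinary register in this toy layout */
    SKETCH_REG_MMCR0,          /* first extended register */
    SKETCH_REG_MMCR1,
    SKETCH_REG_SIAR,           /* newly exposed extended register */
    SKETCH_REG_SDAR,           /* newly exposed extended register */
    SKETCH_REG_EXTENDED_MAX    /* one past the last extended register */
};

/* Bitmask with bits [first, end) set. */
static uint64_t sketch_mask_range(unsigned first, unsigned end)
{
    assert(first <= end && end < 64);
    return ((UINT64_C(1) << end) - 1) & ~((UINT64_C(1) << first) - 1);
}

/* Mask of all extended registers, derived from the enum bounds. */
static uint64_t sketch_extended_mask(void)
{
    return sketch_mask_range(SKETCH_REG_MMCR0, SKETCH_REG_EXTENDED_MAX);
}

/* Returns 1 if the register index is permitted by the mask, else 0. */
static int sketch_reg_allowed(uint64_t mask, unsigned reg)
{
    return reg < 64 && ((mask >> reg) & 1);
}
```

Deriving the mask from the enum bounds rather than hard-coding it is a design choice of this sketch; it keeps the mask and the "max" value from drifting apart when registers are added.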

[PATCH 2/2] tools/perf: Add perf tools support to expose instruction and data address registers as part of extended regs

2021-06-20 Thread Athira Rajeev
Patch enables presenting of Sampled Instruction Address Register (SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended registers for perf tool. Add these SPR's to sample_reg_mask in the tool side (to use with -I? option). Signed-off-by: Athira Rajeev --- tools/arch/powerpc

[PATCH] powerpc/perf: Fix crash with 'perf_instruction_pointer' when pmu is not set

2021-06-17 Thread Athira Rajeev
ernel.org Signed-off-by: Athira Rajeev Reported-by: Christophe Leroy Tested-by: Christophe Leroy --- arch/powerpc/perf/core-book3s.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index 16d4d1b..5162241 1

Re: Oops (NULL pointer) with 'perf record' of selftest 'null_syscall'

2021-06-17 Thread Athira Rajeev
> On 17-Jun-2021, at 10:05 PM, Christophe Leroy > wrote: > > > > On 17/06/2021 at 08:36, Athira Rajeev wrote: >>> On 16-Jun-2021, at 11:56 AM, Christophe Leroy >>> wrote: >>> >>> >>> >>> On 16/06/2021 at 0

Re: Oops (NULL pointer) with 'perf record' of selftest 'null_syscall'

2021-06-17 Thread Athira Rajeev
On 16-Jun-2021, at 11:56 AM, Christophe Leroy wrote: On 16/06/2021 at 05:40, Athira Rajeev wrote: On 16-Jun-2021, at 8:53 AM, Madhavan Srinivasan wrote: On 6/15/21 8:35 PM, Christophe Leroy wrote: For your information, I'm getting the following Oops. Detected with 5.13-rc6, it also oopses on 5.12

Re: Oops (NULL pointer) with 'perf record' of selftest 'null_syscall'

2021-06-15 Thread Athira Rajeev
er10 DD1 which has caused this breakage. My bad. We > are working on a fix patch > for the same and will post it out. Sorry again. > Hi Christophe, Can you please try with below patch in your environment and test if it works for you. From 55d3afc9369dfbe28a7152c8e9f856c11c7fe43d Mon

[PATCH V3 2/2] selftests/powerpc: EBB selftest for MMCR0 control for PMU SPRs in ISA v3.1

2021-05-25 Thread Athira Rajeev
attempting to read PMU registers via helper function "dump_ebb_state" for ISA v3.1. Signed-off-by: Athira Rajeev --- tools/testing/selftests/powerpc/pmu/ebb/Makefile | 2 +- .../powerpc/pmu/ebb/regs_access_pmccext_test.c | 63 ++ 2 files changed, 64 insertions(+),

[PATCH V3 1/2] selftests/powerpc: Fix "no_handler" EBB selftest

2021-05-25 Thread Athira Rajeev
as to dump the state of registers at the end of the test when the counters are frozen. But this will be achieved with the first call itself since sample period is set to low value and PMU will be frozen by then. Hence patch removes the dump which was done before closing of the event. Signed-off-by: Ath
