Re: [PATCH] ARC: perf: Accommodate big-endian CPU
On Wed, Nov 27, 2019 at 11:01:23AM +0300, Alexey Brodkin wrote:
> 8-letter strings representing ARC perf events are stored in two
> 32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.
>
> And the same order of bytes in the word is used regardless of CPU endianness.
>
> Which means in case of big-endian CPU core we need to swap bytes to get
> the same order as if it was on little-endian CPU.
>
> Otherwise we're seeing the following error message on boot:
> ->8--
> ARC perf: 8 counters (32 bits), 40 conditions, [overflow IRQ support]
> sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
> Stack Trace:
>   arc_unwind_core+0xd4/0xfc
>   dump_stack+0x64/0x80
>   sysfs_warn_dup+0x46/0x58
>   sysfs_add_file_mode_ns+0xb2/0x168
>   create_files+0x70/0x2a0
> [ cut here ]
> WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144
> perf_event_sysfs_init+0x70/0xa0
> Failed to register pmu: arc_pct, reason -17
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
> Stack Trace:
>   arc_unwind_core+0xd4/0xfc
>   dump_stack+0x64/0x80
>   __warn+0x9c/0xd4
>   warn_slowpath_fmt+0x22/0x2c
>   perf_event_sysfs_init+0x70/0xa0
> ---[ end trace a75fb9a9837bd1ec ]---
> ->8--
>
> What happens here is that we're trying to register more than one raw perf event
> with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
> and encoded into two 32-bit words. In this particular case we deal with 2
> events:
> * "IJMP" which counts all jump & branch instructions
> * "IJMPC___" which counts only conditional jumps & branches
>
> Those strings are split in two 32-bit words this way "IJMP" + "" &
> "IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
> being big-endian then we read "PMJI" + "" & "PMJI" + "___C".
>
> And since we interpret the read array of ASCII letters as a null-terminated
> string, on a big-endian CPU we end up with 2 events of the same name "PMJI".
>
> Signed-off-by: Alexey Brodkin
> Cc: sta...@vger.kernel.org
> ---
>
> Greg, Sasha, this is the same patch as
> commit 5effc09c4907 ("ARC: perf: Accommodate big-endian CPU")
> but fine-tuned to be applicable to kernels 4.19 and older.

Thanks, now queued up.

greg k-h

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
RE: [PATCH] ARC: perf: Accommodate big-endian CPU
Hi Greg,

> -Original Message-
> From: linux-snps-arc On Behalf
> Of Greg Kroah-Hartman
> Sent: Thursday, November 21, 2019 11:40 PM
> To: Alexey Brodkin
> Cc: Sasha Levin; linux-snps-arc@lists.infradead.org;
> linux-ker...@vger.kernel.org;
> sta...@vger.kernel.org
> Subject: Re: [PATCH] ARC: perf: Accommodate big-endian CPU
>
> On Tue, Nov 05, 2019 at 07:52:16PM +, Alexey Brodkin wrote:
> > Hi Sasha, Greg,
> >
> > > -Original Message-
> > > From: Sasha Levin
> > > Sent: Saturday, October 26, 2019 4:11 PM
> > > To: Sasha Levin; Alexey Brodkin;
> > > linux-snps-a...@lists.infradead.org
> > > Cc: linux-ker...@vger.kernel.org; sta...@vger.kernel.org;
> > > sta...@vger.kernel.org
> > > Subject: Re: [PATCH] ARC: perf: Accommodate big-endian CPU
> > >
> > > Hi,
> > >
> > > [This is an automated email]
> > >
> > > This commit has been processed because it contains a -stable tag.
> > > The stable tag indicates that it's relevant for the following trees: all
> > >
> > > The bot has tested the following trees: v5.3.7, v4.19.80, v4.14.150,
> > > v4.9.197, v4.4.197.
> > >
> > > v5.3.7: Build OK!
> > > v4.19.80: Failed to apply! Possible dependencies:
> > >     0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
> > >     14f81a91ad29a ("ARC: perf: trivial code cleanup")
> > >     baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")
> > > v4.14.150: Failed to apply! Possible dependencies:
> > > v4.9.197: Failed to apply! Possible dependencies:
> > > v4.4.197: Failed to apply! Possible dependencies:
> >
> > Indeed the clash is due to
> > commit baf9cc85ba01f ("ARC: perf: move HW events mapping to separate
> > function") as tmp variable "j" was changed to "i".
> >
> > So that's a fixed hunk:
> > >8--
> > diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
> > index 8aec462d90fb..30f66b123541 100644
> > --- a/arch/arc/kernel/perf_event.c
> > +++ b/arch/arc/kernel/perf_event.c
> > @@ -490,8 +490,8 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
> >  	/* loop thru all available h/w condition indexes */
> >  	for (j = 0; j < cc_bcr.c; j++) {
> >  		write_aux_reg(ARC_REG_CC_INDEX, j);
> > -		cc_name.indiv.word0 = read_aux_reg(ARC_REG_CC_NAME0);
> > -		cc_name.indiv.word1 = read_aux_reg(ARC_REG_CC_NAME1);
> > +		cc_name.indiv.word0 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME0));
> > +		cc_name.indiv.word1 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME1));
> >
> >  		/* See if it has been mapped to a perf event_id */
> >  		for (i = 0; i < ARRAY_SIZE(arc_pmu_ev_hw_map); i++) {
> > >8--
> >
> > Should I send a formal patch with it or is it OK for now?
>
> We need a "formal" patch that we can apply if you want it applied.

Done, see https://patchwork.ozlabs.org/patch/1201398/ and your inbox.

-Alexey
[PATCH] ARC: perf: Accommodate big-endian CPU
8-letter strings representing ARC perf events are stored in two
32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.

And the same order of bytes in the word is used regardless of CPU endianness.

Which means in case of big-endian CPU core we need to swap bytes to get
the same order as if it was on little-endian CPU.

Otherwise we're seeing the following error message on boot:
->8--
ARC perf: 8 counters (32 bits), 40 conditions, [overflow IRQ support]
sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  sysfs_warn_dup+0x46/0x58
  sysfs_add_file_mode_ns+0xb2/0x168
  create_files+0x70/0x2a0
[ cut here ]
WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0
Failed to register pmu: arc_pct, reason -17
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  __warn+0x9c/0xd4
  warn_slowpath_fmt+0x22/0x2c
  perf_event_sysfs_init+0x70/0xa0
---[ end trace a75fb9a9837bd1ec ]---
->8--

What happens here is that we're trying to register more than one raw perf event
with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
and encoded into two 32-bit words. In this particular case we deal with 2
events:
 * "IJMP" which counts all jump & branch instructions
 * "IJMPC___" which counts only conditional jumps & branches

Those strings are split in two 32-bit words this way "IJMP" + "" &
"IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
being big-endian then we read "PMJI" + "" & "PMJI" + "___C".

And since we interpret the read array of ASCII letters as a null-terminated
string, on a big-endian CPU we end up with 2 events of the same name "PMJI".

Signed-off-by: Alexey Brodkin
Cc: sta...@vger.kernel.org
---

Greg, Sasha, this is the same patch as
commit 5effc09c4907 ("ARC: perf: Accommodate big-endian CPU")
but fine-tuned to be applicable to kernels 4.19 and older.

 arch/arc/kernel/perf_event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 8aec462d90fb..30f66b123541 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -490,8 +490,8 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
 	/* loop thru all available h/w condition indexes */
 	for (j = 0; j < cc_bcr.c; j++) {
 		write_aux_reg(ARC_REG_CC_INDEX, j);
-		cc_name.indiv.word0 = read_aux_reg(ARC_REG_CC_NAME0);
-		cc_name.indiv.word1 = read_aux_reg(ARC_REG_CC_NAME1);
+		cc_name.indiv.word0 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME0));
+		cc_name.indiv.word1 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME1));
 
 		/* See if it has been mapped to a perf event_id */
 		for (i = 0; i < ARRAY_SIZE(arc_pmu_ev_hw_map); i++) {
-- 
2.16.2
Re: [PATCH] ARC: perf: Accommodate big-endian CPU
On Tue, Nov 05, 2019 at 07:52:16PM +, Alexey Brodkin wrote:
> Hi Sasha, Greg,
>
> > -Original Message-
> > From: Sasha Levin
> > Sent: Saturday, October 26, 2019 4:11 PM
> > To: Sasha Levin; Alexey Brodkin;
> > linux-snps-a...@lists.infradead.org
> > Cc: linux-ker...@vger.kernel.org; sta...@vger.kernel.org;
> > sta...@vger.kernel.org
> > Subject: Re: [PATCH] ARC: perf: Accommodate big-endian CPU
> >
> > Hi,
> >
> > [This is an automated email]
> >
> > This commit has been processed because it contains a -stable tag.
> > The stable tag indicates that it's relevant for the following trees: all
> >
> > The bot has tested the following trees: v5.3.7, v4.19.80, v4.14.150,
> > v4.9.197, v4.4.197.
> >
> > v5.3.7: Build OK!
> > v4.19.80: Failed to apply! Possible dependencies:
> >     0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
> >     14f81a91ad29a ("ARC: perf: trivial code cleanup")
> >     baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")
> > v4.14.150: Failed to apply! Possible dependencies:
> > v4.9.197: Failed to apply! Possible dependencies:
> > v4.4.197: Failed to apply! Possible dependencies:
>
> Indeed the clash is due to
> commit baf9cc85ba01f ("ARC: perf: move HW events mapping to separate
> function") as tmp variable "j" was changed to "i".
>
> So that's a fixed hunk:
> >8--
> diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
> index 8aec462d90fb..30f66b123541 100644
> --- a/arch/arc/kernel/perf_event.c
> +++ b/arch/arc/kernel/perf_event.c
> @@ -490,8 +490,8 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
>  	/* loop thru all available h/w condition indexes */
>  	for (j = 0; j < cc_bcr.c; j++) {
>  		write_aux_reg(ARC_REG_CC_INDEX, j);
> -		cc_name.indiv.word0 = read_aux_reg(ARC_REG_CC_NAME0);
> -		cc_name.indiv.word1 = read_aux_reg(ARC_REG_CC_NAME1);
> +		cc_name.indiv.word0 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME0));
> +		cc_name.indiv.word1 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME1));
>
>  		/* See if it has been mapped to a perf event_id */
>  		for (i = 0; i < ARRAY_SIZE(arc_pmu_ev_hw_map); i++) {
> >8--
>
> Should I send a formal patch with it or is it OK for now?

We need a "formal" patch that we can apply if you want it applied.

thanks,

greg k-h
RE: [PATCH] ARC: perf: Accommodate big-endian CPU
Hi Sasha, Greg,

> -Original Message-
> From: Sasha Levin
> Sent: Saturday, October 26, 2019 4:11 PM
> To: Sasha Levin; Alexey Brodkin;
> linux-snps-a...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org; sta...@vger.kernel.org;
> sta...@vger.kernel.org
> Subject: Re: [PATCH] ARC: perf: Accommodate big-endian CPU
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v5.3.7, v4.19.80, v4.14.150,
> v4.9.197, v4.4.197.
>
> v5.3.7: Build OK!
> v4.19.80: Failed to apply! Possible dependencies:
>     0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
>     14f81a91ad29a ("ARC: perf: trivial code cleanup")
>     baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")
> v4.14.150: Failed to apply! Possible dependencies:
> v4.9.197: Failed to apply! Possible dependencies:
> v4.4.197: Failed to apply! Possible dependencies:

Indeed the clash is due to
commit baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function"),
as tmp variable "j" was changed to "i".

So that's a fixed hunk:
>8--
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 8aec462d90fb..30f66b123541 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -490,8 +490,8 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
 	/* loop thru all available h/w condition indexes */
 	for (j = 0; j < cc_bcr.c; j++) {
 		write_aux_reg(ARC_REG_CC_INDEX, j);
-		cc_name.indiv.word0 = read_aux_reg(ARC_REG_CC_NAME0);
-		cc_name.indiv.word1 = read_aux_reg(ARC_REG_CC_NAME1);
+		cc_name.indiv.word0 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME0));
+		cc_name.indiv.word1 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME1));
 
 		/* See if it has been mapped to a perf event_id */
 		for (i = 0; i < ARRAY_SIZE(arc_pmu_ev_hw_map); i++) {
>8--

Should I send a formal patch with it or is it OK for now?

-Alexey
Re: [PATCH] ARC: perf: Accommodate big-endian CPU
Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.3.7, v4.19.80, v4.14.150, v4.9.197, v4.4.197.

v5.3.7: Build OK!
v4.19.80: Failed to apply! Possible dependencies:
    0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
    14f81a91ad29a ("ARC: perf: trivial code cleanup")
    baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")

v4.14.150: Failed to apply! Possible dependencies:
    0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
    14f81a91ad29a ("ARC: perf: trivial code cleanup")
    4d431290402c8 ("ARCv2: perf: tweak overflow interrupt")
    baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")

v4.9.197: Failed to apply! Possible dependencies:
    0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
    14f81a91ad29a ("ARC: perf: trivial code cleanup")
    4d431290402c8 ("ARCv2: perf: tweak overflow interrupt")
    baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")

v4.4.197: Failed to apply! Possible dependencies:
    05c74e5e53f6c ("bpf: add bpf_skb_load_bytes helper")
    0e956150fe09f ("ARC: perf: introduce Kernel PMU events support")
    14f81a91ad29a ("ARC: perf: trivial code cleanup")
    3379e0c3effa8 ("perf tools: Document the perf sysctls")
    45d8390c56bd2 ("bpf: hash: move select_bucket() out of htab's spinlock")
    538950a1b7527 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
    568b329a02f75 ("perf: generalize perf_callchain")
    6591f1e6662dd ("bpf: hash: use atomic count")
    688ecfe602205 ("bpf: hash: use per-bucket spinlock")
    75925e1ad7f5a ("perf/x86: Optimize stack walk user accesses")
    781c53bc5d562 ("bpf: export helper function flags and reject invalid ones")
    824bd0ce6c7c4 ("bpf: introduce BPF_MAP_TYPE_PERCPU_HASH map")
    a10423b87a7ea ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
    baf9cc85ba01f ("ARC: perf: move HW events mapping to separate function")
    c5dfd78eb7985 ("perf core: Allow setting up max frame stack depth via sysctl")
    c6c33454072fc ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key")
    cfbcf468454ab ("perf core: Pass max stack as a perf_callchain_entry context")
    d5a3b1f691865 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE")
    e32ea7e747271 ("soreuseport: fast reuseport UDP socket selection")
    ef456144da8ef ("soreuseport: define reuseport groups")
    f8ffad69c9f8b ("bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions")

NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks,
Sasha
[PATCH] ARC: perf: Accommodate big-endian CPU
8-letter strings representing ARC perf events are stored in two
32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.

And the same order of bytes in the word is used regardless of CPU endianness.

Which means in case of big-endian CPU core we need to swap bytes to get
the same order as if it was on little-endian CPU.

Otherwise we're seeing the following error message on boot:
->8--
ARC perf: 8 counters (32 bits), 40 conditions, [overflow IRQ support]
sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  sysfs_warn_dup+0x46/0x58
  sysfs_add_file_mode_ns+0xb2/0x168
  create_files+0x70/0x2a0
[ cut here ]
WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0
Failed to register pmu: arc_pct, reason -17
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  __warn+0x9c/0xd4
  warn_slowpath_fmt+0x22/0x2c
  perf_event_sysfs_init+0x70/0xa0
---[ end trace a75fb9a9837bd1ec ]---
->8--

What happens here is that we're trying to register more than one raw perf event
with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
and encoded into two 32-bit words. In this particular case we deal with 2
events:
 * "IJMP" which counts all jump & branch instructions
 * "IJMPC___" which counts only conditional jumps & branches

Those strings are split in two 32-bit words this way "IJMP" + "" &
"IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
being big-endian then we read "PMJI" + "" & "PMJI" + "___C".

And since we interpret the read array of ASCII letters as a null-terminated
string, on a big-endian CPU we end up with 2 events of the same name "PMJI".

Signed-off-by: Alexey Brodkin
Cc: sta...@vger.kernel.org
---
 arch/arc/kernel/perf_event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 861a8aea51f9..661fd842ea97 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -614,8 +614,8 @@ static int arc_pmu_device_probe(struct platform_device *pdev)
 	/* loop thru all available h/w condition indexes */
 	for (i = 0; i < cc_bcr.c; i++) {
 		write_aux_reg(ARC_REG_CC_INDEX, i);
-		cc_name.indiv.word0 = read_aux_reg(ARC_REG_CC_NAME0);
-		cc_name.indiv.word1 = read_aux_reg(ARC_REG_CC_NAME1);
+		cc_name.indiv.word0 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME0));
+		cc_name.indiv.word1 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME1));
 
 		arc_pmu_map_hw_event(i, cc_name.str);
 		arc_pmu_add_raw_event_attr(i, cc_name.str);
-- 
2.16.2