Re: [tip:sched/core] sched/x86: Optimize switch_mm() for multi-threaded workloads

2013-08-02 Thread Joe Mario
2013 22:14:21 -0400 Committer: Ingo Molnar mi...@kernel.org CommitDate: Thu, 1 Aug 2013 09:10:26 +0200 sched/x86: Optimize switch_mm() for multi-threaded workloads Dick Fowles, Don Zickus and Joe Mario have been working on improvements to perf, and noticed heavy cache line contention

Re: [kallsyms] general protection fault: 0000 [#1] PREEMPT DEBUG_PAGEALLOC

2013-11-11 Thread Joe Mario
On 11/11/2013 07:07 AM, Michal Marek wrote: On 10.11.2013 16:23, Andi Kleen wrote: On Sun, Nov 10, 2013 at 05:40:05PM +0800, Fengguang Wu wrote: Hi Joe, FYI. Here is another bisect result. I bet it's that strncpy() in kallsyms.c and someone passing in a too short buffer on a 32bit kernel.

Re: [PATCH 2/2] kallsyms: Revert back to 128 max symbol length

2013-11-12 Thread Joe Mario
On 11/11/2013 09:17 AM, Andi Kleen wrote: On Mon, Nov 11, 2013 at 02:40:36PM +0100, Michal Marek wrote: This reverts commits f3462aa (Kbuild: Handle longer symbols in kallsyms.c) and eea0e9c (kbuild: Increase kallsyms max symbol length) except for the added overflow check. The reason is a

Re: [PATCH 08/19] perf c2c: Shared data analyser

2014-02-28 Thread Joe Mario
Apologies for the resend. My first msg contained html in it. On 02/28/2014 04:03 PM, Davidlohr Bueso wrote: On Fri, 2014-02-28 at 14:46 -0500, Don Zickus wrote: On Fri, Feb 28, 2014 at 11:08:59AM -0800, Andi Kleen wrote: Don Zickus dzic...@redhat.com writes: + +static const struct

Re: [PATCH 08/19] perf c2c: Shared data analyser

2014-03-03 Thread Joe Mario
On 03/03/2014 12:23 PM, Andi Kleen wrote: Hmm, so based on Andi's reply, I am assuming you are running on a Westmere (or Nehalem) due to the lack of mem-stores. If you don't have mem-stores, this tool isn't going to work. The tool can only detect contention when sampling reads _and_ writes to

Re: perf: Translating mmap2 ids into socket info?

2014-10-22 Thread Joe Mario
On 10/22/2014 12:45 PM, Peter Zijlstra wrote: On Wed, Oct 22, 2014 at 12:20:26PM -0400, Don Zickus wrote: Hi, A question/request came up during our cache to cache analysis. We were wondering if given a unique mmap2 id (major, minor, inode, inode generation), if it was possible to determine a

Re: [PATCH] perf tool: Fix ppid for synthesized fork events

2015-03-25 Thread Joe Mario
On 03/24/2015 05:12 PM, David Ahern wrote: On 3/24/15 2:10 PM, Don Zickus wrote: He does this with and without the patch. The difference is usually over 50% extra time with the patch for both the record timings and report timings.:-( I find that shocking. The patch only populates ppid and

Re: [Questions] perf c2c: What's the current status of perf c2c?

2015-12-09 Thread Joe Mario
[RESEND - this time w/o html junk] On 12/09/2015 04:34 AM, Peter Zijlstra wrote: On Wed, Dec 09, 2015 at 09:04:40AM +0100, Jiri Olsa wrote: On Wed, Dec 09, 2015 at 12:06:44PM +0800, Yunlong Song wrote: Hi, Don, I am interested in the perf c2c tool, which is introduced in:

Re: [Questions] perf c2c: What's the current status of perf c2c?

2015-12-09 Thread Joe Mario
On 12/09/2015 12:15 PM, Stephane Eranian wrote: If I recall the c2c tool is giving you more than the bouncing line. It shows you the offset inside the line and the participating CPUs. Correct. It shows much more than the bouncing line. Appended below is the output for running "perf c2c" on

Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function

2016-09-19 Thread Joe Mario
On 09/19/2016 01:15 PM, Nilay Vaish wrote: On 19 September 2016 at 08:09, Jiri Olsa wrote: diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h index 7f69bf9d789d..27c6bb5abafb 100644 --- a/tools/perf/util/mem-events.h +++ b/tools/perf/util/mem-events.h @@

Re: [PATCHv4 00/57] perf c2c: Add new tool to analyze cacheline contention on NUMA systems

2016-10-01 Thread Joe Mario
On 09/29/2016 05:19 AM, Peter Zijlstra wrote: What I want is a tool that maps memop events (any PEBS memops) back to a 'type::member' form and sorts on that. That doesn't rely on the PEBS 'Data Linear Address' field, as that is useless for dynamically allocated bits. Instead it would use the

Re: [tip:sched/core] sched/x86: Optimize switch_mm() for multi-threaded workloads

2013-08-02 Thread Joe Mario
: Ingo Molnar CommitDate: Thu, 1 Aug 2013 09:10:26 +0200 sched/x86: Optimize switch_mm() for multi-threaded workloads Dick Fowles, Don Zickus and Joe Mario have been working on improvements to perf, and noticed heavy cache line contention on the mm_cpumask, running linpack on a 60 core / 120 thread

Re: [PATCH 08/19] perf c2c: Shared data analyser

2014-02-28 Thread Joe Mario
Apologies for the resend. My first msg contained html in it. On 02/28/2014 04:03 PM, Davidlohr Bueso wrote: On Fri, 2014-02-28 at 14:46 -0500, Don Zickus wrote: On Fri, Feb 28, 2014 at 11:08:59AM -0800, Andi Kleen wrote: Don Zickus writes: + +static const struct perf_evsel_str_handler

Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics

2020-10-14 Thread Joe Mario
"RMT Load Hit" > > tools/perf/builtin-c2c.c | 83 +------- > 1 file changed, 18 insertions(+), 65 deletions(-) Hi Leo: I ran your patches through some perf c2c tests and it all looks good. I agree the new format of the "Shared Data Cache Line Table" makes more sense now. And it still holds together nicely when sorted on local HitMs (-d lcl). Thank you for doing this. Joe Tested-by: Joe Mario

Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function

2016-09-19 Thread Joe Mario
On 09/19/2016 01:15 PM, Nilay Vaish wrote: On 19 September 2016 at 08:09, Jiri Olsa wrote: diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h index 7f69bf9d789d..27c6bb5abafb 100644 --- a/tools/perf/util/mem-events.h +++ b/tools/perf/util/mem-events.h @@ -2,6 +2,10 @@

[tip:x86/asmlinkage] lto: Handle LTO common symbols in module loader

2014-02-13 Thread tip-bot for Joe Mario
Commit-ID: 80375980f1608f43b47abc2671456b23ec68c434 Gitweb: http://git.kernel.org/tip/80375980f1608f43b47abc2671456b23ec68c434 Author: Joe Mario jma...@redhat.com AuthorDate: Sat, 8 Feb 2014 09:01:09 +0100 Committer: H. Peter Anvin h...@linux.intel.com CommitDate: Thu, 13 Feb 2014 20:24

[tip:x86/asmlinkage] lto: Handle LTO common symbols in module loader

2014-02-13 Thread tip-bot for Joe Mario
Commit-ID: 80375980f1608f43b47abc2671456b23ec68c434 Gitweb: http://git.kernel.org/tip/80375980f1608f43b47abc2671456b23ec68c434 Author: Joe Mario AuthorDate: Sat, 8 Feb 2014 09:01:09 +0100 Committer: H. Peter Anvin CommitDate: Thu, 13 Feb 2014 20:24:50 -0800 lto: Handle LTO common