Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-13 Thread Stephane Eranian
On Thu, Feb 13, 2014 at 2:02 PM, Jiri Olsa wrote: > On Tue, Feb 11, 2014 at 08:50:13AM -0300, Arnaldo Carvalho de Melo wrote: >> On Tue, Feb 11, 2014 at 12:14:21PM +0100, Peter Zijlstra wrote: >> > On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: >> > > Assuming you can decode

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-13 Thread Jiri Olsa
On Tue, Feb 11, 2014 at 08:50:13AM -0300, Arnaldo Carvalho de Melo wrote: > On Tue, Feb 11, 2014 at 12:14:21PM +0100, Peter Zijlstra wrote: > > On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: > > > Assuming you can decode and get the info about the base registers used, > > >

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 08:50:13AM -0300, Arnaldo Carvalho de Melo wrote: > 3) PERF_SAMPLE_REGS_USER (from a quick look, why do we have "USER" in > it? Jiri?) Note that the regs are in the POST instruction state, so any op that does something like: MOV %edx, $(eax+edx*8) Will have lost the

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 12:31:49PM +0100, Peter Zijlstra wrote: > On Tue, Feb 11, 2014 at 12:28:47PM +0100, Stephane Eranian wrote: > > On Tue, Feb 11, 2014 at 12:14 PM, Peter Zijlstra > > wrote: > > > On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: > > >> Assuming you can

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Arnaldo Carvalho de Melo
On Tue, Feb 11, 2014 at 12:14:21PM +0100, Peter Zijlstra wrote: > On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: > > Assuming you can decode and get the info about the base registers used, > > you'd have to do this for each arch with load/store sampling capabilities. > > this

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 12:28:47PM +0100, Stephane Eranian wrote: > On Tue, Feb 11, 2014 at 12:14 PM, Peter Zijlstra wrote: > > On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: > >> Assuming you can decode and get the info about the base registers used, > >> you'd have to do this

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Stephane Eranian
On Tue, Feb 11, 2014 at 12:14 PM, Peter Zijlstra wrote: > On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: >> Assuming you can decode and get the info about the base registers used, >> you'd have to do this for each arch with load/store sampling capabilities. >> this is painful

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 12:08:56PM +0100, Stephane Eranian wrote: > Assuming you can decode and get the info about the base registers used, > you'd have to do this for each arch with load/store sampling capabilities. > this is painful compared to getting the portable info from dwarf directly. But

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Stephane Eranian
On Tue, Feb 11, 2014 at 12:04 PM, Stephane Eranian wrote: > On Tue, Feb 11, 2014 at 12:02 PM, Peter Zijlstra wrote: >> On Tue, Feb 11, 2014 at 11:58:45AM +0100, Stephane Eranian wrote: >>> On Tue, Feb 11, 2014 at 11:52 AM, Peter Zijlstra >>> wrote: >>> > On Tue, Feb 11, 2014 at 11:35:45AM

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 12:04:23PM +0100, Stephane Eranian wrote: > >> How do you know that load at addr 0x1000 is accessing variable bar? > >> The IP gives you line number, and then what? > >> I think dwarf has the mapping regs -> variable and yes, the type info. > >> But I am not sure that's

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Stephane Eranian
On Tue, Feb 11, 2014 at 12:02 PM, Peter Zijlstra wrote: > On Tue, Feb 11, 2014 at 11:58:45AM +0100, Stephane Eranian wrote: >> On Tue, Feb 11, 2014 at 11:52 AM, Peter Zijlstra >> wrote: >> > On Tue, Feb 11, 2014 at 11:35:45AM +0100, Stephane Eranian wrote: >> >> On Tue, Feb 11, 2014 at 8:14 AM,

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 11:58:45AM +0100, Stephane Eranian wrote: > On Tue, Feb 11, 2014 at 11:52 AM, Peter Zijlstra wrote: > > On Tue, Feb 11, 2014 at 11:35:45AM +0100, Stephane Eranian wrote: > >> On Tue, Feb 11, 2014 at 8:14 AM, Peter Zijlstra > >> wrote: > >> > > >> > That blows; how much

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Stephane Eranian
On Tue, Feb 11, 2014 at 11:52 AM, Peter Zijlstra wrote: > On Tue, Feb 11, 2014 at 11:35:45AM +0100, Stephane Eranian wrote: >> On Tue, Feb 11, 2014 at 8:14 AM, Peter Zijlstra wrote: >> > >> > That blows; how much is missing? >> >> They need to annotate load and stores. I asked for that feature a

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Peter Zijlstra
On Tue, Feb 11, 2014 at 11:35:45AM +0100, Stephane Eranian wrote: > On Tue, Feb 11, 2014 at 8:14 AM, Peter Zijlstra wrote: > > > > That blows; how much is missing? > > They need to annotate load and stores. I asked for that feature a while ago. > It will come. And there is no way to deduce the

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-11 Thread Stephane Eranian
On Tue, Feb 11, 2014 at 8:14 AM, Peter Zijlstra wrote: > On Mon, Feb 10, 2014 at 11:21:53PM +0100, Stephane Eranian wrote: >> On Mon, Feb 10, 2014 at 10:29 PM, Peter Zijlstra >> wrote: >> > On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: >> >> The data output is verbose and there

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Peter Zijlstra
On Mon, Feb 10, 2014 at 11:21:53PM +0100, Stephane Eranian wrote: > On Mon, Feb 10, 2014 at 10:29 PM, Peter Zijlstra wrote: > > On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: > >> The data output is verbose and there are lots of data tables that > >> interpret the latencies > >> and

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Stephane Eranian
On Mon, Feb 10, 2014 at 10:29 PM, Peter Zijlstra wrote: > On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: >> The data output is verbose and there are lots of data tables that interpret >> the latencies >> and data addresses in different ways to help see where bottlenecks might be >>

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Don Zickus
On Mon, Feb 10, 2014 at 10:29:55PM +0100, Peter Zijlstra wrote: > On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: > > The data output is verbose and there are lots of data tables that interpret > > the latencies > > and data addresses in different ways to help see where bottlenecks

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Don Zickus
On Mon, Feb 10, 2014 at 10:18:25PM +0100, Peter Zijlstra wrote: > On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: > > With the introduction of NUMA systems, came the possibility of remote > > memory accesses. > > Combine those remote memory accesses with contention on the remote node

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Peter Zijlstra
On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: > The data output is verbose and there are lots of data tables that interpret > the latencies > and data addresses in different ways to help see where bottlenecks might be > lying. Would be good to see what the output looks like. What

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Peter Zijlstra
On Mon, Feb 10, 2014 at 12:28:55PM -0500, Don Zickus wrote: > With the introduction of NUMA systems, came the possibility of remote memory > accesses. > Combine those remote memory accesses with contention on the remote node (ie a > modified > cacheline) and you have a possibility for very long

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Don Zickus
On Mon, Feb 10, 2014 at 10:59:30AM -0800, Davidlohr Bueso wrote: > This can be really useful for us performance folks, thanks. It seems > however that the first two patches in the series are missing. Odd, yes. For some reason they cc'd to me fine, just never made it to lkml. Let me resend them.

Re: [PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Davidlohr Bueso
This can be really useful for us performance folks, thanks. It seems however that the first two patches in the series are missing.

[PATCH 00/21] perf, c2c: Add new tool to analyze cacheline contention on NUMA systems

2014-02-10 Thread Don Zickus
With the introduction of NUMA systems, came the possibility of remote memory accesses. Combine those remote memory accesses with contention on the remote node (ie a modified cacheline) and you have a possibility for very long latencies. These latencies can bottleneck a program. The program
