Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
On Wed, Apr 09, 2014 at 02:21:49PM +0900, Namhyung Kim wrote:
> > create a new 'physid mode' to group all the sorting rules together
> > (mimics the mem-mode)
>
> What is 'physid' then?  I guess you meant physical id but it seems
> unique id or unique map id looks like a better fit IMHO.

I suspect this is legacy naming; they used to do this using physical
addresses.
Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
On Mon, 24 Mar 2014 16:57:18 -0400, Don Zickus wrote:
> In order for the c2c tool to work correctly, it needs to properly
> sort all the records on uniquely identifiable data addresses.  These
> unique addresses are converted from virtual addresses provided by the
> hardware into a kernel address using an mmap2 record as the decoder.
>
> Once a unique address is converted, we can sort on them based on
> various rules.  Then it becomes clear which addresses are overlapping
> with each other across mmap regions or pid spaces.
>
> This patch just creates the rules and inserts the records into a
> sort entry for safe keeping until later patches process them.
>
> The general sorting rule is:
>
> o group cpumodes together
> o if (nonzero major/minor number - ie mmap'd areas)
>   o sort on major, minor, inode, inode generation numbers
> o else if cpumode is not kernel
>   o sort on pid
> o sort on data addresses
>
> I also hacked in the concept of 'color'.  The purpose of that bit is to
> provide hints later when processing these records that indicate a new unique
> address has been encountered.  Because later processing only checks the data
> addresses, there can be a theoretical scenario that similar sequential data
> addresses (when walking the rbtree) could be misinterpreted as overlapping
> when in fact they are not.
>
> Sample output: (perf report --stdio --physid-mode)
>
> Overhead  Data Address        Source Address         Command: Pid      Tid  Major  Minor  Inode  Inode Gen
> ........  ..................  .....................  ................  ...  .....  .....  .....  .........
>   18.93%  [k] 0xc900139c40b0  [k] igb_update_stats   kworker/0:1: 257  257      0      0      0          0
>    7.63%  [k] 0x88082e6cf0a8  [k] watchdog_timer_fn  swapper: 0          0      0      0      0          0
>    1.86%  [k] 0x88042ef94700  [k] _raw_spin_lock     swapper: 0          0      0      0      0          0
>    1.77%  [k] 0x8804278afa50  [k] __switch_to        swapper: 0          0      0      0      0          0
>
> V4: add manpage entry in perf-report
>
> V3: split out the sorting into unique entries.  This makes it look
>     far less ugly
>     create a new 'physid mode' to group all the sorting rules together
>     (mimics the mem-mode)

What is 'physid' then?  I guess you meant physical id but it seems
unique id or unique map id looks like a better fit IMHO.

>
> Signed-off-by: Don Zickus
> ---
>  tools/perf/Documentation/perf-report.txt |  23 +++
>  tools/perf/builtin-report.c              |  20 ++-
>  tools/perf/util/hist.c                   |  27 ++-
>  tools/perf/util/hist.h                   |   8 +
>  tools/perf/util/sort.c                   | 294 +++
>  tools/perf/util/sort.h                   |  13 ++
>  6 files changed, 381 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 8eab8a4..01391b0 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -95,6 +95,23 @@ OPTIONS
>          And default sort keys are changed to comm, dso_from, symbol_from, dso_to
>          and symbol_to, see '--branch-stack'.
>
> +        If --physid-mode option is used, following sort keys are also
> +        available:
> +        daddr, iaddr, pid, tid, major, minor, inode, inode_gen.
> +
> +        - daddr: data address (sorted based on major, minor, inode and inode
> +        generation numbers if shared, otherwise pid)

By "if shared", did you mean "for shared file mapping"?

> +        - iaddr: instruction address
> +        - pid: command and pid of the task
> +        - tid: tid of the task
> +        - major: major number of mapped location (0 if not mapped)
> +        - minor: minor number of mapped location (0 if not mapped)
> +        - inode: inode number of mapped location (0 if not mapped)
> +        - inode_gen: inode generation number of mapped location (0 if not mapped)

s/if not mapped/if not file-mapped/ ?

> +
> +        And default sort keys are changed to daddr, iaddr, pid, tid, major,
> +        minor, inode and inode_gen, see '--physid-mode'.
> +
>  -p::
>  --parent=<regex>::
>          A regex filter to identify parent. The parent is a caller of this
> @@ -223,6 +240,12 @@ OPTIONS
>          branch stacks and it will automatically switch to the branch view mode,
>          unless --no-branch-stack is used.
>
> +--physid-mode::
> +        Use the data addresses sampled using perf record -d and combine them
> +        with the mmap'd area region where they are located.  This helps identify
> +        which data addresses collide with similar addresses in another process
> +        space.  See --sort for output choices.
> +
>  --objdump=<path>::
>          Path to objdump binary.
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index c87412b..093f5ad 100644
> ---
Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
On Sat, Mar 29, 2014 at 06:11:52PM +0100, Jiri Olsa wrote:
> On Mon, Mar 24, 2014 at 04:57:18PM -0400, Don Zickus wrote:
> > In order for the c2c tool to work correctly, it needs to properly
> > sort all the records on uniquely identifiable data addresses.  These
> > unique addresses are converted from virtual addresses provided by the
> > hardware into a kernel address using an mmap2 record as the decoder.
> >
> > Once a unique address is converted, we can sort on them based on
> > various rules.  Then it becomes clear which addresses are overlapping
> > with each other across mmap regions or pid spaces.
> >
> > This patch just creates the rules and inserts the records into a
> > sort entry for safe keeping until later patches process them.
> >
> > The general sorting rule is:
>
> SNIP
>
> > +
> > +static int64_t
> > +sort__physid_major_cmp(struct hist_entry *left, struct hist_entry *right)
> > +{
> > +        struct map *l = left->mem_info->daddr.map;
> > +        struct map *r = right->mem_info->daddr.map;
> > +
> > +        return r->maj - l->maj;
>
> I got segfault here, and consequently in all other sorting
> functions, because it failed to resolve map earlier in
> ip__resolve_data
>
> we need to check it here, or before adding to the tree

Crap.  I checked it before, when I had one big function.  I forgot to
carry that though.  Honestly I would love to block these before they made
it to the sort routine but don't know a good way without adding checks to
all the builtins.

Cheers,
Don
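
[Editor's sketch, not part of the thread.] The "block these before they
made it to the sort routine" idea could look roughly like the following:
a single predicate applied once, before samples are handed to the
sorter, so that the individual comparators can assume a resolved map.
All types and names here (struct sample, sample_is_sortable,
drop_unresolved) are hypothetical stand-ins and are not taken from the
perf sources or from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the perf structures. */
struct map {
	unsigned int maj, min;
};

struct mem_info {
	struct {
		struct map *map;	/* NULL when the address was not resolved */
	} daddr;
};

struct sample {
	struct mem_info *mem_info;
};

/* True only if every pointer a physid comparator would follow is set. */
static bool sample_is_sortable(const struct sample *s)
{
	return s != NULL && s->mem_info != NULL &&
	       s->mem_info->daddr.map != NULL;
}

/*
 * Compact the array in place, keeping only sortable samples.
 * Returns the number of samples kept; their relative order is preserved.
 */
static size_t drop_unresolved(struct sample **samples, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (sample_is_sortable(samples[i]))
			samples[kept++] = samples[i];
	}
	return kept;
}
```

Whether unresolved samples should be dropped, counted separately, or
grouped under a placeholder entry is a policy question this sketch does
not answer; it only shows where a single check could live.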
Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
On Mon, Mar 24, 2014 at 04:57:18PM -0400, Don Zickus wrote:
> In order for the c2c tool to work correctly, it needs to properly
> sort all the records on uniquely identifiable data addresses.  These
> unique addresses are converted from virtual addresses provided by the
> hardware into a kernel address using an mmap2 record as the decoder.
>
> Once a unique address is converted, we can sort on them based on
> various rules.  Then it becomes clear which addresses are overlapping
> with each other across mmap regions or pid spaces.
>
> This patch just creates the rules and inserts the records into a
> sort entry for safe keeping until later patches process them.
>
> The general sorting rule is:

SNIP

> +
> +static int64_t
> +sort__physid_major_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +        struct map *l = left->mem_info->daddr.map;
> +        struct map *r = right->mem_info->daddr.map;
> +
> +        return r->maj - l->maj;

I got segfault here, and consequently in all other sorting
functions, because it failed to resolve map earlier in
ip__resolve_data

we need to check it here, or before adding to the tree

thanks,
jirka
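
[Editor's sketch, not part of the thread.] The "check it here" option
could be written along these lines: the comparator fetches the map
through a helper that tolerates missing links, and treats an unresolved
map as major number 0. The structures below are simplified, hypothetical
stand-ins for the perf ones, and the "treat as 0" fallback is an
assumption made for illustration, not something the thread settles. The
sketch also widens both values before subtracting; if maj is an unsigned
32-bit field, as modelled here, subtracting first would wrap around
instead of going negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-ins for the perf structures. */
struct map {
	uint32_t maj;
};

struct addr_map_symbol {
	struct map *map;	/* NULL when resolution failed */
};

struct mem_info {
	struct addr_map_symbol daddr;
};

struct hist_entry {
	struct mem_info *mem_info;
};

/* Return the data-address map, or NULL if any link in the chain is missing. */
static struct map *entry_daddr_map(const struct hist_entry *he)
{
	if (he == NULL || he->mem_info == NULL)
		return NULL;
	return he->mem_info->daddr.map;
}

/*
 * NULL-tolerant variant of the quoted comparator.  It keeps the quoted
 * "right minus left" direction, but widens to int64_t before subtracting.
 */
static int64_t physid_major_cmp_safe(const struct hist_entry *left,
				     const struct hist_entry *right)
{
	const struct map *l = entry_daddr_map(left);
	const struct map *r = entry_daddr_map(right);
	int64_t lmaj = l ? (int64_t)l->maj : 0;
	int64_t rmaj = r ? (int64_t)r->maj : 0;

	return rmaj - lmaj;
}
```

The same helper could back the minor, inode and inode generation
comparators, which per the report above fail in the same way.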
[PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
In order for the c2c tool to work correctly, it needs to properly
sort all the records on uniquely identifiable data addresses.  These
unique addresses are converted from virtual addresses provided by the
hardware into a kernel address using an mmap2 record as the decoder.

Once a unique address is converted, we can sort on them based on
various rules.  Then it becomes clear which addresses are overlapping
with each other across mmap regions or pid spaces.

This patch just creates the rules and inserts the records into a
sort entry for safe keeping until later patches process them.

The general sorting rule is:

o group cpumodes together
o if (nonzero major/minor number - ie mmap'd areas)
  o sort on major, minor, inode, inode generation numbers
o else if cpumode is not kernel
  o sort on pid
o sort on data addresses

I also hacked in the concept of 'color'.  The purpose of that bit is to
provide hints later when processing these records that indicate a new unique
address has been encountered.  Because later processing only checks the data
addresses, there can be a theoretical scenario that similar sequential data
addresses (when walking the rbtree) could be misinterpreted as overlapping
when in fact they are not.

Sample output: (perf report --stdio --physid-mode)

Overhead  Data Address        Source Address         Command: Pid      Tid  Major  Minor  Inode  Inode Gen
........  ..................  .....................  ................  ...  .....  .....  .....  .........
  18.93%  [k] 0xc900139c40b0  [k] igb_update_stats   kworker/0:1: 257  257      0      0      0          0
   7.63%  [k] 0x88082e6cf0a8  [k] watchdog_timer_fn  swapper: 0          0      0      0      0          0
   1.86%  [k] 0x88042ef94700  [k] _raw_spin_lock     swapper: 0          0      0      0      0          0
   1.77%  [k] 0x8804278afa50  [k] __switch_to        swapper: 0          0      0      0      0          0

V4: add manpage entry in perf-report

V3: split out the sorting into unique entries.  This makes it look
    far less ugly
    create a new 'physid mode' to group all the sorting rules together
    (mimics the mem-mode)

Signed-off-by: Don Zickus
---
 tools/perf/Documentation/perf-report.txt |  23 +++
 tools/perf/builtin-report.c              |  20 ++-
 tools/perf/util/hist.c                   |  27 ++-
 tools/perf/util/hist.h                   |   8 +
 tools/perf/util/sort.c                   | 294 +++
 tools/perf/util/sort.h                   |  13 ++
 6 files changed, 381 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8eab8a4..01391b0 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -95,6 +95,23 @@ OPTIONS
         And default sort keys are changed to comm, dso_from, symbol_from, dso_to
         and symbol_to, see '--branch-stack'.

+        If --physid-mode option is used, following sort keys are also
+        available:
+        daddr, iaddr, pid, tid, major, minor, inode, inode_gen.
+
+        - daddr: data address (sorted based on major, minor, inode and inode
+        generation numbers if shared, otherwise pid)
+        - iaddr: instruction address
+        - pid: command and pid of the task
+        - tid: tid of the task
+        - major: major number of mapped location (0 if not mapped)
+        - minor: minor number of mapped location (0 if not mapped)
+        - inode: inode number of mapped location (0 if not mapped)
+        - inode_gen: inode generation number of mapped location (0 if not mapped)
+
+        And default sort keys are changed to daddr, iaddr, pid, tid, major,
+        minor, inode and inode_gen, see '--physid-mode'.
+
 -p::
 --parent=<regex>::
         A regex filter to identify parent. The parent is a caller of this
@@ -223,6 +240,12 @@ OPTIONS
         branch stacks and it will automatically switch to the branch view mode,
         unless --no-branch-stack is used.

+--physid-mode::
+        Use the data addresses sampled using perf record -d and combine them
+        with the mmap'd area region where they are located.  This helps identify
+        which data addresses collide with similar addresses in another process
+        space.  See --sort for output choices.
+
 --objdump=<path>::
         Path to objdump binary.

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c87412b..093f5ad 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -49,6 +49,7 @@ struct report {
         bool                    show_threads;
         bool                    inverted_callchain;
         bool                    mem_mode;
+        bool                    physid_mode;
         bool                    header;
         bool                    header_only;
         int                     max_stack;
@@ -241,7 +242,7 @@ static int
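
[Editor's sketch, not part of the patch.] The "general sorting rule"
from the changelog can be read as one lexicographic comparison. The
version below is a minimal illustration of that reading, not the code in
tools/perf/util/sort.c, which is not shown in this excerpt. The struct
physid_key layout, the CPUMODE_* constants and the ascending sort
direction are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical cpumode values, used only by this sketch. */
enum { CPUMODE_KERNEL = 1, CPUMODE_USER = 2 };

/* Hypothetical flattened key: one record's identity for sorting. */
struct physid_key {
	uint8_t  cpumode;
	uint32_t maj, min;	/* 0/0 when the address is not file-mapped */
	uint64_t ino, ino_gen;
	int32_t  pid;
	uint64_t daddr;
};

/* Three-way compare that cannot overflow, unlike a plain subtraction. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

static int physid_key_cmp(const struct physid_key *l,
			  const struct physid_key *r)
{
	int c;

	/* group cpumodes together */
	if ((c = cmp_u64(l->cpumode, r->cpumode)) != 0)
		return c;

	if (l->maj || l->min || r->maj || r->min) {
		/* mmap'd areas: identify by backing file, pid is ignored */
		if ((c = cmp_u64(l->maj, r->maj)) != 0)
			return c;
		if ((c = cmp_u64(l->min, r->min)) != 0)
			return c;
		if ((c = cmp_u64(l->ino, r->ino)) != 0)
			return c;
		if ((c = cmp_u64(l->ino_gen, r->ino_gen)) != 0)
			return c;
	} else if (l->cpumode != CPUMODE_KERNEL) {
		/* non-kernel, not file-mapped: address is private to a pid */
		if ((c = (l->pid > r->pid) - (l->pid < r->pid)) != 0)
			return c;
	}

	/* finally, sort on data addresses */
	return cmp_u64(l->daddr, r->daddr);
}
```

Under this reading, two processes touching the same offset of the same
file compare equal regardless of pid, while identical anonymous user
addresses in different pids stay apart, which appears to be the
distinction the changelog is after.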