Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Andi Kleen
> LBR callstack fails for leaf function optimization. Where the callee does
> not return to its caller but instead to the caller's caller. That is the one
> case I know about. There are others I believe.
No it should work fine for this case. You just don't see the tail call, but the call stack

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Andi Kleen
On Wed, Nov 05, 2014 at 05:29:32PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 05, 2014 at 03:53:34PM +, Liang, Kan wrote:
> > > I don't think it would be very hard to modify the patch set to make that 3rd
> > > mode visible. Just need to make that new PERF_RECORD_* type visible to
> > >

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Andi Kleen
> The reason why using LBR call stack mode is restricted to user level
> only is because of
> a bug in the LBR call stack hardware which forces the kernel to drop
> LBR_FREEZE_PMI.
It works with PEBS events, just not with non PEBS (the patch currently does not implement this distinction and I'm

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Peter Zijlstra
On Wed, Nov 05, 2014 at 03:53:34PM +, Liang, Kan wrote:
> > I don't think it would be very hard to modify the patch set to make that 3rd
> > mode visible. Just need to make that new PERF_RECORD_* type visible to
> > user and modify the compatibility checks.
>
> It's not hard. But LBR is not

RE: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Liang, Kan
Thanks for your comments. There has been a lot of discussion about the patch, and it is hard to reply to each comment one by one, so I will try to address all the concerns here. The patchset doesn't try to introduce a 3rd independent callchain option. That's because LBR callstack has some limitations (only available for

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Peter Zijlstra
On Wed, Nov 05, 2014 at 02:22:07PM +0100, Stephane Eranian wrote:
> I tend to agree here. The problem with FP is that it is not easy to figure
> out how a binary has been compiled. Getting valid FP callchains for
> large binaries using lots of shared libraries is very challenging. All
> libraries

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Stephane Eranian
On Wed, Nov 5, 2014 at 1:49 PM, Peter Zijlstra wrote:
> On Wed, Nov 05, 2014 at 11:57:10AM +0100, Stephane Eranian wrote:
>> Yes, but I wonder how would the tool sort this out if you have FP and LBR
>> for each sample.
>
> That's the tools 'problem'. It currently can already have FP and Dwarf
>

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Peter Zijlstra
On Wed, Nov 05, 2014 at 11:57:10AM +0100, Stephane Eranian wrote:
> Yes, but I wonder how would the tool sort this out if you have FP and LBR
> for each sample.
That's the tools 'problem'. It currently can already have FP and Dwarf bits. And it does not need to request all of them.
> My

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Stephane Eranian
On Wed, Nov 5, 2014 at 11:43 AM, Peter Zijlstra wrote:
> On Wed, Nov 05, 2014 at 10:58:28AM +0100, Stephane Eranian wrote:
>> On Wed, Nov 5, 2014 at 10:21 AM, Peter Zijlstra wrote:
>> > On Tue, Nov 04, 2014 at 09:56:09PM -0500, Kan Liang wrote:
>> >> From: Yan, Zheng
>> >>
>> >> Only enable LBR

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Peter Zijlstra
On Wed, Nov 05, 2014 at 10:58:28AM +0100, Stephane Eranian wrote:
> On Wed, Nov 5, 2014 at 10:21 AM, Peter Zijlstra wrote:
> > On Tue, Nov 04, 2014 at 09:56:09PM -0500, Kan Liang wrote:
> >> From: Yan, Zheng
> >>
> >> Only enable LBR callstack when user requires fp callgraph. The feature
> >> is

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Stephane Eranian
On Wed, Nov 5, 2014 at 10:21 AM, Peter Zijlstra wrote:
> On Tue, Nov 04, 2014 at 09:56:09PM -0500, Kan Liang wrote:
>> From: Yan, Zheng
>>
>> Only enable LBR callstack when user requires fp callgraph. The feature
>> is not available when PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER
>> is

Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-05 Thread Peter Zijlstra
On Tue, Nov 04, 2014 at 09:56:09PM -0500, Kan Liang wrote:
> From: Yan, Zheng
>
> Only enable LBR callstack when user requires fp callgraph. The feature
> is not available when PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER
> is required.
> Also, this feature only affects how to get user

[PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

2014-11-04 Thread Kan Liang
From: Yan, Zheng
Only enable LBR callstack when user requires fp callgraph. The feature is not available when PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER is required. Also, this feature only affects how to get user callchain. The kernel callchain is always got by frame pointers.
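For readers following the thread, the compatibility rule in the patch description can be sketched roughly as below. This is not the code from the patch. It is a minimal illustration under stated assumptions: the SKETCH_* macros mirror the bit positions of the corresponding PERF_SAMPLE_* flags in linux/perf_event.h but are redefined locally so the example builds without kernel headers, and struct sketch_attr, sketch_use_lbr_callstack() and the hw_has_lbr_callstack parameter are hypothetical names. Treating an excluded user callchain as "do not use LBR" is also an assumption, drawn from the statement that the feature only affects the user callchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions mirror enum perf_event_sample_format in linux/perf_event.h.
 * They are redefined under hypothetical SKETCH_ names so that this example
 * compiles without kernel headers.
 */
#define SKETCH_SAMPLE_CALLCHAIN    (1ULL << 5)
#define SKETCH_SAMPLE_BRANCH_STACK (1ULL << 11)
#define SKETCH_SAMPLE_STACK_USER   (1ULL << 13)

/* Hypothetical, trimmed-down stand-in for struct perf_event_attr. */
struct sketch_attr {
	uint64_t sample_type;
	bool exclude_callchain_user;
};

/*
 * Decide whether the user callchain could come from LBR call stack mode.
 * Rules taken from the patch description:
 *   - only when an fp callgraph (PERF_SAMPLE_CALLCHAIN) is requested
 *   - not together with PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER
 * Assumption of this sketch: no point if the user callchain is excluded,
 * because the kernel callchain always comes from frame pointers.
 */
static bool sketch_use_lbr_callstack(const struct sketch_attr *attr,
				     bool hw_has_lbr_callstack)
{
	if (!hw_has_lbr_callstack)
		return false;
	if (!(attr->sample_type & SKETCH_SAMPLE_CALLCHAIN))
		return false;
	if (attr->sample_type &
	    (SKETCH_SAMPLE_BRANCH_STACK | SKETCH_SAMPLE_STACK_USER))
		return false;
	if (attr->exclude_callchain_user)
		return false;
	return true;
}

/* Usage example: a plain fp callgraph qualifies, a branch-stack request does not. */
static void sketch_example(void)
{
	struct sketch_attr fp = { .sample_type = SKETCH_SAMPLE_CALLCHAIN };
	struct sketch_attr br = { .sample_type = SKETCH_SAMPLE_CALLCHAIN |
						 SKETCH_SAMPLE_BRANCH_STACK };

	assert(sketch_use_lbr_callstack(&fp, true));
	assert(!sketch_use_lbr_callstack(&br, true));
}
```

Because the check is folded into the existing fp callgraph request, nothing new is exposed to user space. That is the design choice the replies in this thread debate, as opposed to making LBR callstack a visible third callchain mode.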
