Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-10-09 Thread Anshuman Khandual
On 09/26/2013 04:44 PM, Stephane Eranian wrote:
> So you are saying that the HW filter is exclusive. That seems odd. But I
> think it is because one of the choices is ANY. ANY covers all the types of
> branches. Therefore it does not make a difference whether you add COND or
> not. And vice-versa, if you set COND, you need to disable ANY. I bet if you
> add other filters such as CALL, RETURN, then you could OR them and say: I
> want RETURN or CALLS.
>
> But that's okay. The API operates in OR mode but if the HW does not support
> it, you can check the mask and reject if more than one type is set. That is
> arch-specific code. The alternative is to only capture ANY and emulate the
> filter in SW. This will work, of course. But the downside is that you lose
> the way to appreciate how many, for instance, COND branches you sampled out
> of the total number of COND branches retired. Unless you can count COND
> branches separately.

Hey Stephane,

Thanks for your reply. I am working on a solution where the PMU will process
all the requested branch filters in HW only if it can filter all of them in an
OR manner; otherwise it will leave the entire thing up to the SW to process and
do no filtering itself. This implies that branch filtering will happen either
completely in HW or completely in SW, and never in a mixed manner. This way
it will conform to the OR mode defined in the API. I will post the revised
patch set soon.
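
The all-HW-or-all-SW decision described above could be sketched roughly as
follows. This is only an illustration, not the code from the patch set: the
BR_* bits, hw_can_filter() and choose_filter_mode() are hypothetical names,
and the assumed hardware model is a PMU that can apply exactly one filter at
a time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical branch-type bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
enum {
	BR_ANY      = 1u << 0,
	BR_ANY_CALL = 1u << 1,
	BR_ANY_RET  = 1u << 2,
	BR_COND     = 1u << 3,
};

enum filter_mode { FILTER_IN_HW, FILTER_IN_SW };

/*
 * Assumed hardware model: the PMU can apply exactly one of these filters
 * and cannot OR two of them together.
 */
static bool hw_can_filter(uint32_t requested)
{
	return requested == BR_ANY ||
	       requested == BR_ANY_CALL ||
	       requested == BR_COND;
}

/* Pick one mode for the whole request; HW and SW filtering are never mixed. */
static enum filter_mode choose_filter_mode(uint32_t requested)
{
	return hw_can_filter(requested) ? FILTER_IN_HW : FILTER_IN_SW;
}
```

With this model a request for BR_COND alone stays in HW, while
BR_ANY_CALL | BR_COND falls back to SW for the whole request.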

Regards
Anshuman

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-26 Thread Stephane Eranian
On Mon, Sep 23, 2013 at 11:15 AM, Anshuman Khandual wrote:
> On 09/21/2013 12:25 PM, Stephane Eranian wrote:
>> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>>> This patchset is the re-spin of the original branch stack sampling
>>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>>> patchset also enables SW based branch filtering support for PPC64
>>>> platforms which have branch stack sampling support. With this new
>>>> enablement, the branch filter support for PPC64 platforms have been
>>>> extended to include all these combinations discussed below with a
>>>> sample test application program.
>>>
>>> ...
>>>
>>>> Mixed filters
>>>> -------------
>>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>>> Error:
>>>> The perf.data file has no samples!
>>>>
>>>> NOTE: As expected. The HW filters all the branches which are calls and
>>>> SW tries to find return branches in that given set. Both the filters
>>>> are mutually exclusive, so obviously no samples found in the end
>>>> profile.
>>>
>>> The semantics of multiple filters is not clear to me. It could be an OR,
>>> or an AND. You have implemented AND, does that match existing behaviour
>>> on x86 for example?
>>
>> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
>> On x86, the HW filter is an OR (default: ALL, set bit to disable a
>> type). I suspect it is similar on PPC.
>
> Hey Stephane,
>
> In POWER8 BHRB, we have got three HW PMU filters, out of which we are trying
> to use two: PERF_SAMPLE_BRANCH_ANY_CALL and PERF_SAMPLE_BRANCH_COND.
>
> (1) These filters are exclusive of each other and cannot be OR-ed with each
> other
>
So you are saying that the HW filter is exclusive. That seems odd. But I
think it is because one of the choices is ANY. ANY covers all the types of
branches. Therefore it does not make a difference whether you add COND or
not. And vice-versa, if you set COND, you need to disable ANY. I bet if you
add other filters such as CALL, RETURN, then you could OR them and say: I
want RETURN or CALLS.

But that's okay. The API operates in OR mode but if the HW does not support
it, you can check the mask and reject if more than one type is set. That is
arch-specific code. The alternative is to only capture ANY and emulate the
filter in SW. This will work, of course. But the downside is that you lose
the way to appreciate how many, for instance, COND branches you sampled out
of the total number of COND branches retired. Unless you can count COND
branches separately.
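
A minimal sketch of the "check the mask and reject if more than one type is
set" option, under stated assumptions: the BR_* bits and
check_single_filter() are hypothetical names and not the kernel's actual
definitions. EOPNOTSUPP is used because it is the error code mentioned
elsewhere in this thread for unsupported filters.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical branch-type bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
#define BR_ANY_CALL  (1u << 0)
#define BR_ANY_RET   (1u << 1)
#define BR_COND      (1u << 2)
#define BR_TYPE_MASK (BR_ANY_CALL | BR_ANY_RET | BR_COND)

/* Return 0 if at most one branch type is requested, -EOPNOTSUPP otherwise. */
static int check_single_filter(uint32_t requested)
{
	uint32_t types = requested & BR_TYPE_MASK;

	/* x & (x - 1) clears the lowest set bit; non-zero means 2+ bits set. */
	if (types & (types - 1))
		return -EOPNOTSUPP;

	return 0;
}
```

The trade-off is the one stated above: rejecting keeps the HW counts
meaningful, but users cannot combine filters on such hardware.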

> (2) The SW filters are applied on the branch record set captured with BHRB,
> which already has the HW filters applied. So the working set is already
> reduced by the HW PMU filters. The SW filter goes through the working set,
> figures out which of the records satisfy the SW filter criteria and picks
> them up. The SW filter cannot find branch records which match the criteria
> outside of the BHRB captured set. So we cannot OR the filters.
>
Yes, you can if you set the HW filter to ANY. And then filter the branches
by type based on the SW mask. You need to decode each sampled branch for
that. This is done in X86 to work around HW bugs in the HW filter, for
instance.
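
A rough sketch of that approach, assuming the HW has captured ANY branch and
each record has already been decoded into a type: struct branch_rec, the
BR_* bits and sw_filter_or() are hypothetical names for illustration, and
the decoding step itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branch-type bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
#define BR_ANY_CALL (1u << 0)
#define BR_ANY_RET  (1u << 1)
#define BR_COND     (1u << 2)

/* One captured branch; 'type' holds a single BR_* bit set by a decoder. */
struct branch_rec {
	uint64_t from;
	uint64_t to;
	uint32_t type;
};

/*
 * Keep every record whose type matches ANY bit in sw_mask (OR semantics).
 * The array is compacted in place; the number of kept records is returned.
 */
static size_t sw_filter_or(struct branch_rec *recs, size_t n, uint32_t sw_mask)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (recs[i].type & sw_mask)
			recs[kept++] = recs[i];
	}

	return kept;
}
```

Because the HW stage does not drop anything here, a request such as
BR_ANY_CALL | BR_ANY_RET keeps both calls and returns.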

> This makes the combination of HW and SW filter inherently an "AND" not OR.
>
> (3) But once we have captured the BHRB filtered data with the HW PMU filter,
> multiple SW filters (if requested) can be applied either in an OR or an AND
> manner.
>
> It should be either like
> (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
> or like
> (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)
>
> NOTE: I admit that the current validate_instruction() function does not do
> either of them correctly. Will fix it in the next iteration.
>
Just set the HW filter to ANY and filter in SW.
Isn't that possible?

> (4) These combinations of filters are not supported right now because
>
> (a) We are unable to process two HW PMU filters simultaneously
> (b) We have not worked on a replacement SW filter for either of the HW
> filters
>
> (1) (HW_FILTER_1), (HW_FILTER_2)
> (2) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1)
> (3) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)
>
> However, these combinations of filters can be supported right now.
>
> (1) (HW_FILTER_1)
> (2) (HW_FILTER_2)
>
> (3) (SW_FILTER_1)
> (4) (SW_FILTER_2)
> (5) (SW_FILTER_1), (SW_FILTER_2)
>
> (6)  (HW_FILTER_1), (SW_FILTER_1)
> (7)  (HW_FILTER_1), (SW_FILTER_2)
> (8)  (HW_FILTER_1), (SW_FILTER_1), (SW_FILTER_2)
> (9)  (HW_FILTER_2), (SW_FILTER_1)
> (10) (HW_FILTER_2), (SW_FILTER_2)
> (11) (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)
>
>
> Given the situation as explained here, which semantic would be better for
> single HW and multiple

Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-25 Thread Anshuman Khandual
On 09/25/2013 07:49 AM, Michael Ellerman wrote:
> On Mon, 2013-09-23 at 14:45 +0530, Anshuman Khandual wrote:
>> On 09/21/2013 12:25 PM, Stephane Eranian wrote:
>>> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>>>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>>>> This patchset is the re-spin of the original branch stack sampling
>>>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>>>> patchset also enables SW based branch filtering support for PPC64
>>>>> platforms which have branch stack sampling support. With this new
>>>>> enablement, the branch filter support for PPC64 platforms have been
>>>>> extended to include all these combinations discussed below with a
>>>>> sample test application program.
>>>>
>>>> ...
>>>>
>>>>> Mixed filters
>>>>> -------------
>>>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>>>> Error:
>>>>> The perf.data file has no samples!
>>>>>
>>>>> NOTE: As expected. The HW filters all the branches which are calls and
>>>>> SW tries to find return branches in that given set. Both the filters
>>>>> are mutually exclusive, so obviously no samples found in the end
>>>>> profile.
>>>>
>>>> The semantics of multiple filters is not clear to me. It could be an OR,
>>>> or an AND. You have implemented AND, does that match existing behaviour
>>>> on x86 for example?
>>>
>>> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
>>> On x86, the HW filter is an OR (default: ALL, set bit to disable a
>>> type). I suspect it is similar on PPC.
>>
>> Given the situation as explained here, which semantic would be better for
>> single HW and multiple SW filters. Accordingly validate_instruction()
>> function will have to be re-implemented. But I believe OR-ing the SW
>> filters will be preferable.
>>
>>  (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
>>  or
>>  (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)
>>
>> Please let me know your inputs and suggestions on this. Thank you.
> 
> You need to implement the correct semantics, regardless of how the
> hardware happens to work.
> 
> That means if multiple filters are specified you need to do all the
> filtering in software.

Hello Stephane,

I looked at the X86 code on branch filtering implementation.

(1) During event creation, intel_pmu_hw_config calls intel_pmu_setup_lbr_filter
when LBR sampling is required. intel_pmu_setup_lbr_filter calls these two
functions:

(a) intel_pmu_setup_sw_lbr_filter

"event->hw.branch_reg.reg" contains all the SW filter masks which can be
supported for the user requested filters in event->attr.branch_sample_type
(even if some of them could be implemented in PMU HW).

(b) intel_pmu_setup_hw_lbr_filter (when HW filtering is present)

"event->hw.branch_reg.config" contains all the PMU HW filter masks
corresponding to the requested filters in event->attr.branch_sample_type.
One point to note here is that if the user has requested some branch filter
which is not supported in the HW LBR filter, the event creation request is
rejected with EOPNOTSUPP. This is not true for the filters which can be
ignored in the PMU.

(2) When the event is enabled in the PMU

(a) cpuc->lbr_sel->config gets written into the HW register to enable the
filtering of branches which was determined in the function
intel_pmu_setup_hw_lbr_filter.

(3) After the IRQ has happened, intel_pmu_lbr_read reads all the entries from
the LBR HW and then applies the filter in the function intel_pmu_lbr_filter.

(a) The intel_pmu_lbr_filter function takes into account cpuc->br_sel
(which is nothing but event->hw.branch_reg.reg as determined in the function
intel_pmu_setup_sw_lbr_filter), which contains the entire branch filter
request set in terms of applicable SW filters. Here the semantic is OR when
we look at it from the SW filter implementation point of view.

BUT what branch record set are we working on right now? A set which was
captured with LBR HW with the cpuc->lbr_sel->config filters enabled on it. So
to me the X86 implementation of the semantics looks something like this.

A - Branch filter set requested by the user
B - Subset of A which can be supported in HW
C - Subset of A which can be supported in SW

(B) && (C) 

NOTE: Individual filters are OR-ed inside both B and C sets.
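
The (B) && (C) reading above can be contrasted with a true OR over A in a
small sketch. This only models the semantics as described in this message
and has not been checked against the x86 code; the BR_* bits and both
helper functions are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical branch-type bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
#define BR_ANY_CALL (1u << 0)
#define BR_ANY_RET  (1u << 1)
#define BR_IND_CALL (1u << 2)

/*
 * (B) && (C): a branch survives only if it passes the HW mask B and then
 * the SW mask C. Inside each mask the individual filter bits are OR-ed.
 */
static bool kept_hw_and_sw(uint32_t branch_type, uint32_t hw_mask_b,
			   uint32_t sw_mask_c)
{
	bool passes_hw = (branch_type & hw_mask_b) != 0;
	bool passes_sw = (branch_type & sw_mask_c) != 0;

	return passes_hw && passes_sw;
}

/* True OR over the full user request A. */
static bool kept_true_or(uint32_t branch_type, uint32_t requested_a)
{
	return (branch_type & requested_a) != 0;
}
```

The two only differ when B is a strict subset of A: with A = CALL | RET and
B = CALL, a return branch is dropped by the first helper but kept by the
second. When B covers all of A, both give the same answer.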

So here the semantics is not a true OR. This is my understanding so far, which
may be wrong. Please help me understand if the semantics are different from
what is explained above.

In POWER8, because we cannot OR individual HW PMU supported filters, the
semantics looked a bit odd until now. But as Michael has pointed out, if there
are multiple branch filter requests, we will implement all of them in SW. Only
when the user requests an individual filter that happens to be supported in
the HW PMU will we use the PMU filters.

Regards
Anshuman

Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-24 Thread Michael Ellerman
On Mon, 2013-09-23 at 14:45 +0530, Anshuman Khandual wrote:
> On 09/21/2013 12:25 PM, Stephane Eranian wrote:
> > On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
> >> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
> >>> This patchset is the re-spin of the original branch stack sampling
> >>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> >>> patchset also enables SW based branch filtering support for PPC64
> >>> platforms which have branch stack sampling support. With this new
> >>> enablement, the branch filter support for PPC64 platforms have been
> >>> extended to include all these combinations discussed below with a
> >>> sample test application program.
> >>
> >> ...
> >>
> >>> Mixed filters
> >>> -------------
> >>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
> >>> Error:
> >>> The perf.data file has no samples!
> >>>
> >>> NOTE: As expected. The HW filters all the branches which are calls
> >>> and SW tries to find return branches in that given set. Both the
> >>> filters are mutually exclusive, so obviously no samples found in the
> >>> end profile.
> >>
> >> The semantics of multiple filters is not clear to me. It could be an OR,
> >> or an AND. You have implemented AND, does that match existing behaviour
> >> on x86 for example?
> >
> > The semantic on the API is OR. AND does not make sense: CALL & RETURN?
> > On x86, the HW filter is an OR (default: ALL, set bit to disable a
> > type). I suspect it is similar on PPC.
>
> Given the situation as explained here, which semantic would be better for
> single HW and multiple SW filters. Accordingly validate_instruction()
> function will have to be re-implemented. But I believe OR-ing the SW
> filters will be preferable.
>
>   (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
>   or
>   (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)
>
> Please let me know your inputs and suggestions on this. Thank you.

You need to implement the correct semantics, regardless of how the
hardware happens to work.

That means if multiple filters are specified you need to do all the
filtering in software.

cheers



Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-23 Thread Anshuman Khandual
On 09/21/2013 12:25 PM, Stephane Eranian wrote:
> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>> This patchset is the re-spin of the original branch stack sampling
>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>> patchset also enables SW based branch filtering support for PPC64
>>> platforms which have branch stack sampling support. With this new
>>> enablement, the branch filter support for PPC64 platforms have been
>>> extended to include all these combinations discussed below with a
>>> sample test application program.
>>
>> ...
>>
>>> Mixed filters
>>> -------------
>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>> Error:
>>> The perf.data file has no samples!
>>>
>>> NOTE: As expected. The HW filters all the branches which are calls and
>>> SW tries to find return branches in that given set. Both the filters
>>> are mutually exclusive, so obviously no samples found in the end
>>> profile.
>>
>> The semantics of multiple filters is not clear to me. It could be an OR,
>> or an AND. You have implemented AND, does that match existing behaviour
>> on x86 for example?
>
> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
> On x86, the HW filter is an OR (default: ALL, set bit to disable a
> type). I suspect it is similar on PPC.

Hey Stephane,

In POWER8 BHRB, we have got three HW PMU filters, out of which we are trying
to use two: PERF_SAMPLE_BRANCH_ANY_CALL and PERF_SAMPLE_BRANCH_COND.

(1) These filters are exclusive of each other and cannot be OR-ed with each
other.

(2) The SW filters are applied on the branch record set captured with BHRB,
which already has the HW filters applied. So the working set is already
reduced by the HW PMU filters. The SW filter goes through the working set,
figures out which of the records satisfy the SW filter criteria and picks
them up. The SW filter cannot find branch records which match the criteria
outside of the BHRB captured set. So we cannot OR the filters.

This makes the combination of HW and SW filter inherently an "AND" not OR.
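
The effect can be shown with a small sketch that models the two stages on a
list of already-classified branch types. The BR_* bits and
count_hw_then_sw() are hypothetical names used only for illustration; the
real BHRB capture and instruction decoding are not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical branch-type bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
#define BR_ANY_CALL (1u << 0)
#define BR_ANY_RET  (1u << 1)
#define BR_COND     (1u << 2)

/*
 * Count the branches that survive a HW filter followed by a SW filter.
 * The SW stage only ever sees what the HW stage recorded, so the overall
 * result is hw_mask AND sw_mask.
 */
static size_t count_hw_then_sw(const uint32_t *types, size_t n,
			       uint32_t hw_mask, uint32_t sw_mask)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (!(types[i] & hw_mask))
			continue;	/* never recorded by the HW */
		if (types[i] & sw_mask)
			kept++;
	}

	return kept;
}
```

With hw_mask = BR_ANY_CALL and sw_mask = BR_ANY_RET nothing survives, which
matches the "perf.data file has no samples" result quoted earlier for
-j any_call,any_ret.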

(3) But once we have captured the BHRB filtered data with the HW PMU filter,
multiple SW filters (if requested) can be applied either in an OR or an AND
manner.

It should be either like
(1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
or like
(2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)

NOTE: I admit that the current validate_instruction() function does not do
either of them correctly. Will fix it in the next iteration.

(4) These combinations of filters are not supported right now because

(a) We are unable to process two HW PMU filters simultaneously
(b) We have not worked on a replacement SW filter for either of the HW
filters

(1) (HW_FILTER_1), (HW_FILTER_2)
(2) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1)
(3) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)

However, these combinations of filters can be supported right now.

(1) (HW_FILTER_1)
(2) (HW_FILTER_2)

(3) (SW_FILTER_1)
(4) (SW_FILTER_2)
(5) (SW_FILTER_1), (SW_FILTER_2)

(6)  (HW_FILTER_1), (SW_FILTER_1)
(7)  (HW_FILTER_1), (SW_FILTER_2)
(8)  (HW_FILTER_1), (SW_FILTER_1), (SW_FILTER_2)
(9)  (HW_FILTER_2), (SW_FILTER_1)
(10) (HW_FILTER_2), (SW_FILTER_2)
(11) (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)
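
The two lists above boil down to one rule: a request is unsupported exactly
when it names both HW filters. A minimal sketch of that check, where the
bit assignments and combination_supported() are hypothetical and only the
HW_FILTER_n / SW_FILTER_n labels come from the lists:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for the four filters named in the lists. */
#define HW_FILTER_1 (1u << 0)
#define HW_FILTER_2 (1u << 1)
#define SW_FILTER_1 (1u << 2)
#define SW_FILTER_2 (1u << 3)
#define HW_FILTERS  (HW_FILTER_1 | HW_FILTER_2)

/*
 * A non-empty request is supported unless it asks for both HW filters at
 * once, since the PMU cannot process two HW filters simultaneously.
 */
static bool combination_supported(uint32_t requested)
{
	if (requested == 0)
		return false;

	return (requested & HW_FILTERS) != HW_FILTERS;
}
```

This accepts all eleven supported combinations and rejects the three
unsupported ones listed above.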


Given the situation as explained here, which semantic would be better for a
single HW filter and multiple SW filters? The validate_instruction() function
will have to be re-implemented accordingly. I believe OR-ing the SW filters
would be preferable.

(1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
or
(2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)

Please let me know your inputs and suggestions on this. Thank you.

Regards
Anshuman



Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-21 Thread Stephane Eranian
On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>
> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
> > This patchset is the re-spin of the original branch stack sampling
> > patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> > patchset also enables SW based branch filtering support for PPC64
> > platforms which have branch stack sampling support. With this new
> > enablement, the branch filter support for PPC64 platforms has been
> > extended to include all these combinations discussed below with a
> > sample test application program.
>
> ...
>
> > Mixed filters
> > -------------
> > (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
> > Error:
> > The perf.data file has no samples!
> >
> > NOTE: As expected. The HW filters all the branches which are calls and SW
> > tries to find return branches in that given set. Both the filters are
> > mutually exclusive, so obviously no samples found in the end profile.
>
> The semantics of multiple filters is not clear to me. It could be an OR,
> or an AND. You have implemented AND, does that match existing behaviour
> on x86 for example?
>
The semantic on the API is OR. AND does not make sense: CALL & RETURN?
On x86, the HW filter is an OR (default: ALL, set bit to disable a
type). I suspect it is similar on PPC.
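
A small Python sketch of why AND is empty for a request such as
any_call,any_ret while OR gives the union. The BranchType flags are
illustrative only (they mirror the idea of the PERF_SAMPLE_BRANCH_* bits, not
their kernel values), and the sketch assumes each branch has exactly one type.

```python
from enum import Flag, auto


class BranchType(Flag):
    # Illustrative flags only; they mirror the idea of the
    # PERF_SAMPLE_BRANCH_* bits but are not the kernel's values.
    CALL = auto()
    RETURN = auto()
    COND = auto()


def matches_or(branch, requested):
    """OR semantic: keep the branch if it is any of the requested types."""
    return bool(branch & requested)


def matches_and(branch, requested):
    """AND semantic: keep the branch only if it is all requested types."""
    return (branch & requested) == requested


BRANCHES = [BranchType.CALL, BranchType.RETURN, BranchType.COND]
REQUEST = BranchType.CALL | BranchType.RETURN

if __name__ == "__main__":
    print([b.name for b in BRANCHES if matches_or(b, REQUEST)])   # ['CALL', 'RETURN']
    print([b.name for b in BRANCHES if matches_and(b, REQUEST)])  # []
```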

>
> cheers
>
>

Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-21 Thread Anshuman Khandual
On 09/21/2013 12:11 PM, Anshuman Khandual wrote:
> On 08/30/2013 05:18 PM, Stephane Eranian wrote:
>> 2013/8/30 Anshuman Khandual
>>>
>>> This patchset is the re-spin of the original branch stack sampling
>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>> patchset also enables SW based branch filtering support for PPC64
>>> platforms which have branch stack sampling support. With this new
>>> enablement, the branch filter support for PPC64 platforms has been
>>> extended to include all these combinations discussed below with a
>>> sample test application program.
>>>
>>
>> I am trying to understand which HW has support for capturing the
>> branches: PPC7 or PPC8. Then it seems you're saying that only PPC8 has
>> the filtering support. On PPC7 you use the SW filter. Did I get this right?
>>
>> I will look at the patch set.
>>
> 
> Hey Stephane,
> 
> Just wondering if you got a chance to go though the patchset ?


s/though/through/


Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-21 Thread Anshuman Khandual
On 08/30/2013 05:18 PM, Stephane Eranian wrote:
> 2013/8/30 Anshuman Khandual
>>
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>> patchset also enables SW based branch filtering support for PPC64
>> platforms which have branch stack sampling support. With this new
>> enablement, the branch filter support for PPC64 platforms has been
>> extended to include all these combinations discussed below with a
>> sample test application program.
>>
> I am trying to understand which HW has support for capturing the
> branches: PPC7 or PPC8. Then it seems you're saying that only PPC8 has
> the filtering support. On PPC7 you use the SW filter. Did I get this right?
> 
> I will look at the patch set.
> 

Hey Stephane,

Just wondering if you got a chance to go though the patchset ?

Regards
Anshuman


Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-09 Thread Anshuman Khandual
On 09/10/2013 07:36 AM, Michael Ellerman wrote:
> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>> patchset also enables SW based branch filtering support for PPC64
>> platforms which have branch stack sampling support. With this new
>> enablement, the branch filter support for PPC64 platforms has been
>> extended to include all these combinations discussed below with a
>> sample test application program.
> 
> ...
> 
>> Mixed filters
>> -------------
>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>> Error:
>> The perf.data file has no samples!
>>
>> NOTE: As expected. The HW filters all the branches which are calls and SW
>> tries to find return branches in that given set. Both the filters are
>> mutually exclusive, so obviously no samples found in the end profile.
> 
> The semantics of multiple filters is not clear to me. It could be an OR,
> or an AND. You have implemented AND, does that match existing behaviour
> on x86 for example?

I believe it does match. X86 code drops the branch records (originally captured
in the LBR) while applying the SW filters.
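
As a rough illustration of what "dropping records while applying the SW
filters" means, here is a generic Python sketch of in-place compaction. It is
not the x86 code; apply_sw_filter() and the tuple layout of a record are
assumptions made up for this example.

```python
def apply_sw_filter(records, keep):
    """Compact `records` in place, keeping only entries where keep(r) is true.

    Returns the new number of valid entries, similar in spirit to
    shrinking a captured branch stack after filtering.
    """
    kept = 0
    for rec in records:
        if keep(rec):
            # Write position never runs ahead of the read position.
            records[kept] = rec
            kept += 1
    del records[kept:]
    return kept


if __name__ == "__main__":
    # (type, id) pairs standing in for captured branch records.
    stack = [("call", 1), ("ret", 2), ("call", 3)]
    n = apply_sw_filter(stack, lambda r: r[0] == "ret")
    print(n, stack)  # 1 [('ret', 2)]
```

A filter applied this way can only shrink the captured set, which is why it
behaves like an AND with whatever the HW captured.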

Regards
Anshuman


Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-09 Thread Michael Ellerman
On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> patchset also enables SW based branch filtering support for PPC64
> platforms which have branch stack sampling support. With this new
> enablement, the branch filter support for PPC64 platforms has been
> extended to include all these combinations discussed below with a
> sample test application program.

...

> Mixed filters
> -------------
> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
> Error:
> The perf.data file has no samples!
>
> NOTE: As expected. The HW filters all the branches which are calls and SW
> tries to find return branches in that given set. Both the filters are
> mutually exclusive, so obviously no samples found in the end profile.

The semantics of multiple filters is not clear to me. It could be an OR,
or an AND. You have implemented AND, does that match existing behaviour
on x86 for example?

cheers



Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-09-01 Thread Anshuman Khandual
On 08/30/2013 05:18 PM, Stephane Eranian wrote:
> 2013/8/30 Anshuman Khandual
>>
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>> patchset also enables SW based branch filtering support for PPC64
>> platforms which have branch stack sampling support. With this new
>> enablement, the branch filter support for PPC64 platforms has been
>> extended to include all these combinations discussed below with a
>> sample test application program.
>>
> I am trying to understand which HW has support for capturing the
> branches: PPC7 or PPC8. Then it seems you're saying that only PPC8 has
> the filtering support. On PPC7 you use the SW filter. Did I get this right?
> 
> I will look at the patch set.
> 

Hey Stephane,

POWER7 does not have the BHRB support required to capture branches. Right
now only POWER8 (which has BHRB) can capture branches in HW. It has some
PMU level branch filters, and the rest we have implemented in SW. But these SW
filters cannot be applied on POWER7, as it does not support branch stack
sampling because it lacks BHRB. I mentioned PPC64 support in the sense
that this SW filtering code could be used on existing or future generation
powerpc processors which have PMU support for branch stack sampling. My
apologies if the description of the patchset was ambiguous.

Regards
Anshuman


Re: [PATCH V2 0/6] perf: New conditional branch filter

2013-08-30 Thread Stephane Eranian
2013/8/30 Anshuman Khandual 
>
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> patchset also enables SW based branch filtering support for PPC64
> platforms which have branch stack sampling support. With this new
> enablement, the branch filter support for PPC64 platforms has been
> extended to include all these combinations discussed below with a
> sample test application program.
>
I am trying to understand which HW has support for capturing the
branches: PPC7 or PPC8. Then it seems you're saying that only PPC8 has
the filtering support. On PPC7 you use the SW filter. Did I get this right?

I will look at the patch set.

>
> (1) perf record -e branch-misses:u -b ./cprog
> # Overhead  Command  Source Shared Object  Source Symbol      Target Shared Object  Target Symbol
> # ........  .......  ....................  .................  ....................  .................
> #
>      4.42%  cprog    cprog                 [k] sw_4_2         cprog                 [k] lr_addr
>      4.41%  cprog    cprog                 [k] symbol2        cprog                 [k] hw_1_2
>      4.41%  cprog    cprog                 [k] ctr_addr       cprog                 [k] sw_4_1
>      4.41%  cprog    cprog                 [k] lr_addr        cprog                 [k] sw_4_2
>      4.41%  cprog    cprog                 [k] sw_4_2         cprog                 [k] callme
>      4.41%  cprog    cprog                 [k] symbol1        cprog                 [k] hw_1_1
>      4.41%  cprog    cprog                 [k] success_3_1_3  cprog                 [k] sw_3_1
>      2.43%  cprog    cprog                 [k] sw_4_1         cprog                 [k] ctr_addr
>      2.43%  cprog    cprog                 [k] hw_1_2         cprog                 [k] symbol2
>      2.43%  cprog    cprog                 [k] callme         cprog                 [k] hw_1_2
>      2.43%  cprog    cprog                 [k] address1       cprog                 [k] back1
>      2.43%  cprog    cprog                 [k] back1          cprog                 [k] callme
>      2.43%  cprog    cprog                 [k] hw_2_1         cprog                 [k] address1
>      2.43%  cprog    cprog                 [k] sw_3_1_1       cprog                 [k] sw_3_1
>      2.43%  cprog    cprog                 [k] sw_3_1_2       cprog                 [k] sw_3_1
>      2.43%  cprog    cprog                 [k] sw_3_1_3       cprog                 [k] sw_3_1
>      2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_1
>      2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_2
>      2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_3
>      2.43%  cprog    cprog                 [k] callme         cprog                 [k] sw_3_1
>      2.43%  cprog    cprog                 [k] callme         cprog                 [k] sw_4_2
>      2.43%  cprog    cprog                 [k] hw_1_1         cprog                 [k] symbol1
>      2.43%  cprog    cprog                 [k] callme         cprog                 [k] hw_1_1
>      2.42%  cprog    cprog                 [k] sw_3_1         cprog                 [k] callme
>      1.99%  cprog    cprog                 [k] success_3_1_1  cprog                 [k] sw_3_1
>      1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_1
>      1.99%  cprog    cprog                 [k] address2       cprog                 [k] back2
>      1.99%  cprog    cprog                 [k] hw_2_2         cprog                 [k] address2
>      1.99%  cprog    cprog                 [k] back2          cprog                 [k] callme
>      1.99%  cprog    cprog                 [k] callme         cprog                 [k] main
>      1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_3
>      1.99%  cprog    cprog                 [k] hw_1_1         cprog                 [k] callme
>      1.99%  cprog    cprog                 [k] sw_3_2         cprog                 [k] callme
>      1.99%  cprog    cprog                 [k] callme         cprog                 [k] sw_3_2
>      1.99%  cprog    cprog                 [k] success_3_1_2  cprog                 [k] sw_3_1
>      1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_2
>      1.99%  cprog    cprog                 [k] hw_1_2         cprog                 [k] callme
>      1.99%  cprog    cprog                 [k] sw_4_1         cprog                 [k]

[PATCH V2 0/6] perf: New conditional branch filter

2013-08-29 Thread Anshuman Khandual
This patchset is the re-spin of the original branch stack sampling
patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This patchset
also enables SW based branch filtering support for PPC64 platforms which have
branch stack sampling support. With this new enablement, the branch filter
support for PPC64 platforms has been extended to include all these
combinations discussed below with a sample test application program.


(1) perf record -e branch-misses:u -b ./cprog
# Overhead  Command  Source Shared Object  Source Symbol      Target Shared Object  Target Symbol
# ........  .......  ....................  .................  ....................  .................
#
     4.42%  cprog    cprog                 [k] sw_4_2         cprog                 [k] lr_addr
     4.41%  cprog    cprog                 [k] symbol2        cprog                 [k] hw_1_2
     4.41%  cprog    cprog                 [k] ctr_addr       cprog                 [k] sw_4_1
     4.41%  cprog    cprog                 [k] lr_addr        cprog                 [k] sw_4_2
     4.41%  cprog    cprog                 [k] sw_4_2         cprog                 [k] callme
     4.41%  cprog    cprog                 [k] symbol1        cprog                 [k] hw_1_1
     4.41%  cprog    cprog                 [k] success_3_1_3  cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_4_1         cprog                 [k] ctr_addr
     2.43%  cprog    cprog                 [k] hw_1_2         cprog                 [k] symbol2
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] hw_1_2
     2.43%  cprog    cprog                 [k] address1       cprog                 [k] back1
     2.43%  cprog    cprog                 [k] back1          cprog                 [k] callme
     2.43%  cprog    cprog                 [k] hw_2_1         cprog                 [k] address1
     2.43%  cprog    cprog                 [k] sw_3_1_1       cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_3_1_2       cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_3_1_3       cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_1
     2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_2
     2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_3
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] sw_4_2
     2.43%  cprog    cprog                 [k] hw_1_1         cprog                 [k] symbol1
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] hw_1_1
     2.42%  cprog    cprog                 [k] sw_3_1         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] success_3_1_1  cprog                 [k] sw_3_1
     1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_1
     1.99%  cprog    cprog                 [k] address2       cprog                 [k] back2
     1.99%  cprog    cprog                 [k] hw_2_2         cprog                 [k] address2
     1.99%  cprog    cprog                 [k] back2          cprog                 [k] callme
     1.99%  cprog    cprog                 [k] callme         cprog                 [k] main
     1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_3
     1.99%  cprog    cprog                 [k] hw_1_1         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] sw_3_2         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] callme         cprog                 [k] sw_3_2
     1.99%  cprog    cprog                 [k] success_3_1_2  cprog                 [k] sw_3_1
     1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_2
     1.99%  cprog    cprog                 [k] hw_1_2         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] sw_4_1         cprog                 [k] callme
     0.02%  cprog    [unknown]             [k] 0xf7ba2328
