Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/26/2013 04:44 PM, Stephane Eranian wrote:
> So you are saying that the HW filter is exclusive. That seems odd. But
> I think it is because one of the choices is ANY. ANY covers all the
> types of branches. Therefore it does not make a difference whether you
> add COND or not. And vice-versa, if you set COND, you need to disable
> ANY. I bet if you add other filters such as CALL, RETURN, then you
> could OR them and say: I want RETURN or CALLS.
>
> But that's okay. The API operates in OR mode but if the HW does not
> support it, you can check the mask and reject if more than one type is
> set. That is arch-specific code.
>
> The alternative is to only capture ANY and emulate the filter in SW.
> This will work, of course. But the downside is that you lose the way
> to appreciate how many, for instance, COND branches you sampled out of
> the total number of COND branches retired. Unless you can count COND
> branches separately.

Hey Stephane,

Thanks for your reply. I am working on a solution where the PMU will
process all the requested branch filters in HW only if it can filter all
of them in an OR manner; otherwise it will leave the entire thing up to
the SW to process and do no filtering itself. This implies that branch
filtering will happen either completely in HW or completely in SW, and
never in a mixed manner. This way it will conform to the OR mode defined
in the API. I will post the revised patch set soon.

Regards
Anshuman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
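The "completely in HW or completely in SW" decision described in the reply above could be sketched roughly as follows. This is only an illustration, not the patch itself: the BR_* bits, struct filter_plan, hw_can_filter() and plan_filters() are hypothetical names, and the assumed HW capability (one filter at a time, never OR-ed) is taken from the POWER8 description in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
enum {
	BR_ANY      = 1u << 0,
	BR_ANY_CALL = 1u << 1,
	BR_ANY_RET  = 1u << 2,
	BR_COND     = 1u << 3,
};

struct filter_plan {
	uint32_t hw_mask;	/* filters programmed into the PMU, 0 = none */
	uint32_t sw_mask;	/* filters applied in software, 0 = none */
};

/*
 * Assumption: the HW can apply exactly one of these filters at a time
 * and cannot OR two of them together.
 */
static bool hw_can_filter(uint32_t requested)
{
	return requested == BR_ANY ||
	       requested == BR_ANY_CALL ||
	       requested == BR_COND;
}

/* Either the whole request goes to HW or the whole request goes to SW. */
static struct filter_plan plan_filters(uint32_t requested)
{
	struct filter_plan plan = { 0, 0 };

	assert(requested != 0);
	if (hw_can_filter(requested))
		plan.hw_mask = requested;	/* HW does everything */
	else
		plan.sw_mask = requested;	/* HW captures all, SW filters */
	return plan;
}
```

With this split a request such as BR_ANY_CALL | BR_COND never reaches the HW filter at all, so the result stays an OR of the requested types.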
Re: [PATCH V2 0/6] perf: New conditional branch filter
On Mon, Sep 23, 2013 at 11:15 AM, Anshuman Khandual wrote:
> On 09/21/2013 12:25 PM, Stephane Eranian wrote:
>> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>>> This patchset is the re-spin of the original branch stack sampling
>>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>>> patchset also enables SW based branch filtering support for PPC64
>>>> platforms which have branch stack sampling support. With this new
>>>> enablement, the branch filter support for PPC64 platforms have been
>>>> extended to include all these combinations discussed below with a
>>>> sample test application program.
>>>
>>> ...
>>>
>>>> Mixed filters
>>>> -------------
>>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>>> Error:
>>>> The perf.data file has no samples!
>>>>
>>>> NOTE: As expected. The HW filters all the branches which are calls
>>>> and SW tries to find return branches in that given set. Both the
>>>> filters are mutually exclusive, so obviously no samples found in
>>>> the end profile.
>>>
>>> The semantics of multiple filters is not clear to me. It could be an
>>> OR, or an AND. You have implemented AND, does that match existing
>>> behaviour on x86 for example?
>>
>> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
>> On x86, the HW filter is an OR (default: ALL, set bit to disable a
>> type). I suspect it is similar on PPC.
>
> Hey Stephane,
>
> In POWER8 BHRB, we have got three HW PMU filters, out of which we are
> trying to use two: PERF_SAMPLE_BRANCH_ANY_CALL and
> PERF_SAMPLE_BRANCH_COND.
>
> (1) These filters are exclusive of each other and cannot be OR-ed with
>     each other.

So you are saying that the HW filter is exclusive. That seems odd. But I
think it is because one of the choices is ANY. ANY covers all the types
of branches. Therefore it does not make a difference whether you add
COND or not. And vice-versa, if you set COND, you need to disable ANY.
I bet if you add other filters such as CALL, RETURN, then you could OR
them and say: I want RETURN or CALLS.

But that's okay. The API operates in OR mode but if the HW does not
support it, you can check the mask and reject if more than one type is
set. That is arch-specific code.

The alternative is to only capture ANY and emulate the filter in SW.
This will work, of course. But the downside is that you lose the way to
appreciate how many, for instance, COND branches you sampled out of the
total number of COND branches retired. Unless you can count COND
branches separately.

> (2) The SW filters are applied on the branch record set captured with
>     BHRB, which has the HW filters applied. So the working set is
>     already reduced by the HW PMU filters. The SW filter goes through
>     the working set, figures out which records satisfy the SW filter
>     criteria, and picks them up. The SW filter cannot find branch
>     records which match the criteria outside of the BHRB captured set.
>     So we cannot OR the filters.

Yes, you can if you set the HW filter to ANY. And then filter the
branches by type based on the SW mask. You need to decode each sampled
branch for that. This is done in X86 to work around HW bugs in the HW
filter, for instance.

>     This makes the combination of HW and SW filter inherently an "AND",
>     not an OR.
>
> (3) But once we have captured the BHRB filtered data with the HW PMU
>     filter, multiple SW filters (if requested) can be applied in either
>     an OR or an AND manner.
>
>     It should be either like
>     (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
>     or like
>     (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)
>
>     NOTE: I admit that the current validate_instruction() function does
>     not do either of them correctly. Will fix it in the next iteration.

Just set the HW filter to ANY and filter in SW. Isn't that possible?

> (4) These combinations of filters are not supported right now because
>
>     (a) We are unable to process two HW PMU filters simultaneously
>     (b) We have not worked on a replacement SW filter for either of the
>         HW filters
>
>     (1) (HW_FILTER_1), (HW_FILTER_2)
>     (2) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1)
>     (3) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)
>
> However, these combinations of filters can be supported right now.
>
>     (1) (HW_FILTER_1)
>     (2) (HW_FILTER_2)
>
>     (3) (SW_FILTER_1)
>     (4) (SW_FILTER_2)
>     (5) (SW_FILTER_1), (SW_FILTER_2)
>
>     (6) (HW_FILTER_1), (SW_FILTER_1)
>     (7) (HW_FILTER_1), (SW_FILTER_2)
>     (8) (HW_FILTER_1), (SW_FILTER_1), (SW_FILTER_2)
>     (9) (HW_FILTER_2), (SW_FILTER_1)
>     (10) (HW_FILTER_2), (SW_FILTER_2)
>     (11) (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)
>
> Given the situation as explained here, which semantic would be better
> for single HW and multiple
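The two options raised in the reply above (reject a mask with more than one type, or capture ANY and filter in SW by decoded branch type) might look roughly like this. It is a sketch under assumptions: the BR_* bits, struct branch_rec, check_single_type() and sw_filter_or() are hypothetical names, and each record is assumed to arrive with its branch type already decoded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
enum {
	BR_ANY_CALL = 1u << 0,
	BR_ANY_RET  = 1u << 1,
	BR_COND     = 1u << 2,
};

/* Hypothetical branch record; 'type' holds one BR_* bit, already decoded. */
struct branch_rec {
	uint64_t from;
	uint64_t to;
	uint32_t type;
};

/* Option 1: arch code rejects any request with more than one type set. */
static int check_single_type(uint32_t mask)
{
	/* mask & (mask - 1) is non-zero when two or more bits are set */
	return (mask & (mask - 1)) ? -EOPNOTSUPP : 0;
}

/*
 * Option 2: the HW captured every branch (ANY); keep, in place, only the
 * records whose type matches at least one requested bit (OR semantics).
 * Returns the number of records kept.
 */
static size_t sw_filter_or(struct branch_rec *recs, size_t n, uint32_t mask)
{
	size_t kept = 0;

	assert(recs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (recs[i].type & mask)
			recs[kept++] = recs[i];
	}
	return kept;
}
```

As noted in the reply, option 2 gives up the HW-side view of how many branches of a given type were retired, since the PMU no longer distinguishes them.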
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/25/2013 07:49 AM, Michael Ellerman wrote:
> On Mon, 2013-09-23 at 14:45 +0530, Anshuman Khandual wrote:
>> On 09/21/2013 12:25 PM, Stephane Eranian wrote:
>>> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>>>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>>>> This patchset is the re-spin of the original branch stack sampling
>>>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>>>> patchset also enables SW based branch filtering support for PPC64
>>>>> platforms which have branch stack sampling support. With this new
>>>>> enablement, the branch filter support for PPC64 platforms have
>>>>> been extended to include all these combinations discussed below
>>>>> with a sample test application program.
>>>>
>>>> ...
>>>>
>>>>> Mixed filters
>>>>> -------------
>>>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>>>> Error:
>>>>> The perf.data file has no samples!
>>>>>
>>>>> NOTE: As expected. The HW filters all the branches which are calls
>>>>> and SW tries to find return branches in that given set. Both the
>>>>> filters are mutually exclusive, so obviously no samples found in
>>>>> the end profile.
>>>>
>>>> The semantics of multiple filters is not clear to me. It could be
>>>> an OR, or an AND. You have implemented AND, does that match
>>>> existing behaviour on x86 for example?
>>>
>>> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
>>> On x86, the HW filter is an OR (default: ALL, set bit to disable a
>>> type). I suspect it is similar on PPC.
>>
>> Given the situation as explained here, which semantic would be better
>> for single HW and multiple SW filters. Accordingly
>> validate_instruction() function will have to be re-implemented. But I
>> believe OR-ing the SW filters will be preferable.
>>
>> (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
>> or
>> (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)
>>
>> Please let me know your inputs and suggestions on this. Thank you.
>
> You need to implement the correct semantics, regardless of how the
> hardware happens to work.
>
> That means if multiple filters are specified you need to do all the
> filtering in software.

Hello Stephane,

I looked at the X86 code for the branch filtering implementation.

(1) During event creation, intel_pmu_hw_config calls
    intel_pmu_setup_lbr_filter when LBR sampling is required.
    intel_pmu_setup_lbr_filter calls these two functions:

    (a) intel_pmu_setup_sw_lbr_filter

        "event->hw.branch_reg.reg" contains all the SW filter masks
        which can be supported for the user requested filters in
        event->attr.branch_sample_type (even if some of them could be
        implemented in PMU HW).

    (b) intel_pmu_setup_hw_lbr_filter (when HW filtering is present)

        "event->hw.branch_reg.config" contains all the PMU HW filter
        masks corresponding to the requested filters in
        event->attr.branch_sample_type. One point to note here is that
        if the user has requested some branch filter which is not
        supported in the HW LBR filter, the event creation request is
        rejected with EOPNOTSUPP. This is not true for the filters
        which can be ignored in the PMU.

(2) When the event is enabled in the PMU

    (a) cpuc->lbr_sel->config gets into the HW register to enable the
        filtering of branches which was determined in the function
        intel_pmu_setup_hw_lbr_filter.

(3) After the IRQ has happened, intel_pmu_lbr_read reads all the entries
    from the LBR HW and then applies the filter in the function
    intel_pmu_lbr_filter.

    (a) intel_pmu_lbr_filter takes into account cpuc->br_sel (which is
        nothing but event->hw.branch_reg.reg as determined in the
        function intel_pmu_setup_sw_lbr_filter), which contains the
        entire branch filter request set in terms of applicable SW
        filters. Here the semantic is OR when we look at it from the SW
        filter implementation point of view. BUT which branch record
        set are we working on right now? A set which was captured by
        the LBR HW with the cpuc->lbr_sel->config filters enabled on it.

So to me the X86 implementation of the semantics looks something like
this.

    A - Branch filter set requested by the user
    B - Subset of A which can be supported in HW
    C - Subset of A which can be supported in SW

    (B) && (C)

    NOTE: Individual filters are OR-ed inside both B and C sets.

So here the semantics is not a true OR. This is my understanding till
now, which may be wrong. Please help me understand if the semantics is
something other than what is explained above.

In POWER8, because we cannot OR individual HW PMU supported filters, the
semantics looked a bit odd till now. But as Michael has pointed out
here, if there are multiple branch filter requests we will implement all
of them in SW. Only in the case where the user requests an individual
filter which happens to be supported in the HW PMU will we use the PMU
filters.

Regards
Anshuman
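The "(B) && (C)" reading in the message above can be written down as a small predicate. This models only the interpretation given in that message, which its author flags as possibly wrong; it is not taken from the x86 sources. The BR_* bits and passes_hw_then_sw() are hypothetical names, and each record is assumed to carry exactly one type bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filter bits; NOT the real PERF_SAMPLE_BRANCH_* values. */
enum {
	BR_ANY_CALL = 1u << 0,
	BR_ANY_RET  = 1u << 1,
	BR_COND     = 1u << 2,
};

/*
 * hw_mask is set B, sw_mask is set C. Filters are OR-ed inside each
 * mask, but a record must survive the HW stage before the SW stage
 * ever sees it, so the two stages combine as an AND.
 */
static bool passes_hw_then_sw(uint32_t type, uint32_t hw_mask,
			      uint32_t sw_mask)
{
	/* exactly one type bit per record in this model */
	assert(type != 0 && (type & (type - 1)) == 0);

	bool hw_pass = (type & hw_mask) != 0;	/* OR inside B */
	bool sw_pass = (type & sw_mask) != 0;	/* OR inside C */

	return hw_pass && sw_pass;		/* (B) && (C) */
}
```

When B and C both equal the full request A, the predicate collapses to a plain OR over A; it only departs from OR when the HW stage covers fewer types than the SW stage, which is the POWER8 case discussed in this thread.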
Re: [PATCH V2 0/6] perf: New conditional branch filter
On Mon, 2013-09-23 at 14:45 +0530, Anshuman Khandual wrote:
> On 09/21/2013 12:25 PM, Stephane Eranian wrote:
>> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>>> This patchset is the re-spin of the original branch stack sampling
>>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>>> patchset also enables SW based branch filtering support for PPC64
>>>> platforms which have branch stack sampling support. With this new
>>>> enablement, the branch filter support for PPC64 platforms have been
>>>> extended to include all these combinations discussed below with a
>>>> sample test application program.
>>>
>>> ...
>>>
>>>> Mixed filters
>>>> -------------
>>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>>> Error:
>>>> The perf.data file has no samples!
>>>>
>>>> NOTE: As expected. The HW filters all the branches which are calls
>>>> and SW tries to find return branches in that given set. Both the
>>>> filters are mutually exclusive, so obviously no samples found in
>>>> the end profile.
>>>
>>> The semantics of multiple filters is not clear to me. It could be an
>>> OR, or an AND. You have implemented AND, does that match existing
>>> behaviour on x86 for example?
>>
>> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
>> On x86, the HW filter is an OR (default: ALL, set bit to disable a
>> type). I suspect it is similar on PPC.
>
> Given the situation as explained here, which semantic would be better
> for single HW and multiple SW filters. Accordingly
> validate_instruction() function will have to be re-implemented. But I
> believe OR-ing the SW filters will be preferable.
>
> (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
> or
> (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)
>
> Please let me know your inputs and suggestions on this. Thank you.

You need to implement the correct semantics, regardless of how the
hardware happens to work.

That means if multiple filters are specified you need to do all the
filtering in software.

cheers
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/21/2013 12:25 PM, Stephane Eranian wrote:
> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>> This patchset is the re-spin of the original branch stack sampling
>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>> patchset also enables SW based branch filtering support for PPC64
>>> platforms which have branch stack sampling support. With this new
>>> enablement, the branch filter support for PPC64 platforms have been
>>> extended to include all these combinations discussed below with a
>>> sample test application program.
>>
>> ...
>>
>>> Mixed filters
>>> -------------
>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>> Error:
>>> The perf.data file has no samples!
>>>
>>> NOTE: As expected. The HW filters all the branches which are calls
>>> and SW tries to find return branches in that given set. Both the
>>> filters are mutually exclusive, so obviously no samples found in
>>> the end profile.
>>
>> The semantics of multiple filters is not clear to me. It could be an
>> OR, or an AND. You have implemented AND, does that match existing
>> behaviour on x86 for example?
>
> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
> On x86, the HW filter is an OR (default: ALL, set bit to disable a
> type). I suspect it is similar on PPC.

Hey Stephane,

In POWER8 BHRB, we have got three HW PMU filters, out of which we are
trying to use two: PERF_SAMPLE_BRANCH_ANY_CALL and
PERF_SAMPLE_BRANCH_COND.

(1) These filters are exclusive of each other and cannot be OR-ed with
    each other.

(2) The SW filters are applied on the branch record set captured with
    BHRB, which has the HW filters applied. So the working set is
    already reduced by the HW PMU filters. The SW filter goes through
    the working set, figures out which records satisfy the SW filter
    criteria, and picks them up. The SW filter cannot find branch
    records which match the criteria outside of the BHRB captured set.
    So we cannot OR the filters. This makes the combination of HW and SW
    filter inherently an "AND", not an OR.

(3) But once we have captured the BHRB filtered data with the HW PMU
    filter, multiple SW filters (if requested) can be applied in either
    an OR or an AND manner.

    It should be either like
    (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
    or like
    (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)

    NOTE: I admit that the current validate_instruction() function does
    not do either of them correctly. Will fix it in the next iteration.

(4) These combinations of filters are not supported right now because

    (a) We are unable to process two HW PMU filters simultaneously
    (b) We have not worked on a replacement SW filter for either of the
        HW filters

    (1) (HW_FILTER_1), (HW_FILTER_2)
    (2) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1)
    (3) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)

However, these combinations of filters can be supported right now.

    (1) (HW_FILTER_1)
    (2) (HW_FILTER_2)

    (3) (SW_FILTER_1)
    (4) (SW_FILTER_2)
    (5) (SW_FILTER_1), (SW_FILTER_2)

    (6) (HW_FILTER_1), (SW_FILTER_1)
    (7) (HW_FILTER_1), (SW_FILTER_2)
    (8) (HW_FILTER_1), (SW_FILTER_1), (SW_FILTER_2)
    (9) (HW_FILTER_2), (SW_FILTER_1)
    (10) (HW_FILTER_2), (SW_FILTER_2)
    (11) (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)

Given the situation as explained here, which semantic would be better
for a single HW filter and multiple SW filters? The
validate_instruction() function will have to be re-implemented
accordingly. But I believe OR-ing the SW filters will be preferable.

(1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)
or
(2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)

Please let me know your inputs and suggestions on this. Thank you.

Regards
Anshuman
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/21/2013 12:25 PM, Stephane Eranian wrote:
> On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman mich...@ellerman.id.au wrote:
>> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>>> This patchset is the re-spin of the original branch stack sampling
>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>> patchset also enables SW based branch filtering support for PPC64
>>> platforms which have branch stack sampling support. With this new
>>> enablement, the branch filter support for PPC64 platforms have been
>>> extended to include all these combinations discussed below with a
>>> sample test application program.
>>
>> ...
>>
>>> Mixed filters
>>> -------------
>>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>>> Error:
>>> The perf.data file has no samples!
>>>
>>> NOTE: As expected. The HW filters all the branches which are calls
>>> and SW tries to find return branches in that given set. Both the
>>> filters are mutually exclusive, so obviously no samples found in
>>> the end profile.
>>
>> The semantics of multiple filters is not clear to me. It could be an OR,
>> or an AND. You have implemented AND, does that match existing behaviour
>> on x86 for example?
>
> The semantic on the API is OR. AND does not make sense: CALL & RETURN?
> On x86, the HW filter is an OR (default: ALL, set bit to disable a type).
> I suspect it is similar on PPC.

Hey Stephane,

In the POWER8 BHRB we have three HW PMU filters, of which we are trying
to use two: PERF_SAMPLE_BRANCH_ANY_CALL and PERF_SAMPLE_BRANCH_COND.

(1) These filters are exclusive of each other and cannot be OR-ed together.

(2) The SW filters are applied to the branch record set captured in the
    BHRB, which already has the HW filter applied, so the working set has
    already been reduced by the HW PMU filter. The SW filter walks that
    working set and picks the records that satisfy its criteria. It cannot
    find matching branch records outside the BHRB-captured set, so we
    cannot OR the filters. This makes the combination of a HW filter and a
    SW filter inherently an AND, not an OR.

(3) Once we have captured the BHRB data filtered by the HW PMU filter,
    multiple SW filters (if requested) can be applied in either an OR or
    an AND manner. It should be either

        (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)

    or

        (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)

    NOTE: I admit that the current validate_instruction() function does
    neither of them correctly. I will fix it in the next iteration.

(4) These combinations of filters are not supported right now, because
    (a) we are unable to process two HW PMU filters simultaneously and
    (b) we have not worked on a replacement SW filter for either of the
    HW filters:

        (1) (HW_FILTER_1), (HW_FILTER_2)
        (2) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1)
        (3) (HW_FILTER_1), (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)

    However, these combinations of filters can be supported right now:

        (1)  (HW_FILTER_1)
        (2)  (HW_FILTER_2)
        (3)  (SW_FILTER_1)
        (4)  (SW_FILTER_2)
        (5)  (SW_FILTER_1), (SW_FILTER_2)
        (6)  (HW_FILTER_1), (SW_FILTER_1)
        (7)  (HW_FILTER_1), (SW_FILTER_2)
        (8)  (HW_FILTER_1), (SW_FILTER_1), (SW_FILTER_2)
        (9)  (HW_FILTER_2), (SW_FILTER_1)
        (10) (HW_FILTER_2), (SW_FILTER_2)
        (11) (HW_FILTER_2), (SW_FILTER_1), (SW_FILTER_2)

Given the situation explained here, which semantic would be better for a
single HW filter combined with multiple SW filters? The
validate_instruction() function will have to be re-implemented
accordingly. I believe OR-ing the SW filters is preferable.

    (1) (HW_FILTER_1) && (SW_FILTER_1) && (SW_FILTER_2)

or

    (2) (HW_FILTER_1) && (SW_FILTER_1 || SW_FILTER_2)

Please let me know your inputs and suggestions on this. Thank you.

Regards
Anshuman
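A small model may help when comparing the two candidate semantics above. It is only a sketch in Python, not kernel code: the Branch classes, the hw_capture() helper and the two sw_filter_*() helpers are hypothetical names made up for illustration, and the BHRB is modelled as a plain list that keeps only branches of the single HW-selected class.

```python
from enum import Flag, auto


class Branch(Flag):
    # Hypothetical branch classes for illustration; not the kernel's enum values.
    CALL = auto()
    RET = auto()
    COND = auto()
    IND = auto()


def hw_capture(trace, hw_filter):
    """Model of a branch buffer that records only the HW-selected class."""
    return [b for b in trace if b & hw_filter]


def sw_filter_or(records, sw_filters):
    """Semantic (2): keep a record if it satisfies at least one SW filter."""
    return [r for r in records if any(r & f for f in sw_filters)]


def sw_filter_and(records, sw_filters):
    """Semantic (1): keep a record only if it satisfies every SW filter."""
    return [r for r in records if all(r & f for f in sw_filters)]


# An indirect call is modelled as CALL | IND.
trace = [Branch.CALL, Branch.RET, Branch.COND, Branch.CALL | Branch.IND, Branch.RET]

# The HW filter has already discarded everything that is not a call ...
captured = hw_capture(trace, Branch.CALL)

# ... so a SW "return" filter has nothing left to find (the any_call,any_ret case).
calls_then_rets = sw_filter_or(captured, [Branch.RET])

# With two SW filters the two semantics diverge.
or_result = sw_filter_or(captured, [Branch.RET, Branch.IND])
and_result = sw_filter_and(captured, [Branch.RET, Branch.IND])
```

In this model the SW stage can only narrow what the HW stage captured, which is why HW plus SW behaves as an AND regardless of how the SW filters are combined among themselves.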
Re: [PATCH V2 0/6] perf: New conditional branch filter
On Tue, Sep 10, 2013 at 4:06 AM, Michael Ellerman wrote:
>
> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
> > This patchset is the re-spin of the original branch stack sampling
> > patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> > patchset also enables SW based branch filtering support for PPC64
> > platforms which have branch stack sampling support. With this new
> > enablement, the branch filter support for PPC64 platforms have been
> > extended to include all these combinations discussed below with a
> > sample test application program.
>
> ...
>
> > Mixed filters
> > -------------
> > (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
> > Error:
> > The perf.data file has no samples!
> >
> > NOTE: As expected. The HW filters all the branches which are calls
> > and SW tries to find return branches in that given set. Both the
> > filters are mutually exclusive, so obviously no samples found in
> > the end profile.
>
> The semantics of multiple filters is not clear to me. It could be an OR,
> or an AND. You have implemented AND, does that match existing behaviour
> on x86 for example?
>
The semantic on the API is OR. AND does not make sense: CALL & RETURN?
On x86, the HW filter is an OR (default: ALL, set bit to disable a type).
I suspect it is similar on PPC.

> cheers
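The OR semantic described in this reply, and the "default: ALL, set bit to disable a type" style of HW filter, can be sketched as follows. This is an illustrative Python model only: the CALL/RET/COND bit values and the matches() and suppress_mask() helpers are assumptions invented for the example, not the perf ABI constants or any real register layout.

```python
# Hypothetical type bits for illustration; not the kernel's PERF_SAMPLE_BRANCH_* values.
CALL, RET, COND = 0x1, 0x2, 0x4
ALL_TYPES = CALL | RET | COND


def matches(requested_mask, branch_type):
    """API-level OR semantic: keep the branch if its type bit was requested."""
    return bool(requested_mask & branch_type)


def suppress_mask(requested_mask):
    """'Default ALL, set bit to disable': the bits to set are the types NOT requested."""
    return ALL_TYPES & ~requested_mask


# Requesting CALL and RETURN together means "calls or returns", never "both at once".
requested = CALL | RET
keeps_call = matches(requested, CALL)
keeps_cond = matches(requested, COND)
to_disable = suppress_mask(requested)
```

Under an AND reading the same request would ask for a branch that is simultaneously a call and a return, which is why it yields no samples.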
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/21/2013 12:11 PM, Anshuman Khandual wrote:
> On 08/30/2013 05:18 PM, Stephane Eranian wrote:
>> 2013/8/30 Anshuman Khandual
>>> This patchset is the re-spin of the original branch stack sampling
>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>>> patchset also enables SW based branch filtering support for PPC64
>>> platforms which have branch stack sampling support. With this new
>>> enablement, the branch filter support for PPC64 platforms have been
>>> extended to include all these combinations discussed below with a
>>> sample test application program.
>>
>> I am trying to understand which HW has support for capturing the
>> branches: PPC7 or PPC8. Then it seems you're saying that only PPC8
>> has the filtering support. On PPC7 you use the SW filter. Did I get
>> this right?
>>
>> I will look at the patch set.
>
> Hey Stephane,
>
> Just wondering if you got a chance to go though the patchset ?

s/though/through/
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 08/30/2013 05:18 PM, Stephane Eranian wrote:
> 2013/8/30 Anshuman Khandual
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>> patchset also enables SW based branch filtering support for PPC64
>> platforms which have branch stack sampling support. With this new
>> enablement, the branch filter support for PPC64 platforms have been
>> extended to include all these combinations discussed below with a
>> sample test application program.
>
> I am trying to understand which HW has support for capturing the
> branches: PPC7 or PPC8. Then it seems you're saying that only PPC8
> has the filtering support. On PPC7 you use the SW filter. Did I get
> this right?
>
> I will look at the patch set.

Hey Stephane,

Just wondering if you got a chance to go though the patchset ?

Regards
Anshuman
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 09/10/2013 07:36 AM, Michael Ellerman wrote:
> On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>> patchset also enables SW based branch filtering support for PPC64
>> platforms which have branch stack sampling support. With this new
>> enablement, the branch filter support for PPC64 platforms have been
>> extended to include all these combinations discussed below with a
>> sample test application program.
>
> ...
>
>> Mixed filters
>> -------------
>> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
>> Error:
>> The perf.data file has no samples!
>>
>> NOTE: As expected. The HW filters all the branches which are calls
>> and SW tries to find return branches in that given set. Both the
>> filters are mutually exclusive, so obviously no samples found in
>> the end profile.
>
> The semantics of multiple filters is not clear to me. It could be an OR,
> or an AND. You have implemented AND, does that match existing behaviour
> on x86 for example?

I believe it does match. The x86 code drops branch records (originally
captured in the LBR) while applying the SW filters.

Regards
Anshuman
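Dropping captured records in a SW pass can be pictured as an in-place compaction of the record buffer. The sketch below is a Python illustration under assumptions, not the x86 implementation: the (from, to, type) tuple layout, the type bit values and the sw_filter_in_place() helper are all hypothetical. Note that the drop test itself says nothing about AND versus OR; here a record is dropped only when it matches none of the requested types.

```python
# Hypothetical type bits for illustration; not kernel constants.
CALL, RET, COND = 0x1, 0x2, 0x4


def sw_filter_in_place(entries, requested_mask, classify):
    """Drop entries whose class was not requested, compacting the list in place.

    Returns the number of entries kept.
    """
    kept = 0
    for entry in entries:
        if classify(entry) & requested_mask:
            entries[kept] = entry  # slide surviving entries toward the front
            kept += 1
    del entries[kept:]  # discard the stale tail
    return kept


# Each record is modelled as (from_addr, to_addr, branch_type).
records = [
    (0x1000, 0x2000, CALL),
    (0x2010, 0x1004, RET),
    (0x1008, 0x1020, COND),
    (0x1024, 0x3000, CALL),
]
count = sw_filter_in_place(records, CALL | RET, classify=lambda e: e[2])
```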
Re: [PATCH V2 0/6] perf: New conditional branch filter
On Fri, 2013-08-30 at 09:54 +0530, Anshuman Khandual wrote:
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> patchset also enables SW based branch filtering support for PPC64
> platforms which have branch stack sampling support. With this new
> enablement, the branch filter support for PPC64 platforms have been
> extended to include all these combinations discussed below with a
> sample test application program.

...

> Mixed filters
> -------------
> (6) perf record -e branch-misses:u -j any_call,any_ret ./cprog
> Error:
> The perf.data file has no samples!
>
> NOTE: As expected. The HW filters all the branches which are calls
> and SW tries to find return branches in that given set. Both the
> filters are mutually exclusive, so obviously no samples found in
> the end profile.

The semantics of multiple filters is not clear to me. It could be an OR,
or an AND. You have implemented AND, does that match existing behaviour
on x86 for example?

cheers
Re: [PATCH V2 0/6] perf: New conditional branch filter
On 08/30/2013 05:18 PM, Stephane Eranian wrote:
> 2013/8/30 Anshuman Khandual
>> This patchset is the re-spin of the original branch stack sampling
>> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
>> patchset also enables SW based branch filtering support for PPC64
>> platforms which have branch stack sampling support. With this new
>> enablement, the branch filter support for PPC64 platforms have been
>> extended to include all these combinations discussed below with a
>> sample test application program.
>
> I am trying to understand which HW has support for capturing the
> branches: PPC7 or PPC8. Then it seems you're saying that only PPC8
> has the filtering support. On PPC7 you use the SW filter. Did I get
> this right?
>
> I will look at the patch set.

Hey Stephane,

POWER7 does not have the BHRB support required to capture branches.
Right now only POWER8 (which has the BHRB) can capture branches in HW.
It has some PMU-level branch filters, and the rest we have implemented
in SW. These SW filters cannot be applied on POWER7, because without a
BHRB it does not support branch stack sampling at all.

I mentioned PPC64 support in the sense that this SW filtering code could
be used on any existing or future generation of powerpc processors whose
PMU supports branch stack sampling. My apologies if the description of
the patchset was ambiguous.

Regards
Anshuman
Re: [PATCH V2 0/6] perf: New conditional branch filter
2013/8/30 Anshuman Khandual
>
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND filter. This
> patchset also enables SW based branch filtering support for PPC64
> platforms which have branch stack sampling support. With this new
> enablement, the branch filter support for PPC64 platforms have been
> extended to include all these combinations discussed below with a
> sample test application program.
>
I am trying to understand which HW has support for capturing the
branches: PPC7 or PPC8. Then it seems you're saying that only PPC8 has
the filtering support. On PPC7 you use the SW filter. Did I get this
right?

I will look at the patch set.

> (1) perf record -e branch-misses:u -b ./cprog
> [...]
[PATCH V2 0/6] perf: New conditional branch filter
This patchset is a re-spin of the original branch stack sampling
patchset, which introduced the new PERF_SAMPLE_BRANCH_COND filter. It
also enables SW-based branch filtering for PPC64 platforms that have
branch stack sampling support. With this enablement, branch filter
support on PPC64 platforms has been extended to cover all the
combinations discussed below, shown with a sample test application
program.

(1) perf record -e branch-misses:u -b ./cprog

# Overhead  Command  Source Shared Object  Source Symbol      Target Shared Object  Target Symbol
# ........  .......  ....................  .................  ....................  .................
#
     4.42%  cprog    cprog                 [k] sw_4_2         cprog                 [k] lr_addr
     4.41%  cprog    cprog                 [k] symbol2        cprog                 [k] hw_1_2
     4.41%  cprog    cprog                 [k] ctr_addr       cprog                 [k] sw_4_1
     4.41%  cprog    cprog                 [k] lr_addr        cprog                 [k] sw_4_2
     4.41%  cprog    cprog                 [k] sw_4_2         cprog                 [k] callme
     4.41%  cprog    cprog                 [k] symbol1        cprog                 [k] hw_1_1
     4.41%  cprog    cprog                 [k] success_3_1_3  cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_4_1         cprog                 [k] ctr_addr
     2.43%  cprog    cprog                 [k] hw_1_2         cprog                 [k] symbol2
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] hw_1_2
     2.43%  cprog    cprog                 [k] address1       cprog                 [k] back1
     2.43%  cprog    cprog                 [k] back1          cprog                 [k] callme
     2.43%  cprog    cprog                 [k] hw_2_1         cprog                 [k] address1
     2.43%  cprog    cprog                 [k] sw_3_1_1       cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_3_1_2       cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_3_1_3       cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_1
     2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_2
     2.43%  cprog    cprog                 [k] sw_3_1         cprog                 [k] sw_3_1_3
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] sw_3_1
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] sw_4_2
     2.43%  cprog    cprog                 [k] hw_1_1         cprog                 [k] symbol1
     2.43%  cprog    cprog                 [k] callme         cprog                 [k] hw_1_1
     2.42%  cprog    cprog                 [k] sw_3_1         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] success_3_1_1  cprog                 [k] sw_3_1
     1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_1
     1.99%  cprog    cprog                 [k] address2       cprog                 [k] back2
     1.99%  cprog    cprog                 [k] hw_2_2         cprog                 [k] address2
     1.99%  cprog    cprog                 [k] back2          cprog                 [k] callme
     1.99%  cprog    cprog                 [k] callme         cprog                 [k] main
     1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_3
     1.99%  cprog    cprog                 [k] hw_1_1         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] sw_3_2         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] callme         cprog                 [k] sw_3_2
     1.99%  cprog    cprog                 [k] success_3_1_2  cprog                 [k] sw_3_1
     1.99%  cprog    cprog                 [k] sw_3_1         cprog                 [k] success_3_1_2
     1.99%  cprog    cprog                 [k] hw_1_2         cprog                 [k] callme
     1.99%  cprog    cprog                 [k] sw_4_1         cprog                 [k] callme
     0.02%  cprog    [unknown]             [k] 0xf7ba2328