Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Jin, Yao



Argh, fix your mailer. That is unreadable.

/me reflows...


Sorry about that. Now I reconfigure the mail editor by applying "Preformat" and 
"Fixed Width" settings in thunderbird client. Wish it to be better.


See, that's so much better..

Oh, so you _ARE_ adding a kernel feature? I understood you only wanted
to change perf-report.


Honestly it's a perf-report feature. But it needs kernel to record the branch 
type to perf_event_entry so there is a kernel patch for that in patch series.



WTH didn't you Cc the maintainers?


Very sorry not to cc to all maintainers in v1. I will be careful of sending v2 
patch series.


Also, if you do this, you need to Cc the PowerPC people, since they too
implement PERF_SAMPLE_BRANCH_ bits.



I will cc linuxppc-...@lists.ozlabs.org when sending v2.

Thanks
Jin Yao




Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Jin, Yao



Argh, fix your mailer. That is unreadable.

/me reflows...


Sorry about that. Now I reconfigure the mail editor by applying "Preformat" and 
"Fixed Width" settings in thunderbird client. Wish it to be better.


See, that's so much better..

Oh, so you _ARE_ adding a kernel feature? I understood you only wanted
to change perf-report.


Honestly it's a perf-report feature. But it needs kernel to record the branch 
type to perf_event_entry so there is a kernel patch for that in patch series.



WTH didn't you Cc the maintainers?


Very sorry not to cc to all maintainers in v1. I will be careful of sending v2 
patch series.


Also, if you do this, you need to Cc the PowerPC people, since they too
implement PERF_SAMPLE_BRANCH_ bits.



I will cc linuxppc-...@lists.ozlabs.org when sending v2.

Thanks
Jin Yao




Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 10:43:19PM +0800, Jin, Yao wrote:
> 
> 
> On 4/6/2017 5:25 PM, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2017 at 04:21:06PM +0800, Jin, Yao wrote:
> > > Hi, otherwise we have to maintain 2 branch type copies between kernel and
> > > user-space.
> > > 
> > > For example, currently X86_BR_* are defined in lbr.c. To display the 
> > > branch
> > > type in user-space, the user-space has to maintain the same copy for
> > > X86_BR_*. I didn't get a better idea.
> > I still don't understand what you want; or why it would matter.
> > 
> > Those specific macros are for hardware LBR filter emulation/fixup. What
> > does that have to do with any userspace crud?
> 
> I just want to provide a new feature that the user can directly check branch
> type
> in perf report, instead of looking it up in the binary. Binary could be not
> available
> later, so it's possible that userspace can't get the branch type.
> 
> The X86_BR are generated when disassembling the branch instruction in
> kernel.
> They can be considered as the x86 branch types.
> 
> It's easy to let kernel return the x86 branch types to userspace, and then
> userspace
> shows the branch type in perf report.
> 
> While kernel and userspace have to maintain the X86_BR definitions. One copy
> is in
> kernel and the other copy is in userspace. To avoid the duplicate
> definitions , I define
> the common branch type in perf_event.h to share between kernel and
> userspace.
> That's why I do that.

Argh, fix your mailer. That is unreadable.

/me reflows...

> I just want to provide a new feature that the user can directly check
> branch type in perf report, instead of looking it up in the binary.
> Binary could be not available later, so it's possible that userspace
> can't get the branch type.
> 
> The X86_BR are generated when disassembling the branch instruction in
> kernel.  They can be considered as the x86 branch types.
> 
> It's easy to let kernel return the x86 branch types to userspace, and
> then userspace shows the branch type in perf report.
> 
> While kernel and userspace have to maintain the X86_BR definitions.
> One copy is in kernel and the other copy is in userspace. To avoid the
> duplicate definitions , I define the common branch type in
> perf_event.h to share between kernel and userspace.  That's why I do
> that.

See, that's so much better..

Oh, so you _ARE_ adding a kernel feature? I understood you only wanted
to change perf-report.

WTH didn't you Cc the maintainers?

Also, if you do this, you need to Cc the PowerPC people, since they too
implement PERF_SAMPLE_BRANCH_ bits.


Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 10:43:19PM +0800, Jin, Yao wrote:
> 
> 
> On 4/6/2017 5:25 PM, Peter Zijlstra wrote:
> > On Thu, Apr 06, 2017 at 04:21:06PM +0800, Jin, Yao wrote:
> > > Hi, otherwise we have to maintain 2 branch type copies between kernel and
> > > user-space.
> > > 
> > > For example, currently X86_BR_* are defined in lbr.c. To display the 
> > > branch
> > > type in user-space, the user-space has to maintain the same copy for
> > > X86_BR_*. I didn't get a better idea.
> > I still don't understand what you want; or why it would matter.
> > 
> > Those specific macros are for hardware LBR filter emulation/fixup. What
> > does that have to do with any userspace crud?
> 
> I just want to provide a new feature that the user can directly check branch
> type
> in perf report, instead of looking it up in the binary. Binary could be not
> available
> later, so it's possible that userspace can't get the branch type.
> 
> The X86_BR are generated when disassembling the branch instruction in
> kernel.
> They can be considered as the x86 branch types.
> 
> It's easy to let kernel return the x86 branch types to userspace, and then
> userspace
> shows the branch type in perf report.
> 
> While kernel and userspace have to maintain the X86_BR definitions. One copy
> is in
> kernel and the other copy is in userspace. To avoid the duplicate
> definitions , I define
> the common branch type in perf_event.h to share between kernel and
> userspace.
> That's why I do that.

Argh, fix your mailer. That is unreadable.

/me reflows...

> I just want to provide a new feature that the user can directly check
> branch type in perf report, instead of looking it up in the binary.
> Binary could be not available later, so it's possible that userspace
> can't get the branch type.
> 
> The X86_BR are generated when disassembling the branch instruction in
> kernel.  They can be considered as the x86 branch types.
> 
> It's easy to let kernel return the x86 branch types to userspace, and
> then userspace shows the branch type in perf report.
> 
> While kernel and userspace have to maintain the X86_BR definitions.
> One copy is in kernel and the other copy is in userspace. To avoid the
> duplicate definitions , I define the common branch type in
> perf_event.h to share between kernel and userspace.  That's why I do
> that.

See, that's so much better..

Oh, so you _ARE_ adding a kernel feature? I understood you only wanted
to change perf-report.

WTH didn't you Cc the maintainers?

Also, if you do this, you need to Cc the PowerPC people, since they too
implement PERF_SAMPLE_BRANCH_ bits.


Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Jin, Yao



On 4/6/2017 5:25 PM, Peter Zijlstra wrote:

On Thu, Apr 06, 2017 at 04:21:06PM +0800, Jin, Yao wrote:

Hi, otherwise we have to maintain 2 branch type copies between kernel and
user-space.

For example, currently X86_BR_* are defined in lbr.c. To display the branch
type in user-space, the user-space has to maintain the same copy for
X86_BR_*. I didn't get a better idea.

I still don't understand what you want; or why it would matter.

Those specific macros are for hardware LBR filter emulation/fixup. What
does that have to do with any userspace crud?


I just want to provide a new feature that the user can directly check 
branch type
in perf report, instead of looking it up in the binary. Binary could be 
not available

later, so it's possible that userspace can't get the branch type.

The X86_BR are generated when disassembling the branch instruction in 
kernel.

They can be considered as the x86 branch types.

It's easy to let kernel return the x86 branch types to userspace, and 
then userspace

shows the branch type in perf report.

While kernel and userspace have to maintain the X86_BR definitions. One 
copy is in
kernel and the other copy is in userspace. To avoid the duplicate 
definitions , I define
the common branch type in perf_event.h to share between kernel and 
userspace.

That's why I do that.

Thanks
Jin Yao



Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Jin, Yao



On 4/6/2017 5:25 PM, Peter Zijlstra wrote:

On Thu, Apr 06, 2017 at 04:21:06PM +0800, Jin, Yao wrote:

Hi, otherwise we have to maintain 2 branch type copies between kernel and
user-space.

For example, currently X86_BR_* are defined in lbr.c. To display the branch
type in user-space, the user-space has to maintain the same copy for
X86_BR_*. I didn't get a better idea.

I still don't understand what you want; or why it would matter.

Those specific macros are for hardware LBR filter emulation/fixup. What
does that have to do with any userspace crud?


I just want to provide a new feature that the user can directly check 
branch type
in perf report, instead of looking it up in the binary. Binary could be 
not available

later, so it's possible that userspace can't get the branch type.

The X86_BR are generated when disassembling the branch instruction in 
kernel.

They can be considered as the x86 branch types.

It's easy to let kernel return the x86 branch types to userspace, and 
then userspace

shows the branch type in perf report.

While kernel and userspace have to maintain the X86_BR definitions. One 
copy is in
kernel and the other copy is in userspace. To avoid the duplicate 
definitions , I define
the common branch type in perf_event.h to share between kernel and 
userspace.

That's why I do that.

Thanks
Jin Yao



Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 04:21:06PM +0800, Jin, Yao wrote:
> Hi, otherwise we have to maintain 2 branch type copies between kernel and
> user-space.
> 
> For example, currently X86_BR_* are defined in lbr.c. To display the branch
> type in user-space, the user-space has to maintain the same copy for
> X86_BR_*. I didn't get a better idea.

I still don't understand what you want; or why it would matter.

Those specific macros are for hardware LBR filter emulation/fixup. What
does that have to do with any userspace crud?


Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 04:21:06PM +0800, Jin, Yao wrote:
> Hi, otherwise we have to maintain 2 branch type copies between kernel and
> user-space.
> 
> For example, currently X86_BR_* are defined in lbr.c. To display the branch
> type in user-space, the user-space has to maintain the same copy for
> X86_BR_*. I didn't get a better idea.

I still don't understand what you want; or why it would matter.

Those specific macros are for hardware LBR filter emulation/fixup. What
does that have to do with any userspace crud?


Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Jin, Yao



On 4/6/2017 2:58 PM, Peter Zijlstra wrote:

On Tue, Apr 04, 2017 at 11:18:05AM -0300, Arnaldo Carvalho de Melo wrote:

Adding the perf kernel maintainers to the CC list.

Thanks.


Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:

It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

I don't get it. Why is the kernel interface mucked with for a user-space
feature?

That's wrong.
Hi, otherwise we have to maintain 2 branch type copies between kernel 
and user-space.


For example, currently X86_BR_* are defined in lbr.c. To display the 
branch type in user-space, the user-space has to maintain the same copy 
for X86_BR_*. I didn't get a better idea.


Thanks
Jin Yao



Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Jin, Yao



On 4/6/2017 2:58 PM, Peter Zijlstra wrote:

On Tue, Apr 04, 2017 at 11:18:05AM -0300, Arnaldo Carvalho de Melo wrote:

Adding the perf kernel maintainers to the CC list.

Thanks.


Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:

It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

I don't get it. Why is the kernel interface mucked with for a user-space
feature?

That's wrong.
Hi, otherwise we have to maintain 2 branch type copies between kernel 
and user-space.


For example, currently X86_BR_* are defined in lbr.c. To display the 
branch type in user-space, the user-space has to maintain the same copy 
for X86_BR_*. I didn't get a better idea.


Thanks
Jin Yao



Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Peter Zijlstra
On Tue, Apr 04, 2017 at 11:18:05AM -0300, Arnaldo Carvalho de Melo wrote:
> Adding the perf kernel maintainers to the CC list.

Thanks.

> Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
> > It is often useful to know the branch types while analyzing branch
> > data. For example, a call is very different from a conditional branch.
> > 
> > Currently we have to look it up in binary while the binary may later
> > not be available and even the binary is available but user has to take
> > some time. It is very useful for user to check it directly in perf
> > report.
> > 
> > Perf already has support for disassembling the branch instruction
> > to get the branch type. The branch type is defined in lbr.c.
> > 
> > To keep consistent on kernel and userspace and make the classification
> > more common, the patch adds the common branch type classification
> > in perf_event.h.
> > 
> > Since the disassembling of branch instruction needs some overhead,
> > a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> > needs to disassemble the branch instruction and record the branch
> > type.

I don't get it. Why is the kernel interface mucked with for a user-space
feature?

That's wrong.


Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-06 Thread Peter Zijlstra
On Tue, Apr 04, 2017 at 11:18:05AM -0300, Arnaldo Carvalho de Melo wrote:
> Adding the perf kernel maintainers to the CC list.

Thanks.

> Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
> > It is often useful to know the branch types while analyzing branch
> > data. For example, a call is very different from a conditional branch.
> > 
> > Currently we have to look it up in binary while the binary may later
> > not be available and even the binary is available but user has to take
> > some time. It is very useful for user to check it directly in perf
> > report.
> > 
> > Perf already has support for disassembling the branch instruction
> > to get the branch type. The branch type is defined in lbr.c.
> > 
> > To keep consistent on kernel and userspace and make the classification
> > more common, the patch adds the common branch type classification
> > in perf_event.h.
> > 
> > Since the disassembling of branch instruction needs some overhead,
> > a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> > needs to disassemble the branch instruction and record the branch
> > type.

I don't get it. Why is the kernel interface mucked with for a user-space
feature?

That's wrong.


Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-05 Thread Jin, Yao



On 4/5/2017 12:09 AM, Arnaldo Carvalho de Melo wrote:

Em Tue, Apr 04, 2017 at 11:52:53PM +0800, Jin, Yao escreveu:


On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:

Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:

It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

Signed-off-by: Jin Yao 
---
   include/uapi/linux/perf_event.h   | 24 +++-
   tools/include/uapi/linux/perf_event.h | 24 +++-
   2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
+   PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT  = 16, /* save branch type */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
   };
@@ -198,9 +200,27 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
+   PERF_SAMPLE_BRANCH_TYPE_SAVE=
+   1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX  = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
   };
+/*
+ * Common flow change classification
+ */
+enum {
+   PERF_BR_NONE= 0,/* unknown */
+   PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */
+   PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */
+   PERF_BR_JMP = 1 << 3, /* jump */
+   PERF_BR_IND_JMP = 1 << 4, /* indirect jump */
+   PERF_BR_CALL= 1 << 5, /* call */
+   PERF_BR_IND_CALL= 1 << 6, /* indirect call */
+   PERF_BR_RET = 1 << 7, /* return */
+   PERF_BR_FAR_BRANCH  = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */

Humm, wouldn't be better to have those in separate buckets? I.e.

PERF_BR_SYSCALL
PERF_BR_SYSRET,
PERF_BR_IRQ

etc?

There are also other types. I.e. abort, . I use FAR_BRANCH is because I
just want to differentiate between basic branch types and others. (others
may be too much and platform specific).

I understand that this is what you need right now, but "syscall",
"sysret", "irq", look generic enough, no?

Really, really arch specific stuff could indeed be lumped together in a
FAR_BRANCH, but those used as an example, above (/*
SYSCALL,SYSRET,IRQ,... */) seems potentially useful to have untangled?


After considerations, yes, you are right. I will fix this in v2.

Thanks
Jin Yao


And why a bitmask? /me reads a bit more...  couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE= 0,/* unknown */
PERF_BR_JCC_FWD = 1,/* conditional forward jump */
PERF_BR_JCC_BWD = 2,/* conditional backward jump */
PERF_BR_JMP = 3,/* jump */
PERF_BR_IND_JMP = 4,/* indirect jump */
PERF_BR_CALL= 5,/* call */
PERF_BR_IND_CALL= 6,/* indirect call */
PERF_BR_RET = 7,/* return */
PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo

You are right. I made things more complicated. Yes, the definitions should
be clear and simple. I will redefine them in v2.

Thanks, I wasn't missing anything, uff :-)
  

Thanks
Jin Yao


+};
+
   #define PERF_SAMPLE_BRANCH_PLM_ALL \
(PERF_SAMPLE_BRANCH_USER|\
 PERF_SAMPLE_BRANCH_KERNEL|\
@@ -999,6 +1019,7 @@ union perf_mem_data_src {
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
*cycles: cycles from last branch (or 0 if not supported)
+ * 

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-05 Thread Jin, Yao



On 4/5/2017 12:09 AM, Arnaldo Carvalho de Melo wrote:

Em Tue, Apr 04, 2017 at 11:52:53PM +0800, Jin, Yao escreveu:


On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:

Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:

It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

Signed-off-by: Jin Yao 
---
   include/uapi/linux/perf_event.h   | 24 +++-
   tools/include/uapi/linux/perf_event.h | 24 +++-
   2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
+   PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT  = 16, /* save branch type */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
   };
@@ -198,9 +200,27 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
+   PERF_SAMPLE_BRANCH_TYPE_SAVE=
+   1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX  = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
   };
+/*
+ * Common flow change classification
+ */
+enum {
+   PERF_BR_NONE= 0,/* unknown */
+   PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */
+   PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */
+   PERF_BR_JMP = 1 << 3, /* jump */
+   PERF_BR_IND_JMP = 1 << 4, /* indirect jump */
+   PERF_BR_CALL= 1 << 5, /* call */
+   PERF_BR_IND_CALL= 1 << 6, /* indirect call */
+   PERF_BR_RET = 1 << 7, /* return */
+   PERF_BR_FAR_BRANCH  = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */

Humm, wouldn't be better to have those in separate buckets? I.e.

PERF_BR_SYSCALL
PERF_BR_SYSRET,
PERF_BR_IRQ

etc?

There are also other types. I.e. abort, . I use FAR_BRANCH is because I
just want to differentiate between basic branch types and others. (others
may be too much and platform specific).

I understand that this is what you need right now, but "syscall",
"sysret", "irq", look generic enough, no?

Really, really arch specific stuff could indeed be lumped together in a
FAR_BRANCH, but those used as an example, above (/*
SYSCALL,SYSRET,IRQ,... */) seems potentially useful to have untangled?


After considerations, yes, you are right. I will fix this in v2.

Thanks
Jin Yao


And why a bitmask? /me reads a bit more...  couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE= 0,/* unknown */
PERF_BR_JCC_FWD = 1,/* conditional forward jump */
PERF_BR_JCC_BWD = 2,/* conditional backward jump */
PERF_BR_JMP = 3,/* jump */
PERF_BR_IND_JMP = 4,/* indirect jump */
PERF_BR_CALL= 5,/* call */
PERF_BR_IND_CALL= 6,/* indirect call */
PERF_BR_RET = 7,/* return */
PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo

You are right. I made things more complicated. Yes, the definitions should
be clear and simple. I will redefine them in v2.

Thanks, I wasn't missing anything, uff :-)
  

Thanks
Jin Yao


+};
+
   #define PERF_SAMPLE_BRANCH_PLM_ALL \
(PERF_SAMPLE_BRANCH_USER|\
 PERF_SAMPLE_BRANCH_KERNEL|\
@@ -999,6 +1019,7 @@ union perf_mem_data_src {
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
*cycles: cycles from last branch (or 0 if not supported)
+ *  type: branch type
  

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-04 Thread Arnaldo Carvalho de Melo
Em Tue, Apr 04, 2017 at 11:52:53PM +0800, Jin, Yao escreveu:
> 
> 
> On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:
> > Adding the perf kernel maintainers to the CC list.
> > 
> > Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
> > > It is often useful to know the branch types while analyzing branch
> > > data. For example, a call is very different from a conditional branch.
> > > 
> > > Currently we have to look it up in binary while the binary may later
> > > not be available and even the binary is available but user has to take
> > > some time. It is very useful for user to check it directly in perf
> > > report.
> > > 
> > > Perf already has support for disassembling the branch instruction
> > > to get the branch type. The branch type is defined in lbr.c.
> > > 
> > > To keep consistent on kernel and userspace and make the classification
> > > more common, the patch adds the common branch type classification
> > > in perf_event.h.
> > > 
> > > Since the disassembling of branch instruction needs some overhead,
> > > a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> > > needs to disassemble the branch instruction and record the branch
> > > type.
> > > 
> > > Signed-off-by: Jin Yao 
> > > ---
> > >   include/uapi/linux/perf_event.h   | 24 +++-
> > >   tools/include/uapi/linux/perf_event.h | 24 +++-
> > >   2 files changed, 46 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/uapi/linux/perf_event.h 
> > > b/include/uapi/linux/perf_event.h
> > > index d09a9cd..4d731fd 100644
> > > --- a/include/uapi/linux/perf_event.h
> > > +++ b/include/uapi/linux/perf_event.h
> > > @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
> > >   PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
> > >   PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
> > > + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT  = 16, /* save branch type */
> > > +
> > >   PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
> > >   };
> > > @@ -198,9 +200,27 @@ enum perf_branch_sample_type {
> > >   PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
> > > PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
> > >   PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
> > > PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
> > > + PERF_SAMPLE_BRANCH_TYPE_SAVE=
> > > + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
> > > +
> > >   PERF_SAMPLE_BRANCH_MAX  = 1U << 
> > > PERF_SAMPLE_BRANCH_MAX_SHIFT,
> > >   };
> > > +/*
> > > + * Common flow change classification
> > > + */
> > > +enum {
> > > + PERF_BR_NONE= 0,/* unknown */
> > > + PERF_BR_JCC_FWD = 1 << 1,   /* conditional forward jump */
> > > + PERF_BR_JCC_BWD = 1 << 2,   /* conditional backward jump */
> > > + PERF_BR_JMP = 1 << 3,   /* jump */
> > > + PERF_BR_IND_JMP = 1 << 4,   /* indirect jump */
> > > + PERF_BR_CALL= 1 << 5,   /* call */
> > > + PERF_BR_IND_CALL= 1 << 6,   /* indirect call */
> > > + PERF_BR_RET = 1 << 7,   /* return */
> > > + PERF_BR_FAR_BRANCH  = 1 << 8,   /* SYSCALL,SYSRET,IRQ,... */
> > Humm, wouldn't be better to have those in separate buckets? I.e.
> > 
> >PERF_BR_SYSCALL
> >PERF_BR_SYSRET,
> >PERF_BR_IRQ
> > 
> > etc?
> 
> There are also other types. I.e. abort, . I use FAR_BRANCH is because I
> just want to differentiate between basic branch types and others. (others
> may be too much and platform specific).

I understand that this is what you need right now, but "syscall",
"sysret", "irq", look generic enough, no?

Really, really arch specific stuff could indeed be lumped together in a
FAR_BRANCH, but those used as an example, above (/*
SYSCALL,SYSRET,IRQ,... */) seems potentially useful to have untangled?

> > And why a bitmask? /me reads a bit more...  couldn't find a reason for
> > this:
> > 
> > + type:9, /* branch type */
> > 
> > Do you have a reason to use 9 bits? Why not just:
> > 
> > enum {
> > PERF_BR_NONE= 0,/* unknown */
> > PERF_BR_JCC_FWD = 1,/* conditional forward jump */
> > PERF_BR_JCC_BWD = 2,/* conditional backward jump */
> > PERF_BR_JMP = 3,/* jump */
> > PERF_BR_IND_JMP = 4,/* indirect jump */
> > PERF_BR_CALL= 5,/* call */
> > PERF_BR_IND_CALL= 6,/* indirect call */
> > PERF_BR_RET = 7,/* return */
> > PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */
> > 
> > And then use, say, 4 or 5 bits for that type field?
> > 
> > I must be missing something trivial ;-\
> > 
> > - Arnaldo
> 
> You are right. I made things more complicated. Yes, the definitions should
> be clear and simple. I will redefine them in v2.

Thanks, I wasn't missing anything, uff :-)
 
> Thanks
> Jin Yao
> 
> > 

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-04 Thread Arnaldo Carvalho de Melo
Em Tue, Apr 04, 2017 at 11:52:53PM +0800, Jin, Yao escreveu:
> 
> 
> On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:
> > Adding the perf kernel maintainers to the CC list.
> > 
> > Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
> > > It is often useful to know the branch types while analyzing branch
> > > data. For example, a call is very different from a conditional branch.
> > > 
> > > Currently we have to look it up in binary while the binary may later
> > > not be available and even the binary is available but user has to take
> > > some time. It is very useful for user to check it directly in perf
> > > report.
> > > 
> > > Perf already has support for disassembling the branch instruction
> > > to get the branch type. The branch type is defined in lbr.c.
> > > 
> > > To keep consistent on kernel and userspace and make the classification
> > > more common, the patch adds the common branch type classification
> > > in perf_event.h.
> > > 
> > > Since the disassembling of branch instruction needs some overhead,
> > > a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> > > needs to disassemble the branch instruction and record the branch
> > > type.
> > > 
> > > Signed-off-by: Jin Yao 
> > > ---
> > >   include/uapi/linux/perf_event.h   | 24 +++-
> > >   tools/include/uapi/linux/perf_event.h | 24 +++-
> > >   2 files changed, 46 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/uapi/linux/perf_event.h 
> > > b/include/uapi/linux/perf_event.h
> > > index d09a9cd..4d731fd 100644
> > > --- a/include/uapi/linux/perf_event.h
> > > +++ b/include/uapi/linux/perf_event.h
> > > @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
> > >   PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
> > >   PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
> > > + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT  = 16, /* save branch type */
> > > +
> > >   PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
> > >   };
> > > @@ -198,9 +200,27 @@ enum perf_branch_sample_type {
> > >   PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
> > > PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
> > >   PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
> > > PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
> > > + PERF_SAMPLE_BRANCH_TYPE_SAVE=
> > > + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
> > > +
> > >   PERF_SAMPLE_BRANCH_MAX  = 1U << 
> > > PERF_SAMPLE_BRANCH_MAX_SHIFT,
> > >   };
> > > +/*
> > > + * Common flow change classification
> > > + */
> > > +enum {
> > > + PERF_BR_NONE= 0,/* unknown */
> > > + PERF_BR_JCC_FWD = 1 << 1,   /* conditional forward jump */
> > > + PERF_BR_JCC_BWD = 1 << 2,   /* conditional backward jump */
> > > + PERF_BR_JMP = 1 << 3,   /* jump */
> > > + PERF_BR_IND_JMP = 1 << 4,   /* indirect jump */
> > > + PERF_BR_CALL= 1 << 5,   /* call */
> > > + PERF_BR_IND_CALL= 1 << 6,   /* indirect call */
> > > + PERF_BR_RET = 1 << 7,   /* return */
> > > + PERF_BR_FAR_BRANCH  = 1 << 8,   /* SYSCALL,SYSRET,IRQ,... */
> > Humm, wouldn't be better to have those in separate buckets? I.e.
> > 
> >PERF_BR_SYSCALL
> >PERF_BR_SYSRET,
> >PERF_BR_IRQ
> > 
> > etc?
> 
> There are also other types. I.e. abort, . I use FAR_BRANCH is because I
> just want to differentiate between basic branch types and others. (others
> may be too much and platform specific).

I understand that this is what you need right now, but "syscall",
"sysret", "irq", look generic enough, no?

Really, really arch specific stuff could indeed be lumped together in a
FAR_BRANCH, but those used as an example, above (/*
SYSCALL,SYSRET,IRQ,... */) seems potentially useful to have untangled?

> > And why a bitmask? /me reads a bit more...  couldn't find a reason for
> > this:
> > 
> > + type:9, /* branch type */
> > 
> > Do you have a reason to use 9 bits? Why not just:
> > 
> > enum {
> > PERF_BR_NONE= 0,/* unknown */
> > PERF_BR_JCC_FWD = 1,/* conditional forward jump */
> > PERF_BR_JCC_BWD = 2,/* conditional backward jump */
> > PERF_BR_JMP = 3,/* jump */
> > PERF_BR_IND_JMP = 4,/* indirect jump */
> > PERF_BR_CALL= 5,/* call */
> > PERF_BR_IND_CALL= 6,/* indirect call */
> > PERF_BR_RET = 7,/* return */
> > PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */
> > 
> > And then use, say, 4 or 5 bits for that type field?
> > 
> > I must be missing something trivial ;-\
> > 
> > - Arnaldo
> 
> You are right. I made things more complicated. Yes, the definitions should
> be clear and simple. I will redefine them in v2.

Thanks, I wasn't missing anything, uff :-)
 
> Thanks
> Jin Yao
> 
> > 
> > > +};
> > > +
> > >   

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-04 Thread Jin, Yao



On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:

Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:

It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

Signed-off-by: Jin Yao 
---
  include/uapi/linux/perf_event.h   | 24 +++-
  tools/include/uapi/linux/perf_event.h | 24 +++-
  2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
  
+	PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT	= 16, /* save branch type */

+
PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
  };
  
@@ -198,9 +200,27 @@ enum perf_branch_sample_type {

PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
  
+	PERF_SAMPLE_BRANCH_TYPE_SAVE	=

+   1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX  = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
  };
  
+/*

+ * Common flow change classification
+ */
+enum {
+   PERF_BR_NONE= 0,/* unknown */
+   PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */
+   PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */
+   PERF_BR_JMP = 1 << 3, /* jump */
+   PERF_BR_IND_JMP = 1 << 4, /* indirect jump */
+   PERF_BR_CALL= 1 << 5, /* call */
+   PERF_BR_IND_CALL= 1 << 6, /* indirect call */
+   PERF_BR_RET = 1 << 7, /* return */
+   PERF_BR_FAR_BRANCH  = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */

Humm, wouldn't be better to have those in separate buckets? I.e.

   PERF_BR_SYSCALL
   PERF_BR_SYSRET,
   PERF_BR_IRQ

etc?


There are also other types. I.e. abort, . I use FAR_BRANCH is 
because I just want to differentiate between basic branch types and 
others. (others may be too much and platform specific).




And why a bitmask? /me reads a bit more...  couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE= 0,/* unknown */
PERF_BR_JCC_FWD = 1,/* conditional forward jump */
PERF_BR_JCC_BWD = 2,/* conditional backward jump */
PERF_BR_JMP = 3,/* jump */
PERF_BR_IND_JMP = 4,/* indirect jump */
PERF_BR_CALL= 5,/* call */
PERF_BR_IND_CALL= 6,/* indirect call */
PERF_BR_RET = 7,/* return */
PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo


You are right. I made things more complicated. Yes, the definitions 
should be clear and simple. I will redefine them in v2.


Thanks
Jin Yao




+};
+
  #define PERF_SAMPLE_BRANCH_PLM_ALL \
(PERF_SAMPLE_BRANCH_USER|\
 PERF_SAMPLE_BRANCH_KERNEL|\
@@ -999,6 +1019,7 @@ union perf_mem_data_src {
   * in_tx: running in a hardware transaction
   * abort: aborting a hardware transaction
   *cycles: cycles from last branch (or 0 if not supported)
+ *  type: branch type
   */
  struct perf_branch_entry {
__u64   from;
@@ -1008,7 +1029,8 @@ struct perf_branch_entry {
in_tx:1,/* in transaction */
abort:1,/* transaction abort */
cycles:16,  /* cycle count to last branch */
-   reserved:44;
+   type:9, /* branch type */
+   reserved:35;
  };
  
  #endif /* _UAPI_LINUX_PERF_EVENT_H */

diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-04 Thread Jin, Yao



On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:

Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:

It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

Signed-off-by: Jin Yao 
---
  include/uapi/linux/perf_event.h   | 24 +++-
  tools/include/uapi/linux/perf_event.h | 24 +++-
  2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
  
+	PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT	= 16, /* save branch type */

+
PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
  };
  
@@ -198,9 +200,27 @@ enum perf_branch_sample_type {

PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
  
+	PERF_SAMPLE_BRANCH_TYPE_SAVE	=

+   1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX  = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
  };
  
+/*

+ * Common flow change classification
+ */
+enum {
+   PERF_BR_NONE= 0,/* unknown */
+   PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */
+   PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */
+   PERF_BR_JMP = 1 << 3, /* jump */
+   PERF_BR_IND_JMP = 1 << 4, /* indirect jump */
+   PERF_BR_CALL= 1 << 5, /* call */
+   PERF_BR_IND_CALL= 1 << 6, /* indirect call */
+   PERF_BR_RET = 1 << 7, /* return */
+   PERF_BR_FAR_BRANCH  = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */

Humm, wouldn't be better to have those in separate buckets? I.e.

   PERF_BR_SYSCALL
   PERF_BR_SYSRET,
   PERF_BR_IRQ

etc?


There are also other types. I.e. abort, . I use FAR_BRANCH is 
because I just want to differentiate between basic branch types and 
others. (others may be too much and platform specific).




And why a bitmask? /me reads a bit more...  couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE= 0,/* unknown */
PERF_BR_JCC_FWD = 1,/* conditional forward jump */
PERF_BR_JCC_BWD = 2,/* conditional backward jump */
PERF_BR_JMP = 3,/* jump */
PERF_BR_IND_JMP = 4,/* indirect jump */
PERF_BR_CALL= 5,/* call */
PERF_BR_IND_CALL= 6,/* indirect call */
PERF_BR_RET = 7,/* return */
PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo


You are right. I made things more complicated. Yes, the definitions 
should be clear and simple. I will redefine them in v2.


Thanks
Jin Yao




+};
+
  #define PERF_SAMPLE_BRANCH_PLM_ALL \
(PERF_SAMPLE_BRANCH_USER|\
 PERF_SAMPLE_BRANCH_KERNEL|\
@@ -999,6 +1019,7 @@ union perf_mem_data_src {
   * in_tx: running in a hardware transaction
   * abort: aborting a hardware transaction
   *cycles: cycles from last branch (or 0 if not supported)
+ *  type: branch type
   */
  struct perf_branch_entry {
__u64   from;
@@ -1008,7 +1029,8 @@ struct perf_branch_entry {
in_tx:1,/* in transaction */
abort:1,/* transaction abort */
cycles:16,  /* cycle count to last branch */
-   reserved:44;
+   type:9, /* branch type */
+   reserved:35;
  };
  
  #endif /* _UAPI_LINUX_PERF_EVENT_H */

diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- 

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-04 Thread Arnaldo Carvalho de Melo
Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
> It is often useful to know the branch types while analyzing branch
> data. For example, a call is very different from a conditional branch.
> 
> Currently we have to look it up in binary while the binary may later
> not be available and even the binary is available but user has to take
> some time. It is very useful for user to check it directly in perf
> report.
> 
> Perf already has support for disassembling the branch instruction
> to get the branch type. The branch type is defined in lbr.c.
> 
> To keep consistent on kernel and userspace and make the classification
> more common, the patch adds the common branch type classification
> in perf_event.h.
> 
> Since the disassembling of branch instruction needs some overhead,
> a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> needs to disassemble the branch instruction and record the branch
> type.
> 
> Signed-off-by: Jin Yao 
> ---
>  include/uapi/linux/perf_event.h   | 24 +++-
>  tools/include/uapi/linux/perf_event.h | 24 +++-
>  2 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index d09a9cd..4d731fd 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
>   PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
>   PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
>  
> + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT  = 16, /* save branch type */
> +
>   PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
>  };
>  
> @@ -198,9 +200,27 @@ enum perf_branch_sample_type {
>   PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
> PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
>   PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
> PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
>  
> + PERF_SAMPLE_BRANCH_TYPE_SAVE=
> + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
> +
>   PERF_SAMPLE_BRANCH_MAX  = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
>  };
>  
> +/*
> + * Common flow change classification
> + */
> +enum {
> + PERF_BR_NONE= 0,/* unknown */
> + PERF_BR_JCC_FWD = 1 << 1,   /* conditional forward jump */
> + PERF_BR_JCC_BWD = 1 << 2,   /* conditional backward jump */
> + PERF_BR_JMP = 1 << 3,   /* jump */
> + PERF_BR_IND_JMP = 1 << 4,   /* indirect jump */
> + PERF_BR_CALL= 1 << 5,   /* call */
> + PERF_BR_IND_CALL= 1 << 6,   /* indirect call */
> + PERF_BR_RET = 1 << 7,   /* return */
> + PERF_BR_FAR_BRANCH  = 1 << 8,   /* SYSCALL,SYSRET,IRQ,... */

Humm, wouldn't be better to have those in separate buckets? I.e.

  PERF_BR_SYSCALL
  PERF_BR_SYSRET,
  PERF_BR_IRQ

etc?

And why a bitmask? /me reads a bit more...  couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE= 0,/* unknown */
PERF_BR_JCC_FWD = 1,/* conditional forward jump */
PERF_BR_JCC_BWD = 2,/* conditional backward jump */
PERF_BR_JMP = 3,/* jump */
PERF_BR_IND_JMP = 4,/* indirect jump */
PERF_BR_CALL= 5,/* call */
PERF_BR_IND_CALL= 6,/* indirect call */
PERF_BR_RET = 7,/* return */
PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo

> +};
> +
>  #define PERF_SAMPLE_BRANCH_PLM_ALL \
>   (PERF_SAMPLE_BRANCH_USER|\
>PERF_SAMPLE_BRANCH_KERNEL|\
> @@ -999,6 +1019,7 @@ union perf_mem_data_src {
>   * in_tx: running in a hardware transaction
>   * abort: aborting a hardware transaction
>   *cycles: cycles from last branch (or 0 if not supported)
> + *  type: branch type
>   */
>  struct perf_branch_entry {
>   __u64   from;
> @@ -1008,7 +1029,8 @@ struct perf_branch_entry {
>   in_tx:1,/* in transaction */
>   abort:1,/* transaction abort */
>   cycles:16,  /* cycle count to last branch */
> - reserved:44;
> + type:9, /* branch type */
> + reserved:35;
>  };
>  
>  #endif /* _UAPI_LINUX_PERF_EVENT_H */
> diff --git a/tools/include/uapi/linux/perf_event.h 
> b/tools/include/uapi/linux/perf_event.h
> index d09a9cd..4d731fd 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
>   PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
>   

Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

2017-04-04 Thread Arnaldo Carvalho de Melo
Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
> It is often useful to know the branch types while analyzing branch
> data. For example, a call is very different from a conditional branch.
> 
> Currently we have to look it up in binary while the binary may later
> not be available and even the binary is available but user has to take
> some time. It is very useful for user to check it directly in perf
> report.
> 
> Perf already has support for disassembling the branch instruction
> to get the branch type. The branch type is defined in lbr.c.
> 
> To keep consistent on kernel and userspace and make the classification
> more common, the patch adds the common branch type classification
> in perf_event.h.
> 
> Since the disassembling of branch instruction needs some overhead,
> a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> needs to disassemble the branch instruction and record the branch
> type.
> 
> Signed-off-by: Jin Yao 
> ---
>  include/uapi/linux/perf_event.h   | 24 +++-
>  tools/include/uapi/linux/perf_event.h | 24 +++-
>  2 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index d09a9cd..4d731fd 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
>   PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
>   PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT  = 15, /* no cycles */
>  
> + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT  = 16, /* save branch type */
> +
>   PERF_SAMPLE_BRANCH_MAX_SHIFT/* non-ABI */
>  };
>  
> @@ -198,9 +200,27 @@ enum perf_branch_sample_type {
>   PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 
> PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
>   PERF_SAMPLE_BRANCH_NO_CYCLES= 1U << 
> PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
>  
> + PERF_SAMPLE_BRANCH_TYPE_SAVE=
> + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
> +
>   PERF_SAMPLE_BRANCH_MAX  = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
>  };
>  
> +/*
> + * Common flow change classification
> + */
> +enum {
> + PERF_BR_NONE= 0,/* unknown */
> + PERF_BR_JCC_FWD = 1 << 1,   /* conditional forward jump */
> + PERF_BR_JCC_BWD = 1 << 2,   /* conditional backward jump */
> + PERF_BR_JMP = 1 << 3,   /* jump */
> + PERF_BR_IND_JMP = 1 << 4,   /* indirect jump */
> + PERF_BR_CALL= 1 << 5,   /* call */
> + PERF_BR_IND_CALL= 1 << 6,   /* indirect call */
> + PERF_BR_RET = 1 << 7,   /* return */
> + PERF_BR_FAR_BRANCH  = 1 << 8,   /* SYSCALL,SYSRET,IRQ,... */

Humm, wouldn't be better to have those in separate buckets? I.e.

  PERF_BR_SYSCALL
  PERF_BR_SYSRET,
  PERF_BR_IRQ

etc?

And why a bitmask? /me reads a bit more...  couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE= 0,/* unknown */
PERF_BR_JCC_FWD = 1,/* conditional forward jump */
PERF_BR_JCC_BWD = 2,/* conditional backward jump */
PERF_BR_JMP = 3,/* jump */
PERF_BR_IND_JMP = 4,/* indirect jump */
PERF_BR_CALL= 5,/* call */
PERF_BR_IND_CALL= 6,/* indirect call */
PERF_BR_RET = 7,/* return */
PERF_BR_FAR_BRANCH  = 8,/* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo

> +};
> +
>  #define PERF_SAMPLE_BRANCH_PLM_ALL \
>   (PERF_SAMPLE_BRANCH_USER|\
>PERF_SAMPLE_BRANCH_KERNEL|\
> @@ -999,6 +1019,7 @@ union perf_mem_data_src {
>   * in_tx: running in a hardware transaction
>   * abort: aborting a hardware transaction
>   *cycles: cycles from last branch (or 0 if not supported)
> + *  type: branch type
>   */
>  struct perf_branch_entry {
>   __u64   from;
> @@ -1008,7 +1029,8 @@ struct perf_branch_entry {
>   in_tx:1,/* in transaction */
>   abort:1,/* transaction abort */
>   cycles:16,  /* cycle count to last branch */
> - reserved:44;
> + type:9, /* branch type */
> + reserved:35;
>  };
>  
>  #endif /* _UAPI_LINUX_PERF_EVENT_H */
> diff --git a/tools/include/uapi/linux/perf_event.h 
> b/tools/include/uapi/linux/perf_event.h
> index d09a9cd..4d731fd 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
>   PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT   = 14, /* no flags */
>