[PATCH] perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data

2015-12-06 Thread Ravi Bangoria
While recording guest samples on the host with perf kvm record, perf
reports an 'unprocessable samples' warning even though the samples are
recorded properly. When generating a report with perf kvm report, no
samples are processed and the same warning appears. We have seen this
behaviour with upstream perf (4.4-rc3) on x86 and ppc64 hardware.

The failure occurs because the lookup of the machine in the rb_tree of
machines fails. While tracing the bug, we found that this code was
incorrectly refactored in commit 54245fdc357613633954bfd38cffb71cb9def067
("perf session: Remove wrappers to machines__find").

This patch changes the lookup so that if the machine can't be found on
the first attempt, a machine node is created and added to the rb_tree.
The next lookup of the same machine in the rb_tree then succeeds. This
was the behaviour before the refactoring in the aforementioned commit.
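
A minimal sketch of the difference between a plain lookup and a
find-or-create lookup, assuming a simple unbalanced binary tree in place
of perf's rb_tree. The names machine_node, node_find and node_findnew are
hypothetical and are not the perf API:

```c
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct machine; only the key is kept. */
struct machine_node {
	int pid;
	struct machine_node *left, *right;
};

/* Lookup only: returns NULL when no node with this pid exists. */
static struct machine_node *node_find(struct machine_node *root, int pid)
{
	while (root && root->pid != pid)
		root = pid < root->pid ? root->left : root->right;
	return root;
}

/* Lookup, and insert a fresh node on a miss (the "findnew" behaviour). */
static struct machine_node *node_findnew(struct machine_node **rootp, int pid)
{
	struct machine_node **link = rootp;

	while (*link) {
		if ((*link)->pid == pid)
			return *link;
		link = pid < (*link)->pid ? &(*link)->left : &(*link)->right;
	}

	/* Miss: allocate a zeroed node and hook it in where the search ended. */
	*link = calloc(1, sizeof(**link));
	if (*link)
		(*link)->pid = pid;
	return *link;
}
```

With the lookup-only variant, the first sample for a guest that has no
node yet gets NULL back and cannot be attributed to a machine. With the
find-or-create variant, the first miss creates the node and every later
lookup returns it.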

This patch is generated against the acme perf/core branch.

Below is an example that demonstrates the behaviour before and after
applying the patch.

Before applying patch:
[Note: One needs to run guest before recording data in host]

ravi@ravi-bangoria:~$ ./perf kvm record -a
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
[ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]

ravi@ravi-bangoria:~$ ./perf kvm report --stdio
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 285  of event 'cycles'
# Event count (approx.): 88715406
#
# Overhead  Command  Shared Object  Symbol
#   ...  .  ..
#

#
# (For a higher level overview, try: perf report --sort comm,dso)
#

After applying patch:

ravi@ravi-bangoria:~$ ./perf kvm record -a
[ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]

ravi@ravi-bangoria:~$ ./perf kvm report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 17  of event 'cycles'
# Event count (approx.): 700746
#
# Overhead  Command  Shared Object Symbol
#   ...    ..
#
34.19%  :5758[unknown] [g] 0x818682ab
22.79%  :5758[unknown] [g] 0x812dc7f8
22.79%  :5758[unknown] [g] 0x818650d0
14.83%  :5758[unknown] [g] 0x8161a1b6
 2.49%  :5758[unknown] [g] 0x818692bf
 0.48%  :5758[unknown] [g] 0x81869253
 0.05%  :5758[unknown] [g] 0x81869250

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/session.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c35ffdd..468de95 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -972,7 +972,7 @@ static struct machine *machines__find_for_cpumode(struct machines *machines,
 
machine = machines__find(machines, pid);
if (!machine)
-   machine = machines__find(machines, DEFAULT_GUEST_KERNEL_ID);
+   machine = machines__findnew(machines, DEFAULT_GUEST_KERNEL_ID);
return machine;
}
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] perf kvm record/report: 'unprocessable sample' error while recording/reporting guest data

2015-12-16 Thread Ravi Bangoria

Hi acme,

I sent this patch a few days ago, but unfortunately nobody has paid
attention to it.

Could you please pick it up?

Regards,
Ravi

On Monday 07 December 2015 12:25 PM, Ravi Bangoria wrote:

[...]


[PATCH] perf/kvm: Guest Symbol Resolution for powerpc

2015-12-29 Thread Ravi Bangoria
 0.00%  :9688[guest.kernel.kallsyms]  [g] .plpar_hcall
 0.00%  :9689[guest.kernel.kallsyms]  [g] .__srcu_read_unlock
 0.00%  :9689[guest.kernel.kallsyms]  [g] ._raw_spin_lock
 0.00%  :9689[guest.kernel.kallsyms]  [g] .arch_local_irq_restore

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/Build  |  2 +
 tools/perf/arch/powerpc/util/evlist.c   | 22 ++
 tools/perf/arch/powerpc/util/parse-tp.c | 74 +
 tools/perf/builtin-annotate.c   |  3 +-
 tools/perf/builtin-diff.c   |  3 +-
 tools/perf/builtin-mem.c| 10 +++--
 tools/perf/builtin-report.c |  3 +-
 tools/perf/builtin-script.c |  3 +-
 tools/perf/builtin-timechart.c  |  8 ++--
 tools/perf/builtin-top.c|  3 +-
 tools/perf/tests/hists_cumulate.c   |  2 +-
 tools/perf/tests/hists_filter.c |  2 +-
 tools/perf/tests/hists_link.c   |  4 +-
 tools/perf/tests/hists_output.c |  2 +-
 tools/perf/util/event.c | 15 +--
 tools/perf/util/event.h |  3 +-
 tools/perf/util/evlist.c|  8 
 tools/perf/util/evlist.h|  1 +
 tools/perf/util/evsel.c |  7 
 tools/perf/util/evsel.h |  3 ++
 tools/perf/util/session.c   |  9 ++--
 tools/perf/util/util.c  |  5 +++
 tools/perf/util/util.h  |  2 +
 23 files changed, 169 insertions(+), 25 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/evlist.c
 create mode 100644 tools/perf/arch/powerpc/util/parse-tp.c

diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index 7b8b0d1..edd08e4 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -1,5 +1,7 @@
 libperf-y += header.o
 libperf-y += sym-handling.o
+libperf-y += parse-tp.o
+libperf-y += evlist.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
diff --git a/tools/perf/arch/powerpc/util/evlist.c b/tools/perf/arch/powerpc/util/evlist.c
new file mode 100644
index 000..6a16d72
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/evlist.c
@@ -0,0 +1,22 @@
+#include 
+#include "../../util/evsel.h"
+#include "../../util/evlist.h"
+
+/*
+ * To sample for only guest, record kvm_hv:kvm_guest_exit.
+ * Otherwise go via normal way(cycles).
+ */
+int perf_evlist__arch_add_default(struct perf_evlist *evlist)
+{
+   struct perf_evsel *evsel;
+
+   if (!perf_guest_only())
+   return -1;
+
+   evsel = perf_evsel__newtp_idx("kvm_hv", "kvm_guest_exit", 0);
+   if (IS_ERR(evsel))
+   return PTR_ERR(evsel);
+
+   perf_evlist__add(evlist, evsel);
+   return 0;
+}
diff --git a/tools/perf/arch/powerpc/util/parse-tp.c b/tools/perf/arch/powerpc/util/parse-tp.c
new file mode 100644
index 000..50c4ac8
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/parse-tp.c
@@ -0,0 +1,74 @@
+#include "../../util/evsel.h"
+#include "../../util/trace-event.h"
+#include "../../util/session.h"
+#include "../../util/util.h"
+
+#define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"
+#define HV_DECREMENTER 2432
+#define HV_BIT 3
+#define PR_BIT 49
+#define PPC_MAX 63
+
+static bool is_kvmppc_exit_event(struct perf_evsel *evsel)
+{
+   static unsigned int kvmppc_exit;
+
+   if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
+   return false;
+
+   if (unlikely(kvmppc_exit == 0)) {
+   if (strcmp(KVMPPC_EXIT, evsel->name))
+   return false;
+   kvmppc_exit = evsel->attr.config;
+   } else if (kvmppc_exit != evsel->attr.config) {
+   return false;
+   }
+
+   return true;
+}
+
+static bool is_hv_dec_trap(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+   int trap = perf_evsel__intval(evsel, sample, "trap");
+   return trap == HV_DECREMENTER;
+}
+
+/*
+ * Get the instruction pointer from the tracepoint data
+ */
+u64 arch__get_ip(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+   if (perf_guest_only() &&
+   is_kvmppc_exit_event(evsel) &&
+   is_hv_dec_trap(evsel, sample))
+   return perf_evsel__intval(evsel, sample, "pc");
+
+   return sample->ip;
+}
+
+/*
+ * Get the HV and PR bits and accordingly, determine the cpumode
+ */
+u8 arch__get_cpumode(const union perf_event *event, struct perf_evsel *evsel,
+struct perf_sample *sample)
+{
+   unsigned long hv, pr, msr;
+   u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+
+   if (!perf_guest_only() || !is_kvmppc_exit_event(e

[RFC 4/6] perf annotate: generalize handling of ret instructions

2016-06-24 Thread Ravi Bangoria
From: "Naveen N. Rao" 

Introduce a helper to detect return instructions and use it in the TUI.
A helper is needed because some architectures, such as powerpc, have more
than one return instruction.
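
A minimal sketch of the idea, assuming reduced versions of the ins and
ins_ops structures. The table contents and the ins_lookup helper are
illustrative only and are not taken from perf:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced versions of perf's ins_ops/ins structures. */
struct ins_ops { const char *label; };
struct ins { const char *name; const struct ins_ops *ops; };

static const struct ins_ops call_ops = { "call" };
static const struct ins_ops ret_ops  = { "ret" };

/* An instruction is a return if its table entry points at ret_ops. */
static bool ins_is_ret(const struct ins *ins)
{
	return ins->ops == &ret_ops;
}

/* Example table: several return mnemonics share the single ret_ops object. */
static const struct ins table[] = {
	{ "bl",    &call_ops },
	{ "blr",   &ret_ops },
	{ "beqlr", &ret_ops },
	{ "retq",  &ret_ops },
};

/* Linear search by mnemonic; returns NULL for unknown instructions. */
static const struct ins *ins_lookup(const char *name)
{
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (!strcmp(table[i].name, name))
			return &table[i];
	return NULL;
}
```

Comparing the ops pointer instead of the mnemonic string means the UI
code does not need to know which mnemonics an architecture uses for
returns.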

Signed-off-by: Naveen N. Rao 
---
 tools/perf/ui/browsers/annotate.c | 20 +---
 tools/perf/util/annotate.c| 10 ++
 tools/perf/util/annotate.h|  1 +
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index b65a979..288200f 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -223,16 +223,14 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
} else if (ins__is_call(dl->ins)) {
ui_browser__write_graph(browser, SLSMG_RARROW_CHAR);
SLsmg_write_char(' ');
+   } else if (ins__is_ret(dl->ins)) {
+   ui_browser__write_graph(browser, SLSMG_LARROW_CHAR);
+   SLsmg_write_char(' ');
} else {
ui_browser__write_nstring(browser, " ", 2);
}
} else {
-   if (strcmp(dl->name, "retq")) {
-   ui_browser__write_nstring(browser, " ", 2);
-   } else {
-   ui_browser__write_graph(browser, SLSMG_LARROW_CHAR);
-   SLsmg_write_char(' ');
-   }
+   ui_browser__write_nstring(browser, " ", 2);
}
 
disasm_line__scnprintf(dl, bf, sizeof(bf), !annotate_browser__opts.use_offset);
@@ -843,14 +841,14 @@ show_help:
ui_helpline__puts("Huh? No selection. Report to linux-kernel@vger.kernel.org");
else if (browser->selection->offset == -1)
ui_helpline__puts("Actions are only available for assembly lines.");
-   else if (!browser->selection->ins) {
-   if (strcmp(browser->selection->name, "retq"))
-   goto show_sup_ins;
+   else if (!browser->selection->ins)
+   goto show_sup_ins;
+   else if (ins__is_ret(browser->selection->ins))
goto out;
-   } else if (!(annotate_browser__jump(browser) ||
+   else if (!(annotate_browser__jump(browser) ||
 annotate_browser__callq(browser, evsel, hbt))) {
 show_sup_ins:
-   ui_helpline__puts("Actions are only available for 'callq', 'retq' & jump instructions.");
+   ui_helpline__puts("Actions are only available for function call/return & jump/branch instructions.");
}
continue;
case 't':
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e0dc7b2..634daf5 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -363,6 +363,15 @@ static struct ins_ops nop_ops = {
.scnprintf = nop__scnprintf,
 };
 
+static struct ins_ops ret_ops = {
+   .scnprintf = ins__raw_scnprintf,
+};
+
+bool ins__is_ret(const struct ins *ins)
+{
+   return ins->ops == &ret_ops;
+}
+
 static struct ins instructions_x86[] = {
{ .name = "add",   .ops  = &mov_ops, },
{ .name = "addl",  .ops  = &mov_ops, },
@@ -439,6 +448,7 @@ static struct ins instructions_x86[] = {
{ .name = "xadd",  .ops  = &mov_ops, },
{ .name = "xbeginl", .ops  = &jump_ops, },
{ .name = "xbeginq", .ops  = &jump_ops, },
+   { .name = "retq",  .ops  = &ret_ops, },
 };
 
 static struct ins instructions_arm[] = {
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index f7b669e..488c427 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -48,6 +48,7 @@ struct ins {
 
 bool ins__is_jump(const struct ins *ins);
 bool ins__is_call(const struct ins *ins);
+bool ins__is_ret(const struct ins *ins);
 int ins__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops);
 
 struct annotation;
-- 
2.5.5



[RFC 5/6] perf annotate: add powerpc support

2016-06-24 Thread Ravi Bangoria
From: "Naveen N. Rao" <naveen.n@linux.vnet.ibm.com>

Powerpc has a long list of branch instructions, and hardcoding them in a
table appears to be error-prone. So add a new function that classifies an
instruction by its name instead of creating a table.
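
The naming rules used by the function in the diff below can be sketched
as a standalone classifier. This is a simplified illustration, not the
patch itself: the enum ppc_kind and the function ppc_classify are
hypothetical names, and the sketch returns a kind instead of allocating
a struct ins:

```c
#include <string.h>

enum ppc_kind { PPC_OTHER, PPC_JUMP, PPC_CALL, PPC_RET };

/* Classify a powerpc mnemonic by its spelling alone. */
static enum ppc_kind ppc_classify(const char *name)
{
	size_t len = strlen(name);
	enum ppc_kind kind = PPC_JUMP;

	/* Only mnemonics starting with 'b' can be branches. */
	if (len == 0 || name[0] != 'b')
		return PPC_OTHER;

	/* Not branches at all, or the target is not known statically. */
	if (!strncmp(name, "bcd", 3) || !strncmp(name, "brinc", 5) ||
	    !strncmp(name, "bper", 4) || strstr(name, "ctr") ||
	    strstr(name, "tar"))
		return PPC_OTHER;

	/* Drop an optional '+' or '-' prediction hint at the end. */
	if (len > 1 && (name[len - 1] == '+' || name[len - 1] == '-'))
		len--;

	/* "bnl"/"bnla" mean "branch if not less than"; they are not calls. */
	int is_bnl = (len == 3 && !strncmp(name, "bnl", 3)) ||
		     (len == 4 && !strncmp(name, "bnla", 4));

	/* A trailing 'l' or 'la' updates LR, so treat it as a call. */
	if (!is_bnl && (name[len - 1] == 'l' ||
			(len >= 2 && name[len - 1] == 'a' &&
			 name[len - 2] == 'l')))
		kind = PPC_CALL;

	/* A trailing 'lr' branches to LR, so treat it as a return. */
	if (len >= 2 && name[len - 1] == 'r' && name[len - 2] == 'l')
		kind = PPC_RET;

	return kind;
}
```

Returning a plain enum keeps the sketch free of allocations, so there is
nothing to release on the early-exit paths.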

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/annotate.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 634daf5..ad01825 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -492,6 +492,68 @@ static struct ins *ins__find(const char *name)
   sizeof(struct ins), ins__key_cmp);
 }
 
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   return NULL;
+
+   if (name[0] == 'b') {
+   /* branch instructions */
+   ins->ops = &jump_ops;
+
+   /*
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (!strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;
+
+   i = strlen(name) - 1;
+   if (i < 0)
+   return NULL;
+
+   /* ignore optional hints at the end of the instructions */
+   if (name[i] == '+' || name[i] == '-')
+   i--;
+
+   if (name[i] == 'l' || (name[i] == 'a' && name[i-1] == 'l')) {
+   /*
+* if the instruction ends up with 'l' or 'la', then
+* those are considered 'calls' since they update LR.
+* ... except for 'bnl' which is branch if not less than
+* and the absolute form of the same.
+*/
+   if (strcmp(name, "bnl") && strcmp(name, "bnl+") &&
+   strcmp(name, "bnl-") && strcmp(name, "bnla") &&
+   strcmp(name, "bnla+") && strcmp(name, "bnla-"))
+   ins->ops = &call_ops;
+   }
+   if (name[i] == 'r' && name[i-1] == 'l')
+   /*
+* instructions ending with 'lr' are considered to be
+* return instructions
+*/
+   ins->ops = &ret_ops;
+
+   return ins;
+   }
+   return NULL;
+}
+
 static void __init_arch_ins(const char *arch, struct ins *instructions,
int size, struct ins *(*func)(const char *))
 {
@@ -513,6 +575,8 @@ static int _init_arch_ins(const char *norm_arch)
__init_arch_ins(norm_arch, instructions_arm,
ARRAY_SIZE(instructions_arm),
ins__find);
+   else if (!strcmp(norm_arch, PERF_ARCH_POWERPC))
+   __init_arch_ins(norm_arch, NULL, 0, ins__find_powerpc);
else
return -1;
 
-- 
2.5.5



[RFC 6/6] perf: add more triplets

2016-06-24 Thread Ravi Bangoria
Add a few more triplets based on the Fedora and Ubuntu binutils
cross-tool packages.

Before applying patch on x86:
  ( Install binutils-powerpc64-linux-gnu.x86_64 )
  $ perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc \
  --objdump powerpc64-linux-gnu-objdump

After applying patch on x86:
  $ perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc
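
A minimal sketch of how a triplet list can be used to locate a cross
objdump, assuming a caller-supplied exists() callback in place of a real
$PATH search. The function find_cross_tool and the callback fake_exists
are hypothetical:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Prefixes taken from the powerpc list in this patch. */
static const char *const ppc_triplets[] = {
	"powerpc-unknown-linux-gnu-",
	"powerpc64-unknown-linux-gnu-",
	"powerpc64-linux-gnu-",
	"powerpc64le-linux-gnu-",
	NULL
};

/*
 * Try "<triplet><tool>" for each prefix and return the index of the first
 * name accepted by exists(), or -1 if none is. On success, out holds the
 * full tool name. exists() is a hypothetical callback; a real
 * implementation would search $PATH.
 */
static int find_cross_tool(const char *const *triplets, const char *tool,
			   bool (*exists)(const char *), char *out,
			   size_t outsz)
{
	for (int i = 0; triplets[i] != NULL; i++) {
		int n = snprintf(out, outsz, "%s%s", triplets[i], tool);

		if (n < 0 || (size_t)n >= outsz)
			continue;	/* name did not fit; skip it */
		if (exists(out))
			return i;
	}
	return -1;
}

/* Pretend only the binary named in the example above is installed. */
static bool fake_exists(const char *path)
{
	return strcmp(path, "powerpc64-linux-gnu-objdump") == 0;
}
```

With only the two "unknown" prefixes in the list, the lookup fails and
--objdump has to be passed by hand. With the added prefixes it finds the
installed binary.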

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/common.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index 7da6ac7..93afa60 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -9,34 +9,44 @@ const char *const arm_triplets[] = {
"arm-unknown-linux-",
"arm-unknown-linux-gnu-",
"arm-unknown-linux-gnueabi-",
+   "arm-linux-gnu-",
+   "arm-linux-gnueabihf-",
+   "arm-none-eabi-",
NULL
 };
 
 const char *const arm64_triplets[] = {
"aarch64-linux-android-",
+   "aarch64-linux-gnu-",
NULL
 };
 
 const char *const powerpc_triplets[] = {
"powerpc-unknown-linux-gnu-",
"powerpc64-unknown-linux-gnu-",
+   "powerpc64-linux-gnu-",
+   "powerpc64le-linux-gnu-",
NULL
 };
 
 const char *const s390_triplets[] = {
"s390-ibm-linux-",
+   "s390x-linux-gnu-",
NULL
 };
 
 const char *const sh_triplets[] = {
"sh-unknown-linux-gnu-",
"sh64-unknown-linux-gnu-",
+   "sh-linux-gnu-",
+   "sh64-linux-gnu-",
NULL
 };
 
 const char *const sparc_triplets[] = {
"sparc-unknown-linux-gnu-",
"sparc64-unknown-linux-gnu-",
+   "sparc64-linux-gnu-",
NULL
 };
 
@@ -49,12 +59,19 @@ const char *const x86_triplets[] = {
"i386-pc-linux-gnu-",
"i686-linux-android-",
"i686-android-linux-",
+   "x86_64-linux-gnu-",
+   "i586-linux-gnu-",
NULL
 };
 
 const char *const mips_triplets[] = {
"mips-unknown-linux-gnu-",
"mipsel-linux-android-",
+   "mips-linux-gnu-",
+   "mips64-linux-gnu-",
+   "mips64el-linux-gnuabi64-",
+   "mips64-linux-gnuabi64-",
+   "mipsel-linux-gnu-",
NULL
 };
 
-- 
2.5.5



[RFC 3/6] perf annotate: Enable cross arch annotate

2016-06-24 Thread Ravi Bangoria
Change the current data structures and functions to enable cross arch
annotate, and add support for x86 and arm instructions.

The current implementation has no logic for recording on one arch and
annotating on another. Such remote annotation is only partially possible
today, for x86 (and maybe arm as well). To make remote annotation work
properly, the instruction tables of all architectures need to be included
in the perf binary, and while annotating, perf needs to look up the
instruction table of the arch where perf.data was recorded.

For arm, a few instructions were defined under #if __arm__, and I've used
those as the table for arm. But I'm not sure whether the instructions
defined outside of that block also include arm instructions.

Note:
Here arch_ins is a global variable, and init_arch_ins will be called every
time we annotate a symbol, so I still need to optimize this. Maybe
arch_ins should be made per session. Please suggest the best way to do it.
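
A minimal sketch of selecting an instruction table by the recorded arch,
assuming tiny illustrative tables. The names arch_table, arch_tables and
arch_ins_find are hypothetical, and the arch is passed per call instead
of being kept in a global:

```c
#include <stdlib.h>
#include <string.h>

struct ins { const char *name; const char *kind; };

/* Tiny, sorted, illustrative tables; not the real perf tables. */
static struct ins ins_x86[] = {
	{ "callq", "call" }, { "jmp", "jump" }, { "retq", "ret" },
};
static struct ins ins_arm[] = {
	{ "b", "jump" }, { "bl", "call" },
};

struct arch_table {
	const char *arch;
	struct ins *ins;
	size_t nmemb;
};

static const struct arch_table arch_tables[] = {
	{ "x86", ins_x86, sizeof(ins_x86) / sizeof(ins_x86[0]) },
	{ "arm", ins_arm, sizeof(ins_arm) / sizeof(ins_arm[0]) },
};

/* bsearch comparator: key is a mnemonic, element is a struct ins. */
static int ins_key_cmp(const void *name, const void *insp)
{
	return strcmp(name, ((const struct ins *)insp)->name);
}

/* Pick the table for the arch the data was recorded on, then search it. */
static const struct ins *arch_ins_find(const char *arch, const char *name)
{
	size_t n = sizeof(arch_tables) / sizeof(arch_tables[0]);

	for (size_t i = 0; i < n; i++) {
		const struct arch_table *t = &arch_tables[i];

		if (!strcmp(t->arch, arch))
			return bsearch(name, t->ins, t->nmemb,
				       sizeof(struct ins), ins_key_cmp);
	}
	return NULL;	/* unknown arch */
}
```

Because every table is compiled in, the lookup depends only on the arch
string stored with the recorded data, not on the machine running the
annotation.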

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/builtin-top.c  |   2 +-
 tools/perf/ui/browsers/annotate.c |   5 +-
 tools/perf/ui/gtk/annotate.c  |   6 +-
 tools/perf/util/annotate.c| 116 +-
 tools/perf/util/annotate.h|   3 +-
 tools/perf/util/evsel.c   |   7 +++
 tools/perf/util/evsel.h   |   2 +
 7 files changed, 108 insertions(+), 33 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 07fc792..d4fd947 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -128,7 +128,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}
 
-   err = symbol__annotate(sym, map, 0);
+   err = symbol__annotate(sym, map, 0, NULL);
if (err == 0) {
 out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 0e106bb..b65a979 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1031,6 +1031,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
int ret = -1;
int nr_pcnt = 1;
size_t sizeof_bdl = sizeof(struct browser_disasm_line);
+   char *target_arch = NULL;
 
if (sym == NULL)
return -1;
@@ -1052,7 +1053,9 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
  (nr_pcnt - 1);
}
 
-   if (symbol__annotate(sym, map, sizeof_bdl) < 0) {
+   target_arch = perf_evsel__env_arch(evsel);
+
+   if (symbol__annotate(sym, map, sizeof_bdl, target_arch) < 0) {
ui__error("%s", ui_helpline__last_msg);
goto out_free_offsets;
}
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 9c7ff8d..e468c1a 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -4,7 +4,6 @@
 #include "util/evsel.h"
 #include "ui/helpline.h"
 
-
 enum {
ANN_COL__PERCENT,
ANN_COL__OFFSET,
@@ -162,11 +161,14 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
GtkWidget *notebook;
GtkWidget *scrolled_window;
GtkWidget *tab_label;
+   char *target_arch = NULL;
 
if (map->dso->annotate_warned)
return -1;
 
-   if (symbol__annotate(sym, map, 0) < 0) {
+   target_arch = perf_evsel__env_arch(evsel);
+
+   if (symbol__annotate(sym, map, 0, target_arch) < 0) {
ui__error("%s", ui_helpline__current);
return -1;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index b2c7ae4..e0dc7b2 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include "../arch/common.h"
 
 const char *disassembler_style;
 const char *objdump_path;
@@ -28,6 +30,13 @@ static regex_t file_lineno;
 static struct ins *ins__find(const char *name);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
 
+static struct arch_instructions {
+   const char *arch;
+   intnmemb;
+   struct ins *instructions;
+   struct ins *(*ins__find)(const char *);
+} arch_ins;
+
 static void ins__delete(struct ins_operands *ops)
 {
if (ops == NULL)
@@ -183,7 +192,7 @@ static int lock__parse(struct ins_operands *ops)
if (disasm_line__parse(ops->raw, &name, &ops->locked.ops->raw) < 0)
goto out_free_ops;
 
-   ops->locked.ins = ins__find(name);
+   ops->locked.ins = arch_ins.ins__find(name);
free(name);
 
if (ops->locked.ins == NULL)
@@ -354,26 +363,12 @@ static struct ins_ops nop_ops = {
.scnprintf = nop__scnprintf,
 };
 
-static struct ins instructions[] = {
+static struct ins instructions_x86[] = {
{ .name = &qu

[RFC 1/6] perf: Remove unused hist_entry__annotate function

2016-06-24 Thread Ravi Bangoria
hist_entry__annotate looks like part of the API, but I can't find any
caller of this function, so remove it.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/annotate.c | 5 -
 tools/perf/util/annotate.h | 2 --
 2 files changed, 7 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7e5a1e8..b2c7ae4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1676,11 +1676,6 @@ int symbol__tty_annotate(struct symbol *sym, struct map *map,
return 0;
 }
 
-int hist_entry__annotate(struct hist_entry *he, size_t privsize)
-{
-   return symbol__annotate(he->ms.sym, he->ms.map, privsize);
-}
-
 bool ui__has_annotation(void)
 {
return use_browser == 1 && perf_hpp_list.sym;
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 9241f8c..82f3781 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -156,8 +156,6 @@ void symbol__annotate_zero_histograms(struct symbol *sym);
 
 int symbol__annotate(struct symbol *sym, struct map *map, size_t privsize);
 
-int hist_entry__annotate(struct hist_entry *he, size_t privsize);
-
 int symbol__annotate_init(struct map *map, struct symbol *sym);
 int symbol__annotate_printf(struct symbol *sym, struct map *map,
struct perf_evsel *evsel, bool full_paths,
-- 
2.5.5



[RFC 2/6] perf annotate: Define macro for arch names

2016-06-24 Thread Ravi Bangoria
Define a macro for each arch name and use these macros instead of arch
name string literals.
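
A reduced sketch of the normalisation with macros, covering four arches
only. The ARCH_* macros and normalize_arch_sketch are hypothetical names
chosen for this illustration. Unlike the function in the diff below, the
sketch also checks the string length before indexing into it:

```c
#include <string.h>

#define ARCH_X86     "x86"
#define ARCH_ARM64   "arm64"
#define ARCH_ARM     "arm"
#define ARCH_POWERPC "powerpc"

/* Map a raw arch string (as reported by uname) to a canonical name. */
static const char *normalize_arch_sketch(const char *arch)
{
	if (!strcmp(arch, "x86_64"))
		return ARCH_X86;
	/* i386, i486, i586, i686 */
	if (strlen(arch) == 4 && arch[0] == 'i' &&
	    arch[2] == '8' && arch[3] == '6')
		return ARCH_X86;
	/* arm64 must be tested before the generic "arm" prefix. */
	if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
		return ARCH_ARM64;
	if (!strncmp(arch, "arm", 3))
		return ARCH_ARM;
	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
		return ARCH_POWERPC;

	return arch;	/* unknown: hand back the input unchanged */
}
```

A misspelled macro name fails at compile time, whereas a misspelled
string literal in a strcmp() only shows up as a lookup that never
matches.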

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/common.c   | 36 ++--
 tools/perf/arch/common.h   | 10 ++
 tools/perf/util/unwind-libunwind.c |  5 +++--
 3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index fa090a9..7da6ac7 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -105,25 +105,25 @@ static int lookup_triplets(const char *const *triplets, const char *name)
 const char *normalize_arch(char *arch)
 {
if (!strcmp(arch, "x86_64"))
-   return "x86";
+   return PERF_ARCH_X86;
if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-   return "x86";
+   return PERF_ARCH_X86;
if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-   return "sparc";
+   return PERF_ARCH_SPARC;
if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
-   return "arm64";
+   return PERF_ARCH_ARM64;
if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-   return "arm";
+   return PERF_ARCH_ARM;
if (!strncmp(arch, "s390", 4))
-   return "s390";
+   return PERF_ARCH_S390;
if (!strncmp(arch, "parisc", 6))
-   return "parisc";
+   return PERF_ARCH_PARISC;
if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-   return "powerpc";
+   return PERF_ARCH_POWERPC;
if (!strncmp(arch, "mips", 4))
-   return "mips";
+   return PERF_ARCH_MIPS;
if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-   return "sh";
+   return PERF_ARCH_SH;
 
return arch;
 }
@@ -163,21 +163,21 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
zfree(&buf);
}
 
-   if (!strcmp(arch, "arm"))
+   if (!strcmp(arch, PERF_ARCH_ARM))
path_list = arm_triplets;
-   else if (!strcmp(arch, "arm64"))
+   else if (!strcmp(arch, PERF_ARCH_ARM64))
path_list = arm64_triplets;
-   else if (!strcmp(arch, "powerpc"))
+   else if (!strcmp(arch, PERF_ARCH_POWERPC))
path_list = powerpc_triplets;
-   else if (!strcmp(arch, "sh"))
+   else if (!strcmp(arch, PERF_ARCH_SH))
path_list = sh_triplets;
-   else if (!strcmp(arch, "s390"))
+   else if (!strcmp(arch, PERF_ARCH_S390))
path_list = s390_triplets;
-   else if (!strcmp(arch, "sparc"))
+   else if (!strcmp(arch, PERF_ARCH_SPARC))
path_list = sparc_triplets;
-   else if (!strcmp(arch, "x86"))
+   else if (!strcmp(arch, PERF_ARCH_X86))
path_list = x86_triplets;
-   else if (!strcmp(arch, "mips"))
+   else if (!strcmp(arch, PERF_ARCH_MIPS))
path_list = mips_triplets;
else {
ui__error("binutils for %s not supported.\n", arch);
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 6b01c73..bbb6960 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -5,6 +5,16 @@
 
 extern const char *objdump_path;
 
+#define PERF_ARCH_X86  "x86"
+#define PERF_ARCH_SPARC"sparc"
+#define PERF_ARCH_ARM64"arm64"
+#define PERF_ARCH_ARM  "arm"
+#define PERF_ARCH_S390 "s390"
+#define PERF_ARCH_PARISC   "parisc"
+#define PERF_ARCH_POWERPC  "powerpc"
+#define PERF_ARCH_MIPS "mips"
+#define PERF_ARCH_SH   "sh"
+
 int perf_env__lookup_objdump(struct perf_env *env);
 const char *normalize_arch(char *arch);
 
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 8547119..ccebe5e 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -36,10 +36,11 @@ int unwind__prepare_access(struct thread *thread, struct map *map)
 
arch = normalize_arch(thread->mg->machine->env->arch);
 
-   if (!strcmp(arch, "x86")) {
+   if (!strcmp(arch, PERF_ARCH_X86)) {
if (dso_type != DSO__TYPE_64BIT)
ops = x86_32_unwind_libunwind_ops;
-   } else if (!strcmp(arch, "arm64") || !strcmp(arch, "arm")) {
+   } else if (!strcmp(arch, PERF_ARCH_ARM64) ||
+  !strcmp(arch, PERF_ARCH_ARM)) {
if (dso_type == DSO__TYPE_64BIT)
ops = arm64_unwind_libunwind_ops;
}
-- 
2.5.5



[RFC 0/6] perf annotate: Enable cross arch annotate

2016-06-24 Thread Ravi Bangoria
Perf currently supports code navigation (branches and calls) in annotate
only when run on the same architecture where perf.data was recorded.
Cross arch annotate is not supported.

This patchset enables cross arch annotate. For now it uses the x86 and
arm instructions that are already available and adds support for powerpc
as well. Adding support for other archs should be easy.

I've created this patchset on top of acme/perf/core and tested it with
x86 and powerpc only.

Example:

  Record on powerpc:
  $ ./perf record -a

  Report -> Annotate on x86:
  $ ./perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Naveen N. Rao (2):
  perf annotate: generalize handling of ret instructions
  perf annotate: add powerpc support

Ravi Bangoria (4):
  perf: Remove unused hist_entry__annotate function
  perf annotate: Define macro for arch names
  perf annotate: Enable cross arch annotate
  perf: add more triplets

 tools/perf/arch/common.c   |  53 ++
 tools/perf/arch/common.h   |  10 ++
 tools/perf/builtin-top.c   |   2 +-
 tools/perf/ui/browsers/annotate.c  |  25 ++---
 tools/perf/ui/gtk/annotate.c   |   6 +-
 tools/perf/util/annotate.c | 195 ++---
 tools/perf/util/annotate.h |   6 +-
 tools/perf/util/evsel.c|   7 ++
 tools/perf/util/evsel.h|   2 +
 tools/perf/util/unwind-libunwind.c |   5 +-
 10 files changed, 240 insertions(+), 71 deletions(-)

--
2.5.5



[PATCH 2/4] perf annotate: Enable cross arch annotate

2016-06-28 Thread Ravi Bangoria
Change the current data structures and functions to enable cross arch
annotate.

The current implementation has no logic for recording on one arch and
annotating on another. Such remote annotation is only partially possible
today, for x86 (and maybe arm as well). To make remote annotation work
properly, the instruction tables of all architectures need to be included
in the perf binary, and while annotating, perf needs to look up the
instruction table of the arch where perf.data was recorded.

For arm, a few instructions were defined under #if __arm__, and I've
used those as the table for arm. But I'm not sure whether the
instructions defined outside of that block also include arm instructions.
Apart from that, 'call__parse()' and 'move__parse()' contain an
#ifdef __arm__ directive, which I've changed to
if (!strcmp(norm_arch, "arm")). I've not tested this either.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/builtin-top.c  |   2 +-
 tools/perf/ui/browsers/annotate.c |   3 +-
 tools/perf/ui/gtk/annotate.c  |   2 +-
 tools/perf/util/annotate.c| 136 --
 tools/perf/util/annotate.h|   5 +-
 5 files changed, 95 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 07fc792..d4fd947 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -128,7 +128,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}
 
-   err = symbol__annotate(sym, map, 0);
+   err = symbol__annotate(sym, map, 0, NULL);
if (err == 0) {
 out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 29dc6d2..3a652a6f 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1050,7 +1050,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
  (nr_pcnt - 1);
}
 
-   if (symbol__annotate(sym, map, sizeof_bdl) < 0) {
+   if (symbol__annotate(sym, map, sizeof_bdl,
+perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__last_msg);
goto out_free_offsets;
}
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 9c7ff8d..d7150b3 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -166,7 +166,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;
 
-   if (symbol__annotate(sym, map, 0) < 0) {
+   if (symbol__annotate(sym, map, 0, perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__current);
return -1;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index c385fec..36a5825 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -20,12 +20,14 @@
 #include 
 #include 
 #include 
+#include 
+#include "../arch/common.h"
 
 const char *disassembler_style;
 const char *objdump_path;
 static regex_t  file_lineno;
 
-static struct ins *ins__find(const char *name);
+static struct ins *ins__find(const char *name, const char *norm_arch);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
 
 static void ins__delete(struct ins_operands *ops)
@@ -53,7 +55,8 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
 }
 
-static int call__parse(struct ins_operands *ops)
+static int call__parse(struct ins_operands *ops,
+  __maybe_unused const char *norm_arch)
 {
char *endptr, *tok, *name;
 
@@ -65,10 +68,8 @@ static int call__parse(struct ins_operands *ops)
 
name++;
 
-#ifdef __arm__
-   if (strchr(name, '+'))
+   if (!strcmp(norm_arch, "arm") && strchr(name, '+'))
return -1;
-#endif
 
tok = strchr(name, '>');
if (tok == NULL)
@@ -117,7 +118,8 @@ bool ins__is_call(const struct ins *ins)
return ins->ops == &call_ops;
 }
 
-static int jump__parse(struct ins_operands *ops)
+static int jump__parse(struct ins_operands *ops,
+  __maybe_unused const char *norm_arch)
 {
const char *s = strchr(ops->raw, '+');
 
@@ -172,7 +174,7 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
return 0;
 }
 
-static int lock__parse(struct ins_operands *ops)
+static int lock__parse(struct ins_operands *ops, const char *norm_arch)
 {
char *name;
 
@@ -183,7 +185,7 @@ static int lock__parse(struct ins_operands *ops)
if (disasm_line__parse(ops->raw, &name, &ops->locked.ops->raw) < 0)
goto out_free_ops;
 
-   ops->locked.ins = ins__find(name);
+   ops-&

[PATCH 1/4] perf: Utility function to fetch arch from evsel

2016-06-28 Thread Ravi Bangoria
Add a utility function to fetch 'arch' from 'evsel'
(evsel->evlist->env->arch).

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/evsel.c | 7 +++
 tools/perf/util/evsel.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1d8f2bb..0fea724 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2422,3 +2422,10 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 err, strerror_r(err, sbuf, sizeof(sbuf)),
 perf_evsel__name(evsel));
 }
+
+char *perf_evsel__env_arch(struct perf_evsel *evsel)
+{
+   if (evsel && evsel->evlist && evsel->evlist->env)
+   return evsel->evlist->env->arch;
+   return NULL;
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 828ddd1..86fed7a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -435,4 +435,6 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);
 int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 attr__fprintf_f attr__fprintf, void *priv);
 
+char *perf_evsel__env_arch(struct perf_evsel *evsel);
+
 #endif /* __PERF_EVSEL_H */
-- 
2.5.5
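
[Editor's sketch] The accessor above returns NULL whenever any link in
the evsel -> evlist -> env chain is missing, so callers need a fallback.
The following is a minimal, self-contained illustration of that pattern.
The *_sketch structs are hypothetical stand-ins for the perf structures,
and falling back to the host arch via uname(2) is an assumption about
caller policy, not something this patch states.

```c
#include <stddef.h>
#include <sys/utsname.h>

/* Hypothetical, pared-down stand-ins for the perf structures. */
struct env_sketch    { const char *arch; };
struct evlist_sketch { struct env_sketch *env; };
struct evsel_sketch  { struct evlist_sketch *evlist; };

/* Same guard chain as perf_evsel__env_arch(): NULL if any link is missing. */
const char *evsel_sketch__env_arch(const struct evsel_sketch *evsel)
{
	if (evsel && evsel->evlist && evsel->evlist->env)
		return evsel->evlist->env->arch;
	return NULL;
}

/*
 * Assumed caller policy: prefer the arch recorded in perf.data, otherwise
 * fall back to the host arch reported by uname(2). 'uts' is caller-owned
 * storage so the returned pointer stays valid after the call.
 */
const char *arch_or_host(const struct evsel_sketch *evsel, struct utsname *uts)
{
	const char *arch = evsel_sketch__env_arch(evsel);

	if (arch)
		return arch;
	if (uname(uts) < 0)
		return NULL;
	return uts->machine;
}
```

Passing caller-owned storage for the uname result avoids returning a
pointer into a local buffer; whether perf itself does this is not shown
in the thread.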



[PATCH 4/4] perf annotate: Define macro for arch names

2016-06-28 Thread Ravi Bangoria
Define a macro for each arch name and use these macros instead of
spelling the arch names as string literals.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/common.c   | 36 ++--
 tools/perf/arch/common.h   | 11 +++
 tools/perf/util/annotate.c | 10 +-
 tools/perf/util/unwind-libunwind.c |  4 ++--
 4 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index ee69668..feb2113 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -122,25 +122,25 @@ static int lookup_triplets(const char *const *triplets, const char *name)
 const char *normalize_arch(char *arch)
 {
if (!strcmp(arch, "x86_64"))
-   return "x86";
+   return NORM_X86;
if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-   return "x86";
+   return NORM_X86;
if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-   return "sparc";
+   return NORM_SPARC;
if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
-   return "arm64";
+   return NORM_ARM64;
if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-   return "arm";
+   return NORM_ARM;
if (!strncmp(arch, "s390", 4))
-   return "s390";
+   return NORM_S390;
if (!strncmp(arch, "parisc", 6))
-   return "parisc";
+   return NORM_PARISC;
if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-   return "powerpc";
+   return NORM_POWERPC;
if (!strncmp(arch, "mips", 4))
-   return "mips";
+   return NORM_MIPS;
if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-   return "sh";
+   return NORM_SH;
 
return arch;
 }
@@ -180,21 +180,21 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
zfree();
}
 
-   if (!strcmp(arch, "arm"))
+   if (!strcmp(arch, NORM_ARM))
path_list = arm_triplets;
-   else if (!strcmp(arch, "arm64"))
+   else if (!strcmp(arch, NORM_ARM64))
path_list = arm64_triplets;
-   else if (!strcmp(arch, "powerpc"))
+   else if (!strcmp(arch, NORM_POWERPC))
path_list = powerpc_triplets;
-   else if (!strcmp(arch, "sh"))
+   else if (!strcmp(arch, NORM_SH))
path_list = sh_triplets;
-   else if (!strcmp(arch, "s390"))
+   else if (!strcmp(arch, NORM_S390))
path_list = s390_triplets;
-   else if (!strcmp(arch, "sparc"))
+   else if (!strcmp(arch, NORM_SPARC))
path_list = sparc_triplets;
-   else if (!strcmp(arch, "x86"))
+   else if (!strcmp(arch, NORM_X86))
path_list = x86_triplets;
-   else if (!strcmp(arch, "mips"))
+   else if (!strcmp(arch, NORM_MIPS))
path_list = mips_triplets;
else {
ui__error("binutils for %s not supported.\n", arch);
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 6b01c73..14ca8ca 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -5,6 +5,17 @@
 
 extern const char *objdump_path;
 
+/* Macro for normalized arch names */
+#define NORM_X86       "x86"
+#define NORM_SPARC     "sparc"
+#define NORM_ARM64     "arm64"
+#define NORM_ARM       "arm"
+#define NORM_S390      "s390"
+#define NORM_PARISC    "parisc"
+#define NORM_POWERPC   "powerpc"
+#define NORM_MIPS      "mips"
+#define NORM_SH        "sh"
+
 int perf_env__lookup_objdump(struct perf_env *env);
 const char *normalize_arch(char *arch);
 
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 96c6610..8146a25 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -68,7 +68,7 @@ static int call__parse(struct ins_operands *ops,
 
name++;
 
-   if (!strcmp(norm_arch, "arm") && strchr(name, '+'))
+   if (!strcmp(norm_arch, NORM_ARM) && strchr(name, '+'))
return -1;
 
tok = strchr(name, '>');
@@ -255,7 +255,7 @@ static int mov__parse(struct ins_operands *ops,
 
target = ++s;
 
-   if (!strcmp(norm_arch, "arm"))
+   if (!strcmp(norm_arch, NORM_ARM))
comment = strchr(s, ';');
else
comment = strchr(

Re: [RFC 3/6] perf annotate: Enable cross arch annotate

2016-06-28 Thread Ravi Bangoria



On Monday 27 June 2016 10:46 PM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jun 24, 2016 at 05:23:57PM +0530, Ravi Bangoria escreveu:

Change current data structures and function to enable cross arch annotate
and add support for x86 and arm instructions.

Current implementation does not contain logic of recording on one arch
and annotating on other. This remote annotate is partially possible with
current implementation for x86 (or may be arm as well) only. But, to make
remote annotation work properly, all architecture instruction tables need
to be included in the perf binary. And while annotating, look for
instruction table where perf.data was recorded.


...

  
+static struct arch_instructions {

+   const char *arch;
+   int nmemb;
+   struct ins *instructions;
+   struct ins *(*ins__find)(const char *);

Why do we need arch specific find functions? Why not pass the
instructions pointer to it, just like you did with ins__sort().

Probably it is not needed to be global, you just pick the right
instructions table + its ARRAY_SIZE and pass it around, again, like you
did in ins__sort().

- Arnaldo


Thanks Arnaldo for suggestion.

To determine the arch in ins__find, I need to pass 'arch' all the way
down to ins__find, which requires changing the definitions of many
functions. So I thought of using a global var.

Anyway, I've prepared a patch as you suggested and sent it as a [PATCH].
Please review it.

-Ravi
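
[Editor's sketch] Arnaldo's suggestion above amounts to one generic
lookup that receives the instruction table and its size, with the table
chosen from the normalized arch string. The following is a minimal
illustration under stated assumptions: the struct, the table contents
and all *_sketch names are hypothetical, and the tables are tiny
pre-sorted subsets rather than perf's real instruction lists.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced instruction descriptor. */
struct ins_sketch {
	const char *name;
	int is_call;
};

/* Illustrative subsets, kept sorted by name so bsearch() is valid. */
static struct ins_sketch instructions_x86_sketch[] = {
	{ "call", 1 }, { "jmp", 0 }, { "jne", 0 },
};

static struct ins_sketch instructions_arm_sketch[] = {
	{ "b", 0 }, { "bl", 1 }, { "bne", 0 },
};

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* bsearch() comparator: the key is the mnemonic, the element a table entry. */
static int ins_sketch__key_cmp(const void *name, const void *insp)
{
	const struct ins_sketch *ins = insp;

	return strcmp(name, ins->name);
}

/* One generic finder; the caller passes the table and its element count. */
struct ins_sketch *ins_sketch__find_in(struct ins_sketch *table, size_t nmemb,
				       const char *name)
{
	return bsearch(name, table, nmemb, sizeof(*table), ins_sketch__key_cmp);
}

/* Pick the table from the normalized arch name, then reuse the finder. */
struct ins_sketch *ins_sketch__find(const char *name, const char *norm_arch)
{
	if (!strcmp(norm_arch, "x86"))
		return ins_sketch__find_in(instructions_x86_sketch,
				SKETCH_ARRAY_SIZE(instructions_x86_sketch), name);
	if (!strcmp(norm_arch, "arm"))
		return ins_sketch__find_in(instructions_arm_sketch,
				SKETCH_ARRAY_SIZE(instructions_arm_sketch), name);
	return NULL; /* arch without a table in this sketch */
}
```

With this shape no per-arch find callback or global is needed; the cost
is that the arch string has to be threaded through to the lookup, which
is the trade-off discussed in the reply above.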



[PATCH 3/4] perf annotate: add powerpc support

2016-06-28 Thread Ravi Bangoria
From: "Naveen N. Rao" <naveen.n@linux.vnet.ibm.com>

Powerpc has a long list of branch instructions, and hardcoding them in a
table appears to be error-prone. So, add a new function to find the
instruction instead of creating a table.

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/annotate.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..96c6610 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -476,6 +476,68 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
 }
 
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   return NULL;
+
+   if (name[0] == 'b') {
+   /* branch instructions */
+   ins->ops = &jump_ops;
+
+   /*
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (!strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;
+
+   i = strlen(name) - 1;
+   if (i < 0)
+   return NULL;
+
+   /* ignore optional hints at the end of the instructions */
+   if (name[i] == '+' || name[i] == '-')
+   i--;
+
+   if (name[i] == 'l' || (name[i] == 'a' && name[i-1] == 'l')) {
+   /*
+* if the instruction ends up with 'l' or 'la', then
+* those are considered 'calls' since they update LR.
+* ... except for 'bnl' which is branch if not less than
+* and the absolute form of the same.
+*/
+   if (strcmp(name, "bnl") && strcmp(name, "bnl+") &&
+   strcmp(name, "bnl-") && strcmp(name, "bnla") &&
+   strcmp(name, "bnla+") && strcmp(name, "bnla-"))
+   ins->ops = &call_ops;
+   }
+   if (name[i] == 'r' && name[i-1] == 'l')
+   /*
+* instructions ending with 'lr' are considered to be
+* return instructions
+*/
+   ins->ops = &ret_ops;
+
+   return ins;
+   }
+   return NULL;
+}
+
 static void ins__sort(struct ins *instructions, int nmemb)
 {
qsort(instructions, nmemb, sizeof(struct ins), ins__cmp);
@@ -511,6 +573,8 @@ static struct ins *ins__find(const char *name, const char *norm_arch)
} else if (!strcmp(norm_arch, "arm")) {
instructions = instructions_arm;
nmemb = ARRAY_SIZE(instructions_arm);
+   } else if (!strcmp(norm_arch, "powerpc")) {
+   return ins__find_powerpc(name);
} else {
pr_err("perf annotate not supported by %s arch\n", norm_arch);
return NULL;
-- 
2.5.5
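
[Editor's sketch] The mnemonic rules in ins__find_powerpc() can be read
as a pure classification step that is independent of any allocation. The
following restates those rules as a function returning an enum. The enum
and function names are hypothetical, and the extra index guards are a
defensive addition that is not in the patch.

```c
#include <string.h>

/* Hypothetical result type for this sketch. */
enum ppc_branch_kind { PPC_NOT_BRANCH, PPC_JUMP, PPC_CALL, PPC_RET };

enum ppc_branch_kind ppc_classify_branch(const char *name)
{
	enum ppc_branch_kind kind = PPC_JUMP;
	size_t len = strlen(name);
	size_t i;

	if (len == 0 || name[0] != 'b')
		return PPC_NOT_BRANCH;

	/*
	 * A few mnemonics start with 'b' but are not branches, and targets
	 * of 'ctr'/'tar' forms cannot be determined statically.
	 */
	if (!strncmp(name, "bcd", 3) || !strncmp(name, "brinc", 5) ||
	    !strncmp(name, "bper", 4) || strstr(name, "ctr") ||
	    strstr(name, "tar"))
		return PPC_NOT_BRANCH;

	/* Skip an optional '+' or '-' prediction hint at the end. */
	i = len - 1;
	if (i > 0 && (name[i] == '+' || name[i] == '-'))
		i--;

	/* A trailing 'l' or 'la' updates LR, so treat it as a call ... */
	if (name[i] == 'l' || (i >= 1 && name[i] == 'a' && name[i - 1] == 'l')) {
		/* ... except bnl/bnla, which mean "branch if not less than". */
		if (strcmp(name, "bnl") && strcmp(name, "bnl+") &&
		    strcmp(name, "bnl-") && strcmp(name, "bnla") &&
		    strcmp(name, "bnla+") && strcmp(name, "bnla-"))
			kind = PPC_CALL;
	}

	/* A trailing 'lr' branches to the link register: a return. */
	if (i >= 1 && name[i] == 'r' && name[i - 1] == 'l')
		kind = PPC_RET;

	return kind;
}
```

For example, "bl" and "bla" classify as calls, "blr" and "beqlr+" as
returns, "bnl" and "beq" as plain jumps, and "bctr" or "bcdadd" are
rejected.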



[PATCH 0/4] perf annotate: Enable cross arch annotate

2016-06-28 Thread Ravi Bangoria
Perf currently supports code navigation (branches and calls) in annotate
only when it is run on the same architecture where perf.data was
recorded. Cross arch annotate is not supported.

This patchset enables cross arch annotate. Currently I've used the x86
and arm instructions that are already available, and I'm adding support
for powerpc as well. Adding support for other archs will be easy.

I've created this patchset on top of acme/perf/core and tested it on
x86 and powerpc only.

Example:

  Record on powerpc:
  $ ./perf record -a

  Report -> Annotate on x86:
  $ ./perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Changes in [PATCH] vs [RFC]:
  - Removed the global var 'arch__ins'; arch info is now passed down to
    ins__find

Naveen N. Rao (1):
  perf annotate: add powerpc support

Ravi Bangoria (3):
  perf: Utility function to fetch arch
  perf annotate: Enable cross arch annotate
  perf annotate: Define macro for arch names

 tools/perf/arch/common.c   |  36 +++
 tools/perf/arch/common.h   |  11 +++
 tools/perf/builtin-top.c   |   2 +-
 tools/perf/ui/browsers/annotate.c  |   3 +-
 tools/perf/ui/gtk/annotate.c   |   2 +-
 tools/perf/util/annotate.c | 198 -
 tools/perf/util/annotate.h |   5 +-
 tools/perf/util/evsel.c|   7 ++
 tools/perf/util/evsel.h|   2 +
 tools/perf/util/unwind-libunwind.c |   4 +-
 10 files changed, 198 insertions(+), 72 deletions(-)

--
2.5.5



Re: [PATCH 3/4] perf annotate: add powerpc support

2016-06-29 Thread Ravi Bangoria

Thanks David.

On Tuesday 28 June 2016 09:37 PM, David Laight wrote:

From: Ravi Bangoria

Sent: 28 June 2016 12:37

Powerpc has long list of branch instructions and hardcoding them in table
appears to be error-prone. So, add new function to find instruction
instead of creating table.

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
  tools/perf/util/annotate.c | 64 ++
  1 file changed, 64 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..96c6610 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -476,6 +476,68 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
  }

+static struct ins *ins__find_powerpc(const char *name)

It would be better if the function name included 'branch'.


+{
+   int i;
+   struct ins *ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   return NULL;

You leak 'ins' here.


+
+   if (name[0] == 'b') {
+   /* branch instructions */
+   ins->ops = &jump_ops;
+
+   /*
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (!strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;

More importantly you leak 'ins' and 'ins->name' here.
And on other paths below.


Yes. Fair points.

I can create a linked list that tracks the allocated instructions and
look it up every time before allocating memory. But then I need to free
that memory at the end, and it's becoming complicated.

I can go back to the usual approach of creating a table for powerpc.
That is the simplest. The only problem is that powerpc has around 400
branch instructions (including the call and ret variants), and listing
them all is a bit error-prone.

Suggestions?

- Ravi


...

David
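
[Editor's sketch] One way to address the leaks David points out is to
run every rejection check before allocating and to release partial
allocations on each failure path. The following illustrates only that
ordering. The names are hypothetical and the mnemonic filter is reduced
to the rejection checks from the patch. It does not settle who frees a
successfully returned object, which is the harder ownership question
raised above; here the caller is assumed to call ppc_ins__delete().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced instruction object. */
struct ppc_ins {
	char *name;
};

/* Safe on NULL, so cleanup paths can call it unconditionally. */
void ppc_ins__delete(struct ppc_ins *ins)
{
	if (!ins)
		return;
	free(ins->name);
	free(ins);
}

struct ppc_ins *ppc_ins__new_branch(const char *name)
{
	struct ppc_ins *ins;
	size_t len = strlen(name);

	/* Reject non-branches first: nothing is allocated yet, so no leak. */
	if (name[0] != 'b' ||
	    !strncmp(name, "bcd", 3) || !strncmp(name, "brinc", 5) ||
	    !strncmp(name, "bper", 4) || strstr(name, "ctr") ||
	    strstr(name, "tar"))
		return NULL;

	ins = calloc(1, sizeof(*ins));
	if (!ins)
		return NULL;

	ins->name = malloc(len + 1);
	if (!ins->name) {
		free(ins);	/* release the partial allocation */
		return NULL;
	}
	memcpy(ins->name, name, len + 1);

	return ins;
}
```

The copy uses malloc() plus memcpy() only to keep the sketch within ISO
C; inside perf, strdup() as used in the patch would serve the same
purpose.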





Re: [PATCH v2 2/3] perf kvm: enable record|report feature on powerpc

2016-02-09 Thread Ravi Bangoria

Hi acme,

On Tuesday 02 February 2016 02:36 PM, Ravi Bangoria wrote:

Hi acme,

On Tuesday 02 February 2016 02:36 AM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jan 22, 2016 at 11:28:11AM +0530, Ravi Bangoria escreveu:

+return event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+}

This hunk and the next should be on the previous patch, that is not even
compiling...

You have to compile patch by patch, we can't just test at the end of a
patchkit like this, this destroys bisection ;-\


I wasn't aware of that. I'll compile each patch separately from now
on.


Also you first need to put in place a way to override how to obtain the
cpumode, then you should use it.

Also this mode doesn't look feasible at all, think about processing
perf.data files generated in !powerpc systems being analysed in a
powerpc system. This has to be dependent on the architecture of the
machine where the perf.data file was recorded, not on the architecture
of the machine the binary was built for.


Valid point.

I'll rethink the approach in this case.



I've analyzed the approach. Here are my observations:

1. With the current approach, record on !powerpc and report on powerpc
will work, because we depend solely on the tracepoint: we don't change
the ip and cpumode of a sample unless it comes from
kvm_hv:kvm_guest_exit.

2. However, record on powerpc and report on !powerpc won't work with the
current approach. To enable that, we have two options:

Option A. Change the ip and cpumode of the sample at record time.
This adds overhead while recording and may have bad effects such as
lost data.

Option B. Extend the current approach (change ip and cpumode at
report time only).
I'll need to move most of the code from arch/powerpc/util/kvm.c into
common code that is built on all architectures, and use that code to
decide at run time whether to change the ip and cpumode of a sample.
So these functions need to be present in the binary no matter which
platform it's compiled on.

I'd like your suggestions here: how can we best achieve that?

Regards,
Ravi
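
[Editor's sketch] Option B boils down to keying the decision on the arch
string stored in perf.data rather than on the arch perf was built for.
The following is a minimal illustration under stated assumptions. All
names are hypothetical. The SK_* cpumode values mirror
PERF_RECORD_MISC_CPUMODE_MASK, PERF_RECORD_MISC_GUEST_KERNEL and
PERF_RECORD_MISC_GUEST_USER from linux/perf_event.h. The bit numbering
follows the HV_BIT/PR_BIT/PPC_MAX constants of the later [RFC 3/4]
patch. The tracepoint and trap checks are collapsed into one boolean
parameter, and the perf_guest_only() condition is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Mirror the values in linux/perf_event.h. */
#define SK_CPUMODE_MASK	7
#define SK_GUEST_KERNEL	4
#define SK_GUEST_USER	5

/* IBM bit numbering: bit 0 is the most significant bit of the MSR. */
#define SK_HV_BIT	3
#define SK_PR_BIT	49
#define SK_PPC_MAX	63

/* Decide from the arch recorded in perf.data, not from the build arch. */
bool recorded_on_ppc(const char *arch)
{
	return arch && (!strcmp(arch, "ppc64") || !strcmp(arch, "ppc64le"));
}

/*
 * 'is_hv_dec_exit' stands for "this sample is a kvm_hv:kvm_guest_exit
 * tracepoint hit whose trap is HV_DECREMENTER".
 */
uint8_t sketch_cpumode(uint16_t misc, const char *recorded_arch,
		       bool is_hv_dec_exit, uint64_t msr)
{
	uint8_t cpumode = misc & SK_CPUMODE_MASK;
	bool hv, pr;

	if (!recorded_on_ppc(recorded_arch) || !is_hv_dec_exit)
		return cpumode;	/* leave samples from other sources alone */

	hv = (msr >> (SK_PPC_MAX - SK_HV_BIT)) & 1;
	pr = (msr >> (SK_PPC_MAX - SK_PR_BIT)) & 1;

	return (!hv && pr) ? SK_GUEST_USER : SK_GUEST_KERNEL;
}
```

Because nothing here is guarded by #ifdef, the same code would be
present in a perf binary built on x86 and would still treat a perf.data
file recorded on powerpc correctly.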



Re: [PATCH v2 1/3] perf kvm: Introduce evsel as argument to perf_event__preprocess_sample

2016-02-02 Thread Ravi Bangoria

Hi acme,

Thanks for reviewing the patch.

On Tuesday 02 February 2016 02:23 AM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jan 22, 2016 at 11:28:10AM +0530, Ravi Bangoria escreveu:

This patch changes prototype of perf_event__preprocess_sample() with
additional argument evsel added at last.

This change is required because perf_event__preprocess_sample()
function will use evsel to determine cpumode of samples for powerpc
architecture.

Signed-off-by: Ravi Bangoria<ravi.bango...@linux.vnet.ibm.com>

Fixing these problems:



   CC   /tmp/build/perf/ui/gtk/util.o
util/event.c: In function ‘perf_event__preprocess_sample’:
util/event.c:1302:26: error: unused parameter ‘evsel’ [-Werror=unused-parameter]
struct perf_evsel *evsel)
   ^
cc1: all warnings being treated as errors
mv: cannot stat ‘/tmp/build/perf/util/.event.o.tmp’: No such file or directory
/home/acme/git/linux/tools/build/Makefile.build:77: recipe for target '/tmp/build/perf/util/event.o' failed
make[3]: *** [/tmp/build/perf/util/event.o] Error 1
make[3]: *** Waiting for unfinished jobs
   CC   /tmp/build/perf/ui/gtk/helpline.o
   CC   /tmp/build/perf/arch/common.o
/home/acme/git/linux/tools/build/Makefile.build:116: recipe for target 'util' failed
make[2]: *** [util] Error 2
make[2]: *** Waiting for unfinished jobs


Thanks for pointing this out. I wasn't aware of this.

I'll take care of it from now on.

Regards,
Ravi



Re: [PATCH v2 2/3] perf kvm: enable record|report feature on powerpc

2016-02-02 Thread Ravi Bangoria

Hi acme,

On Tuesday 02 February 2016 02:36 AM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jan 22, 2016 at 11:28:11AM +0530, Ravi Bangoria escreveu:

+   return event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+}

This hunk and the next should be on the previous patch, that is not even
compiling...

You have to compile patch by patch, we can't just test at the end of a
patchkit like this, this destroys bisection ;-\


I wasn't aware of that. I'll compile each patch separately from now
on.


Also you first need to put in place a way to override how to obtain the
cpumode, then you should use it.

Also this mode doesn't look feasible at all, think about processing
perf.data files generated in !powerpc systems being analysed in a
powerpc system. This has to be dependent on the architecture of the
machine where the perf.data file was recorded, not on the architecture
of the machine the binary was built for.


Valid point.

I'll rethink the approach in this case.


It is only when you do live analysis, like with 'perf trace' and 'perf
top' that its guaranteed to be all on the same machine.

IIRC in one of the patches in this series you introduce and use a
library function on the same patch, please break it into two patches as
well, lemme see what is the name...

Yeah, it is also in this patch:

perf_evlist__arch_add_default(struct perf_evlist *evlist)

Please add this in a separate patch, stating in the changeset comment
why it is needed and how architectures can override it.


Will do that. Thanks for reviewing.

Regards,
Ravi



[RFC 0/4] perf kvm: Guest Symbol Resolution for powerpc

2016-02-24 Thread Ravi Bangoria
The design of the [patch v2] Guest Symbol Resolution series focused on
enabling perf kvm {record|report} on powerpc. Here is the link to it:
thread.gmane.org/gmane.linux.kernel/2132409

As acme pointed out, that design does not enable cross arch reporting,
i.e. record on powerpc and report on !powerpc.

This series aims to enable cross arch reporting along with
perf kvm {record|report} on powerpc. Note that the basic principle
of enabling perf kvm {record|report} on powerpc using the tracepoint
kvm_hv:kvm_guest_exit has not changed.

The major change between [patch v2] and this [RFC] is that I've moved
the 'perf kvm report' related, ppc specific functionality from
tools/perf/arch/powerpc/ to generic tools/perf/ code. This is required
because the perf binary needs the ppc specific code even when it's
compiled on !ppc, to enable cross arch reporting.

I need suggestions specifically on patch 3 (Enable 'report' on powerpc),
which puts arch specific code in a generic area. Right now I've added
the code in util/evsel.c, but please let me know if there's a better
way to do this.

This series is meant to gather suggestions on the approach, so I've
tagged it as RFC and am not following the patch version numbering.

Ravi Bangoria (4):
  perf kvm: Enable 'record' on powerpc
  perf kvm: Introduce evsel as argument to perf_event__preprocess_sample
  perf kvm: Enable 'report' on powerpc
  perf kvm: Fix output fields instead of 'trace' for perf kvm report on
powerpc

 tools/perf/arch/powerpc/util/Build |  1 +
 tools/perf/arch/powerpc/util/kvm.c | 18 +
 tools/perf/builtin-annotate.c  |  3 +-
 tools/perf/builtin-diff.c  |  3 +-
 tools/perf/builtin-mem.c   | 10 +++--
 tools/perf/builtin-report.c|  8 +++-
 tools/perf/builtin-script.c|  3 +-
 tools/perf/builtin-timechart.c |  8 ++--
 tools/perf/builtin-top.c   |  3 +-
 tools/perf/tests/hists_cumulate.c  |  2 +-
 tools/perf/tests/hists_filter.c|  2 +-
 tools/perf/tests/hists_link.c  |  4 +-
 tools/perf/tests/hists_output.c|  2 +-
 tools/perf/util/event.c|  8 ++--
 tools/perf/util/event.h|  3 +-
 tools/perf/util/evlist.c   |  9 +
 tools/perf/util/evlist.h   |  1 +
 tools/perf/util/evsel.c| 77 ++
 tools/perf/util/evsel.h|  7 
 tools/perf/util/session.c  |  7 ++--
 tools/perf/util/util.c |  5 +++
 tools/perf/util/util.h |  1 +
 22 files changed, 161 insertions(+), 24 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/kvm.c

--
2.1.4



[RFC 2/4] perf kvm: Introduce evsel as argument to perf_event__preprocess_sample

2016-02-24 Thread Ravi Bangoria
This patch changes the prototype of perf_event__preprocess_sample() by
adding evsel as an additional, final argument.

This change is required because perf_event__preprocess_sample() will
use evsel to determine the cpumode of samples on the powerpc
architecture.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/builtin-annotate.c |  3 ++-
 tools/perf/builtin-diff.c |  3 ++-
 tools/perf/builtin-mem.c  | 10 ++
 tools/perf/builtin-report.c   |  3 ++-
 tools/perf/builtin-script.c   |  3 ++-
 tools/perf/builtin-timechart.c|  8 +---
 tools/perf/builtin-top.c  |  3 ++-
 tools/perf/tests/hists_cumulate.c |  2 +-
 tools/perf/tests/hists_filter.c   |  2 +-
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/tests/hists_output.c   |  2 +-
 tools/perf/util/event.c   |  3 ++-
 tools/perf/util/event.h   |  3 ++-
 13 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index cfe3663..da330ae 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -94,7 +94,8 @@ static int process_sample_event(struct perf_tool *tool,
struct addr_location al;
int ret = 0;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
   event->header.type);
return -1;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 36ccc2b..d2a27fe 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -330,7 +330,8 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
struct hists *hists = evsel__hists(evsel);
int ret = -1;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
   event->header.type);
return -1;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index b3f8a89..a7c01fe 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -118,13 +118,15 @@ static int
 dump_raw_samples(struct perf_tool *tool,
 union perf_event *event,
 struct perf_sample *sample,
-struct machine *machine)
+struct machine *machine,
+struct perf_evsel *evsel)
 {
struct perf_mem *mem = container_of(tool, struct perf_mem, tool);
struct addr_location al;
const char *fmt;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@@ -168,10 +170,10 @@ out_put:
 static int process_sample_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
-   struct perf_evsel *evsel __maybe_unused,
+   struct perf_evsel *evsel,
struct machine *machine)
 {
-   return dump_raw_samples(tool, event, sample, machine);
+   return dump_raw_samples(tool, event, sample, machine, evsel);
 }
 
 static int report_raw_events(struct perf_mem *mem)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 760e886..31ec4ba 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -154,7 +154,8 @@ static int process_sample_event(struct perf_tool *tool,
};
int ret = 0;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
 event->header.type);
return -1;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index f4caf48..792868e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -804,7 +804,8 @@ static int process_sample_event(struct perf_tool *tool,
return 0;
}
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+   

[RFC 3/4] perf kvm: Enable 'report' on powerpc

2016-02-24 Thread Ravi Bangoria
'perf kvm record' on powerpc records the kvm_hv:kvm_guest_exit event
instead of cycles. However, to get some kind of periodicity, we can't
use all the kvm exits, only exits that are bound to happen at regular
intervals. The HV_DECREMENTER interrupt forces the threads to exit after
an interval of 10 ms.

This patch makes use of the 'kvm_guest_exit' tracepoint and checks the
exit reason of every kvm exit. If it is HV_DECREMENTER, the
instruction pointer dumped along with this tracepoint is retrieved and
mapped against the guest kallsyms.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
---
 tools/perf/util/event.c   |  7 +++--
 tools/perf/util/evsel.c   | 77 +++
 tools/perf/util/evsel.h   |  7 +
 tools/perf/util/session.c |  7 +++--
 4 files changed, 92 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index bc0a3f0..31bbc50 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -1299,15 +1299,16 @@ int perf_event__preprocess_sample(const union perf_event *event,
  struct machine *machine,
  struct addr_location *al,
  struct perf_sample *sample,
- struct perf_evsel *evsel __maybe_unused)
+ struct perf_evsel *evsel)
 {
-   u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+   u8 cpumode;
struct thread *thread = machine__findnew_thread(machine, sample->pid,
sample->tid);
-
if (thread == NULL)
return -1;
 
+   al->cpumode = cpumode = arch__get_cpumode(event, evsel, sample);
+
dump_printf(" ... thread: %s:%d\n", thread__comm_str(thread), thread->tid);
/*
 * Have we already created the kernel maps for this machine?
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0902fe4..a4d309e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1622,6 +1622,82 @@ static inline bool overflow(const void *endp, u16 max_size, const void *offset,
 #define OVERFLOW_CHECK_u64(offset) \
OVERFLOW_CHECK(offset, sizeof(u64), sizeof(u64))
 
+#define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"
+#define HV_DECREMENTER 2432
+#define HV_BIT 3
+#define PR_BIT 49
+#define PPC_MAX 63
+
+bool is_kvmppc_exit_event(struct perf_evsel *evsel)
+{
+   static unsigned int kvmppc_exit;
+
+   if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
+   return false;
+
+   if (unlikely(kvmppc_exit == 0)) {
+   if (strcmp(KVMPPC_EXIT, evsel->name))
+   return false;
+   kvmppc_exit = evsel->attr.config;
+   } else if (kvmppc_exit != evsel->attr.config) {
+   return false;
+   }
+
+   return true;
+}
+
+bool is_hv_dec_trap(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+   int trap = perf_evsel__intval(evsel, sample, "trap");
+   return trap == HV_DECREMENTER;
+}
+
+bool is_perf_data_reorded_on_ppc(struct perf_evlist *evlist)
+{
+   if (evlist && evlist->env && evlist->env->arch)
+   return !strcmp(evlist->env->arch, "ppc64") ||
+   !strcmp(evlist->env->arch, "ppc64le");
+   return false;
+}
+
+u8 arch__get_cpumode(const union perf_event *event,
+struct perf_evsel *evsel,
+struct perf_sample *sample)
+{
+   unsigned long hv, pr, msr;
+   u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+
+   if (!(is_perf_data_reorded_on_ppc(evsel->evlist) &&
+ perf_guest_only() &&
+ is_kvmppc_exit_event(evsel)))
+   goto ret;
+
+   if (sample->raw_data && is_hv_dec_trap(evsel, sample)) {
+   msr = perf_evsel__intval(evsel, sample, "msr");
+   hv = msr & ((unsigned long)1 << (PPC_MAX - HV_BIT));
+   pr = msr & ((unsigned long)1 << (PPC_MAX - PR_BIT));
+
+   if (!hv && pr)
+   cpumode = PERF_RECORD_MISC_GUEST_USER;
+   else
+   cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+   }
+
+ret:
+   return cpumode;
+}
+
+u64 arch__get_ip(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+   if (is_perf_data_reorded_on_ppc(evsel->evlist) &&
+   perf_guest_only() &&
+   is_kvmppc_exit_event(evsel) &&
+   is_hv_dec_trap(evsel, sample))
+   return perf_evsel__intval(evsel, sample, "pc");
+
+   return sample->ip;

[RFC 4/4] perf kvm: Fix output fields instead of 'trace' for perf kvm report on powerpc

2016-02-24 Thread Ravi Bangoria
Commit d49dadea7862 ("perf tools: Make 'trace' or 'trace_fields' sort key
default for tracepoint events") makes 'trace' the default sort key when
displaying a report for tracepoint events.

Because the tracepoint kvm_hv:kvm_guest_exit is used as the default event,
perf kvm report displays its output as a list of tracepoint hits rather
than with the normal report columns.

This patch replaces the 'trace' field with 'overhead,comm,dso,sym' when
displaying a perf kvm report for data recorded on powerpc.

Before applying patch:

  $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 181K of event 'kvm_hv:kvm_guest_exit'
  # Event count (approx.): 181061
  #
  # Overhead  Trace output
  # ........  ............
  #
     0.02%  VCPU 8: trap=HV_DECREMENTER pc=0xc0091924 msr=0x80009032, ceded=0
     0.00%  VCPU 0: trap=HV_DECREMENTER pc=0xc0091924 msr=0x80009032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x10005c7c msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x1001ef14 msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x3fff83398830 msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x3fff833a6fe4 msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x3fff833a7a64 msr=0x8280f032, ceded=0

After applying patch:

  $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 181K of event 'kvm_hv:kvm_guest_exit'
  # Event count (approx.): 181061
  #
  # Overhead  Command  Shared Object            Symbol
  # ........  .......  .......................  ......
  #
   0.02%  :57276   [guest.kernel.kallsyms]  [g] .plpar_hcall_norets
   0.00%  :57274   [guest.kernel.kallsyms]  [g] .plpar_hcall_norets
   0.00%  :57276   [guest.kernel.kallsyms]  [g] .__copy_tofrom_user_power7
   0.00%  :57276   [guest.kernel.kallsyms]  [g] ._atomic_dec_and_lock
   0.00%  :57276   [guest.kernel.kallsyms]  [g] ._raw_spin_lock
   0.00%  :57276   [guest.kernel.kallsyms]  [g] ._switch
   0.00%  :57276   [guest.kernel.kallsyms]  [g] .bio_add_page
   0.00%  :57276   [guest.kernel.kallsyms]  [g] .kmem_cache_alloc

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/builtin-report.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 31ec4ba..5d96882 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -930,6 +930,11 @@ repeat:
else
use_browser = 0;
 
+   if (!field_order &&
+   is_perf_data_reorded_on_ppc(session->evlist) &&
+   perf_guest_only())
+   field_order = "overhead,comm,dso,sym";
+
if (setup_sorting(session->evlist) < 0) {
if (sort_order)
parse_options_usage(report_usage, options, "s", 1);
-- 
2.1.4
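
The field-order override in this patch can be read as a small decision
function. What follows is only a minimal standalone sketch, not the perf
code: pick_field_order() and its boolean parameters are hypothetical
stand-ins for 'field_order', is_perf_data_reorded_on_ppc() and
perf_guest_only().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *   user_order      - value given with -F/--fields, or NULL if none
 *   recorded_on_ppc - stand-in for is_perf_data_reorded_on_ppc(evlist)
 *   guest_only      - stand-in for perf_guest_only()
 */
static const char *pick_field_order(const char *user_order,
                                    bool recorded_on_ppc, bool guest_only)
{
    /* An explicit user choice always wins. */
    if (user_order)
        return user_order;

    /* Guest-only data recorded on powerpc: use the normal columns. */
    if (recorded_on_ppc && guest_only)
        return "overhead,comm,dso,sym";

    /* NULL here means "leave the usual default selection alone". */
    return NULL;
}
```

With no -F option, guest-only powerpc data gets the four report columns;
every other combination leaves the caller's default untouched.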



[RFC 1/4] perf kvm: Enable 'record' on powerpc

2016-02-24 Thread Ravi Bangoria
'perf kvm record' is not available on powerpc because 'perf' relies on
the 'cycles' event (a PMU event) to profile the guest. However, for
powerpc, this can't be used from the host because the PMUs are controlled
by the guest rather than the host.

There exists a tracepoint 'kvm_hv:kvm_guest_exit' in powerpc which is
hit whenever any of the threads exit the guest context. The guest
instruction pointer dumped along with this tracepoint data in the field
'pc', can be used as guest instruction pointer.

This patch makes kvm_hv:kvm_guest_exit the default event for recording
guest data from the host on powerpc. Because a host event is used to record
guest data, this approach enables only the --guest option of 'perf kvm';
using --host and --guest together still won't work.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/Build |  1 +
 tools/perf/arch/powerpc/util/kvm.c | 18 ++
 tools/perf/util/evlist.c   |  9 +
 tools/perf/util/evlist.h   |  1 +
 tools/perf/util/util.c |  5 +
 tools/perf/util/util.h |  1 +
 6 files changed, 35 insertions(+)
 create mode 100644 tools/perf/arch/powerpc/util/kvm.c

diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index c8fe207..4cb620d 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -1,6 +1,7 @@
 libperf-y += header.o
 libperf-y += sym-handling.o
 libperf-y += kvm-stat.o
+libperf-y += kvm.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
diff --git a/tools/perf/arch/powerpc/util/kvm.c b/tools/perf/arch/powerpc/util/kvm.c
new file mode 100644
index 0000000..878d323
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/kvm.c
@@ -0,0 +1,18 @@
+#include 
+#include "../../../util/evsel.h"
+#include "../../../util/evlist.h"
+
+int perf_evlist__arch_add_default(struct perf_evlist *evlist)
+{
+   struct perf_evsel *evsel;
+
+   if (!perf_guest_only())
+   return -1;
+
+   evsel = perf_evsel__newtp_idx("kvm_hv", "kvm_guest_exit", 0);
+   if (IS_ERR(evsel))
+   return PTR_ERR(evsel);
+
+   perf_evlist__add(evlist, evsel);
+   return 0;
+}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c42e196..8b7b84f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -231,6 +231,12 @@ void perf_event_attr__set_max_precise_ip(struct perf_event_attr *attr)
}
 }
 
+int __weak
+perf_evlist__arch_add_default(struct perf_evlist *evlist __maybe_unused)
+{
+   return -1;
+}
+
 int perf_evlist__add_default(struct perf_evlist *evlist)
 {
struct perf_event_attr attr = {
@@ -239,6 +245,9 @@ int perf_evlist__add_default(struct perf_evlist *evlist)
};
struct perf_evsel *evsel;
 
+   if (!perf_evlist__arch_add_default(evlist))
+   return 0;
+
event_attr_init();
 
perf_event_attr__set_max_precise_ip(&attr);
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index a0d1522..7951406 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -75,6 +75,7 @@ void perf_evlist__delete(struct perf_evlist *evlist);
 
 void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry);
 void perf_evlist__remove(struct perf_evlist *evlist, struct perf_evsel *evsel);
+int perf_evlist__arch_add_default(struct perf_evlist *evlist);
 int perf_evlist__add_default(struct perf_evlist *evlist);
 int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 struct perf_event_attr *attrs, size_t nr_attrs);
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 35b20dd..567e3da 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -37,6 +37,11 @@ bool test_attr__enabled;
 bool perf_host  = true;
 bool perf_guest = false;
 
+bool perf_guest_only(void)
+{
+   return !perf_host && perf_guest;
+}
+
 void event_attr_init(struct perf_event_attr *attr)
 {
if (!perf_host)
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 3dd0408..c459c45 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -344,5 +344,6 @@ int fetch_kernel_version(unsigned int *puint,
 const char *perf_tip(const char *dirpath);
 bool is_regular_file(const char *file);
 int fetch_current_timestamp(char *buf, size_t sz);
+bool perf_guest_only(void);
 
 #endif /* GIT_COMPAT_UTIL_H */
-- 
2.1.4
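
The weak/strong split used by perf_evlist__arch_add_default() can be
sketched in a single file as below. This is an illustration under
assumptions, not the perf implementation: arch_add_default(),
ppc_add_default() and add_default() are hypothetical names, events are
modelled as plain strings, and the weak attribute is the GCC/Clang
spelling behind __weak.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Generic fallback (hypothetical name). A strong definition linked in
 * from an arch directory would replace it; -1 means "no arch default".
 */
__attribute__((weak)) int arch_add_default(const char **event)
{
    (void)event;
    return -1;
}

/*
 * What a powerpc override could look like. In the patch it carries the
 * same name as the weak symbol; it is renamed here so that both fit in
 * one translation unit.
 */
static int ppc_add_default(bool guest_only, const char **event)
{
    if (!guest_only)
        return -1;                  /* fall back to the generic event */

    *event = "kvm_hv:kvm_guest_exit";
    return 0;
}

/* Hypothetical stand-in for perf_evlist__add_default(). */
static const char *add_default(void)
{
    const char *event = NULL;

    /* A return of 0 means the arch hook already chose the event. */
    if (arch_add_default(&event) == 0)
        return event;

    return "cycles";
}
```

Built without an arch override, add_default() falls through to "cycles";
the powerpc variant selects the tracepoint only for guest-only sessions.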



[PATCH v2 1/3] perf kvm: Introduce evsel as argument to perf_event__preprocess_sample

2016-01-21 Thread Ravi Bangoria
This patch changes the prototype of perf_event__preprocess_sample() by
adding evsel as a new last argument.

This change is required because perf_event__preprocess_sample() will use
evsel to determine the cpumode of samples on the powerpc architecture.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
changes in v2:
- Breakdown of v1 patch into two sub patches

 tools/perf/builtin-annotate.c |  3 ++-
 tools/perf/builtin-diff.c |  3 ++-
 tools/perf/builtin-mem.c  | 10 ++
 tools/perf/builtin-report.c   |  3 ++-
 tools/perf/builtin-script.c   |  3 ++-
 tools/perf/builtin-timechart.c|  8 +---
 tools/perf/builtin-top.c  |  3 ++-
 tools/perf/tests/hists_cumulate.c |  2 +-
 tools/perf/tests/hists_filter.c   |  2 +-
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/tests/hists_output.c   |  2 +-
 tools/perf/util/event.c   |  3 ++-
 tools/perf/util/event.h   |  3 ++-
 13 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index cc5c126..b488a5c 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -94,7 +94,8 @@ static int process_sample_event(struct perf_tool *tool,
struct addr_location al;
int ret = 0;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
   event->header.type);
return -1;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 36ccc2b..d2a27fe 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -330,7 +330,8 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
struct hists *hists = evsel__hists(evsel);
int ret = -1;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
   event->header.type);
return -1;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 3901700..eb27b49 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -61,13 +61,15 @@ static int
 dump_raw_samples(struct perf_tool *tool,
 union perf_event *event,
 struct perf_sample *sample,
-struct machine *machine)
+struct machine *machine,
+struct perf_evsel *evsel)
 {
struct perf_mem *mem = container_of(tool, struct perf_mem, tool);
struct addr_location al;
const char *fmt;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@@ -111,10 +113,10 @@ out_put:
 static int process_sample_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
-   struct perf_evsel *evsel __maybe_unused,
+   struct perf_evsel *evsel,
struct machine *machine)
 {
-   return dump_raw_samples(tool, event, sample, machine);
+   return dump_raw_samples(tool, event, sample, machine, evsel);
 }
 
 static int report_raw_events(struct perf_mem *mem)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2bf537f..fa7bbd9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -151,7 +151,8 @@ static int process_sample_event(struct perf_tool *tool,
};
int ret = 0;
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_sample(event, machine, &al,
+ sample, evsel) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
 event->header.type);
return -1;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c691214..4363e8a 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -783,7 +783,8 @@ static int process_sample_event(struct perf_tool *tool,
return 0;
}
 
-   if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
+   if (perf_event__preprocess_

[PATCH v2 3/3] perf kvm: Fix output fields instead of 'trace' for perf kvm report on powerpc

2016-01-21 Thread Ravi Bangoria
Commit d49dadea7862 ("perf tools: Make 'trace' or 'trace_fields' sort key
default for tracepoint events") makes 'trace' the default sort key when
displaying a report for tracepoint events.

As the tracepoint kvm_hv:kvm_guest_exit is used as the default event for
recording data, perf kvm report displays its output as a list of
tracepoint hits rather than with the normal report columns.

This patch uses the 'overhead,comm,dso,sym' fields instead of 'trace'
when displaying a perf kvm report on powerpc.

Before applying patch:

  $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 181K of event 'kvm_hv:kvm_guest_exit'
  # Event count (approx.): 181061
  #
  # Overhead  Trace output
  # ........  ............
  #
     0.02%  VCPU 8: trap=HV_DECREMENTER pc=0xc0091924 msr=0x80009032, ceded=0
     0.00%  VCPU 0: trap=HV_DECREMENTER pc=0xc0091924 msr=0x80009032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x10005c7c msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x1001ef14 msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x3fff83398830 msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x3fff833a6fe4 msr=0x8280f032, ceded=0
     0.00%  VCPU 8: trap=HV_DECREMENTER pc=0x3fff833a7a64 msr=0x8280f032, ceded=0

After applying patch:

  $ ./perf kvm --guestkallsyms=guest.kallsyms --guestmodules=guest.modules report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 181K of event 'kvm_hv:kvm_guest_exit'
  # Event count (approx.): 181061
  #
  # Overhead  Command  Shared Object            Symbol
  # ........  .......  .......................  ......
  #
   0.02%  :57276   [guest.kernel.kallsyms]  [g] .plpar_hcall_norets
   0.00%  :57274   [guest.kernel.kallsyms]  [g] .plpar_hcall_norets
   0.00%  :57276   [guest.kernel.kallsyms]  [g] .__copy_tofrom_user_power7
   0.00%  :57276   [guest.kernel.kallsyms]  [g] ._atomic_dec_and_lock
   0.00%  :57276   [guest.kernel.kallsyms]  [g] ._raw_spin_lock
   0.00%  :57276   [guest.kernel.kallsyms]  [g] ._switch
   0.00%  :57276   [guest.kernel.kallsyms]  [g] .bio_add_page
   0.00%  :57276   [guest.kernel.kallsyms]  [g] .kmem_cache_alloc

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
changes in v2:
- Fixes output format of perf kvm report on powerpc

 tools/perf/arch/powerpc/util/kvm.c | 30 ++
 tools/perf/builtin-kvm.c   | 23 +--
 tools/perf/builtin.h   |  3 +++
 3 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/kvm.c b/tools/perf/arch/powerpc/util/kvm.c
index 317f29a..e5d88cc 100644
--- a/tools/perf/arch/powerpc/util/kvm.c
+++ b/tools/perf/arch/powerpc/util/kvm.c
@@ -8,11 +8,13 @@
  */
 
 #include 
+#include 
 #include "../../../util/evsel.h"
 #include "../../../util/evlist.h"
 #include "../../../util/trace-event.h"
 #include "../../../util/session.h"
 #include "../../../util/util.h"
+#include "../../../builtin.h"
 
 #define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"
 #define HV_DECREMENTER 2432
@@ -102,3 +104,31 @@ u8 arch__get_cpumode(const union perf_event *event, struct perf_evsel *evsel,
 ret:
return cpumode;
 }
+
+const char **arch__cmd_kvm_report_argv(const char *file_name, int argc,
+  int *rec_argc, const char **argv)
+{
+   int i = 0, j, arch_argc = 0;
+   const char **rec_argv;
+
+   if (perf_guest_only())
+   arch_argc = 2;
+
+   *rec_argc = argc + arch_argc + 2;
+   rec_argv = calloc(*rec_argc + 1, sizeof(char *));
+   rec_argv[i++] = strdup("report");
+   rec_argv[i++] = strdup("-i");
+   rec_argv[i++] = strdup(file_name);
+
+   if (arch_argc) {
+   rec_argv[i++] = strdup("-F");
+   rec_argv[i++] = strdup("overhead,comm,dso,sym");
+   }
+
+   for (j = 1; j < argc; j++, i++)
+   rec_argv[i] = argv[j];
+
+   BUG_ON(i != *rec_argc);
+
+   return rec_argv;
+}
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 4418d92..48455c9 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1480,22 +1480,33 @@ static int __cmd_record(const char *file_name, int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
 }
 
-static int __cmd_report(const char *file_name, int argc, const char **argv)
+
+const char 

[PATCH v2 0/3] perf kvm: Guest Symbol Resolution for powerpc

2016-01-21 Thread Ravi Bangoria
]  [g] fast_exception_return
   0.00%  :9690    [unknown]                [u] 0x3fff966eb6a0
   0.00%  :9690    [unknown]                [u] 0x3fff966fd09c
   0.00%  :9687    [guest.kernel.kallsyms]  [g] .__copy_tofrom_user_power7
   0.00%  :9688    [guest.kernel.kallsyms]  [g] ._raw_spin_lock_irqsave
   0.00%  :9688    [guest.kernel.kallsyms]  [g] .n_tty_write
   0.00%  :9688    [guest.kernel.kallsyms]  [g] .plpar_hcall
   0.00%  :9689    [guest.kernel.kallsyms]  [g] .__srcu_read_unlock
   0.00%  :9689    [guest.kernel.kallsyms]  [g] ._raw_spin_lock
   0.00%  :9689    [guest.kernel.kallsyms]  [g] .arch_local_irq_restore

Ravi Bangoria (3):
  perf kvm: Introduce evsel as argument to perf_event__preprocess_sample
  perf kvm: enable record|report feature on powerpc
  perf kvm: Fix output fields instead of 'trace' for perf kvm report on
powerpc

 tools/perf/arch/powerpc/util/Build |   1 +
 tools/perf/arch/powerpc/util/kvm.c | 134 +
 tools/perf/builtin-annotate.c  |   3 +-
 tools/perf/builtin-diff.c  |   3 +-
 tools/perf/builtin-kvm.c   |  23 +--
 tools/perf/builtin-mem.c   |  10 +--
 tools/perf/builtin-report.c|   3 +-
 tools/perf/builtin-script.c|   3 +-
 tools/perf/builtin-timechart.c |   8 ++-
 tools/perf/builtin-top.c   |   3 +-
 tools/perf/builtin.h   |   3 +
 tools/perf/tests/hists_cumulate.c  |   2 +-
 tools/perf/tests/hists_filter.c|   2 +-
 tools/perf/tests/hists_link.c  |   4 +-
 tools/perf/tests/hists_output.c|   2 +-
 tools/perf/util/event.c|  15 -
 tools/perf/util/event.h|   3 +-
 tools/perf/util/evlist.c   |   9 +++
 tools/perf/util/evlist.h   |   1 +
 tools/perf/util/evsel.c|   7 ++
 tools/perf/util/evsel.h|   4 ++
 tools/perf/util/session.c  |   9 +--
 tools/perf/util/util.c |   5 ++
 tools/perf/util/util.h |   1 +
 24 files changed, 227 insertions(+), 31 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/kvm.c

--
2.1.4



[PATCH v2 2/3] perf kvm: enable record|report feature on powerpc

2016-01-21 Thread Ravi Bangoria
This patch contains the core logic for enabling perf kvm {record|report} on
powerpc.

For perf kvm record:
This patch replaces the default event (cycles) with kvm_hv:kvm_guest_exit
when recording guest data from the host.

For perf kvm report:
This patch makes use of the 'kvm_guest_exit' tracepoint and checks the
exit reason of each kvm exit. If it is HV_DECREMENTER, the instruction
pointer dumped along with this tracepoint is retrieved and mapped to the
guest kallsyms.
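
The report-side decisions can be sketched outside perf as follows. This is
a minimal illustration, assuming a 64-bit MSR and IBM bit numbering (bit 0
is the most significant bit); guest_mode_from_msr(), pick_ip() and the enum
are hypothetical names, and only the #define values come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_DECREMENTER  2432    /* 0x980, trap value checked by the patch */
#define HV_BIT          3       /* IBM numbering: bit 0 is the MSB */
#define PR_BIT          49
#define PPC_MAX         63

/* Hypothetical stand-ins for PERF_RECORD_MISC_GUEST_{KERNEL,USER}. */
enum guest_mode { GUEST_KERNEL, GUEST_USER };

/* Convert an IBM-style bit number into a mask for a 64-bit register. */
static uint64_t msr_mask(unsigned int ibm_bit)
{
    return (uint64_t)1 << (PPC_MAX - ibm_bit);
}

/* Guest user mode means HV clear and PR set; anything else is kernel. */
static enum guest_mode guest_mode_from_msr(uint64_t msr)
{
    bool hv = (msr & msr_mask(HV_BIT)) != 0;
    bool pr = (msr & msr_mask(PR_BIT)) != 0;

    return (!hv && pr) ? GUEST_USER : GUEST_KERNEL;
}

/* Use the tracepoint 'pc' field only for HV_DECREMENTER exits. */
static uint64_t pick_ip(int trap, uint64_t tp_pc, uint64_t sample_ip)
{
    return trap == HV_DECREMENTER ? tp_pc : sample_ip;
}
```

The mask helper uses uint64_t rather than unsigned long so that the shift
stays well defined on hosts where long is 32 bits wide, which would matter
if the report is generated on a different machine.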

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
---
changes in v2:
- Breakdown of v1 patch into two sub patches
- Merged parse-tp.c and evlist.c from tools/perf/arch/powerpc/util/ into
  single file with name kvm.c

 tools/perf/arch/powerpc/util/Build |   1 +
 tools/perf/arch/powerpc/util/kvm.c | 104 +
 tools/perf/util/event.c|  12 -
 tools/perf/util/evlist.c   |   9 
 tools/perf/util/evlist.h   |   1 +
 tools/perf/util/evsel.c|   7 +++
 tools/perf/util/evsel.h|   4 ++
 tools/perf/util/session.c  |   9 ++--
 tools/perf/util/util.c |   5 ++
 tools/perf/util/util.h |   1 +
 10 files changed, 147 insertions(+), 6 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/kvm.c

diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index 7b8b0d1..eb819e0 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -1,5 +1,6 @@
 libperf-y += header.o
 libperf-y += sym-handling.o
+libperf-y += kvm.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
diff --git a/tools/perf/arch/powerpc/util/kvm.c b/tools/perf/arch/powerpc/util/kvm.c
new file mode 100644
index 0000000..317f29a
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/kvm.c
@@ -0,0 +1,104 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2016 Hemant Kumar Shaw, IBM Corporation
+ * Copyright (C) 2016 Ravikumar B. Bangoria, IBM Corporation
+ */
+
+#include 
+#include "../../../util/evsel.h"
+#include "../../../util/evlist.h"
+#include "../../../util/trace-event.h"
+#include "../../../util/session.h"
+#include "../../../util/util.h"
+
+#define KVMPPC_EXIT "kvm_hv:kvm_guest_exit"
+#define HV_DECREMENTER 2432
+#define HV_BIT 3
+#define PR_BIT 49
+#define PPC_MAX 63
+
+/*
+ * To sample for only guest, record kvm_hv:kvm_guest_exit.
+ * Otherwise go via normal way(cycles).
+ */
+int perf_evlist__arch_add_default(struct perf_evlist *evlist)
+{
+   struct perf_evsel *evsel;
+
+   if (!perf_guest_only())
+   return -1;
+
+   evsel = perf_evsel__newtp_idx("kvm_hv", "kvm_guest_exit", 0);
+   if (IS_ERR(evsel))
+   return PTR_ERR(evsel);
+
+   perf_evlist__add(evlist, evsel);
+   return 0;
+}
+
+static bool is_kvmppc_exit_event(struct perf_evsel *evsel)
+{
+   static unsigned int kvmppc_exit;
+
+   if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
+   return false;
+
+   if (unlikely(kvmppc_exit == 0)) {
+   if (strcmp(KVMPPC_EXIT, evsel->name))
+   return false;
+   kvmppc_exit = evsel->attr.config;
+   } else if (kvmppc_exit != evsel->attr.config) {
+   return false;
+   }
+
+   return true;
+}
+
+static bool is_hv_dec_trap(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+   int trap = perf_evsel__intval(evsel, sample, "trap");
+   return trap == HV_DECREMENTER;
+}
+
+/*
+ * Get the instruction pointer from the tracepoint data
+ */
+u64 arch__get_ip(struct perf_evsel *evsel, struct perf_sample *sample)
+{
+   if (perf_guest_only() &&
+   is_kvmppc_exit_event(evsel) &&
+   is_hv_dec_trap(evsel, sample))
+   return perf_evsel__intval(evsel, sample, "pc");
+
+   return sample->ip;
+}
+
+/*
+ * Get the HV and PR bits and accordingly, determine the cpumode
+ */
+u8 arch__get_cpumode(const union perf_event *event, struct perf_evsel *evsel,
+struct perf_sample *sample)
+{
+   unsigned long hv, pr, msr;
+   u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+
+   if (!perf_guest_only() || !is_kvmppc_exit_event(evsel))
+   goto ret;
+
+   if (sample->raw_data && is_hv_dec_trap(evsel, sample)) {
+   msr = perf_evsel__intval(evsel, sample, "msr");
+   hv = msr & ((unsigned long)1 << (PPC_MAX - HV_BIT));
+   pr = msr & ((unsigned long)1 << (PPC_MAX -

Re: [PATCH] perf/kvm: Guest Symbol Resolution for powerpc

2016-01-21 Thread Ravi Bangoria

Hi Arnaldo,

On Wednesday 13 January 2016 10:29 PM, Arnaldo Carvalho de Melo wrote:

Em Tue, Dec 29, 2015 at 03:38:40PM +0530, Ravi Bangoria escreveu:

'perf kvm {record|report}' is used to record and report the profiled
performance of any workload on a guest. From the host, we can collect
guest kernel statistics which is useful in finding out any contentions
in guest kernel symbols for a certain workload.
This feature is not available on powerpc because 'perf' relies on the
'cycles' event (a PMU event) to profile the guest. However, for powerpc,
this can't be used from the host because the PMUs are controlled by the
guest rather than the host.

Without entering the realms if the approach is the right one, which I
leave to PowerPC experts, Ingo, PeterZ, etc:

So, in these cases, please break this into a series, where you, for
instance, will add that extra evsel parameter to the functions that will
ultimately use it to extract those event fields, that should be a
separate patch, so that when reviewing the "meat" of your patch we can
quickly see what it does, not having to extract that from leg work.

Two other patches should introduce arch__get_{ip,cpumode}().

- Arnaldo


Thanks for the suggestion. I've sent v2 with the changes you suggested.

Could you please take a look?

Regards,
Ravi



Re: [RFC 4/4] perf kvm: Fix output fields instead of 'trace' for perf kvm report on powerpc

2016-03-08 Thread Ravi Bangoria

Hi Arnaldo,

Gentle reminder :)  Any updates?

Regards,
Ravi

On Thursday 03 March 2016 06:49 AM, Ravi Bangoria wrote:

Thanks acme,

On Wednesday 02 March 2016 09:52 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Mar 02, 2016 at 09:16:48PM +0530, Ravi Bangoria escreveu:

Thanks Arnaldo,

Please find my comments.

On Wednesday 02 March 2016 07:55 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Feb 24, 2016 at 02:37:45PM +0530, Ravi Bangoria escreveu:

  use_browser = 0;
+if (!field_order &&
+is_perf_data_reorded_on_ppc(session->evlist) &&
+perf_guest_only())
+field_order = "overhead,comm,dso,sym";
+

Can you please do it as:

__weak void arch__override_field_order(struct perf_evlist *evlist, const char **field_order)
{
}

So you mean like this - Just implement only weak function and move code into
it?
ie. No strong implementation at this point of time.

Like,

__weak void arch__override_field_order(struct perf_evlist *evlist, const char **f_order)
{
 if (!field_order &&
 is_perf_data_reorded_on_ppc(session->evlist) &&

Oh, I see, ugh, when running on x86_64 we wouldn't use this, so we need
to have per arch default field orders, now I have to recall why is it
that we need this per-arch field order :-\


Sorry, I'm little bit confused. We need arch specific functionality present
on all arch to make cross arch reporting possible.

for example, record perf.data on ppc and report on x86, we need
ppc specific function present in perf binary compiled on x86.

Please let me know if I understood it wrong.

Regards,
Ravi





Re: [RFC 4/4] perf kvm: Fix output fields instead of 'trace' for perf kvm report on powerpc

2016-03-02 Thread Ravi Bangoria

Thanks Arnaldo,

Please find my comments.

On Wednesday 02 March 2016 07:55 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Feb 24, 2016 at 02:37:45PM +0530, Ravi Bangoria escreveu:

use_browser = 0;
  
+	if (!field_order &&

+   is_perf_data_reorded_on_ppc(session->evlist) &&
+   perf_guest_only())
+   field_order = "overhead,comm,dso,sym";
+

Can you please do it as:

__weak void arch__override_field_order(struct perf_evlist *evlist, const char **field_order)
{
}


So you mean something like this: implement only the weak function and move
the code into it, i.e. no strong implementation at this point?

Like,

__weak void arch__override_field_order(struct perf_evlist *evlist,
                                       const char **field_order)
{
        if (!*field_order &&
            is_perf_data_reorded_on_ppc(evlist) &&
            perf_guest_only())
                *field_order = "overhead,comm,dso,sym";
}

Then I can do that.

But if you are proposing to implement a strong function and move this code
into it, then we won't be able to enable cross-arch reporting.



This way we don't see any arch specific stuff in the tool, also I
haven't seen any doc update, are you sure nothing needs to be added to
tools/perf/Documentation/ for any of these patches?

I think this needs to be documented further, probably in
tools/perf/design.txt too?


Yes, I'll do this in the next version.

Regards,
Ravi



Re: [RFC 4/4] perf kvm: Fix output fields instead of 'trace' for perf kvm report on powerpc

2016-03-02 Thread Ravi Bangoria

Thanks acme,

On Wednesday 02 March 2016 09:52 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Mar 02, 2016 at 09:16:48PM +0530, Ravi Bangoria escreveu:

Thanks Arnaldo,

Please find my comments.

On Wednesday 02 March 2016 07:55 PM, Arnaldo Carvalho de Melo wrote:

Em Wed, Feb 24, 2016 at 02:37:45PM +0530, Ravi Bangoria escreveu:

use_browser = 0;
+   if (!field_order &&
+   is_perf_data_reorded_on_ppc(session->evlist) &&
+   perf_guest_only())
+   field_order = "overhead,comm,dso,sym";
+

Can you please do it as:

__weak void arch__override_field_order(struct perf_evlist *evlist, const char **field_order)
{
}

So you mean like this - Just implement only weak function and move code into
it?
ie. No strong implementation at this point of time.

Like,

__weak void arch__override_field_order(struct perf_evlist *evlist, const char **f_order)
{
 if (!field_order &&
 is_perf_data_reorded_on_ppc(session->evlist) &&

Oh, I see, ugh, when running on x86_64 we wouldn't use this, so we need
to have per arch default field orders, now I have to recall why is it
that we need this per-arch field order :-\


Sorry, I'm a little confused. We need the arch-specific functionality to be
present on all architectures to make cross-arch reporting possible.

For example, to record perf.data on ppc and report it on x86, the
ppc-specific function must be present in the perf binary compiled on x86.

Please let me know if I've misunderstood.

Regards,
Ravi
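
To make the cross-arch point concrete, here is a minimal sketch of a check
that depends on where the data was recorded rather than on where perf was
built. It is an assumption-laden illustration: 'struct env' and
recorded_on_ppc() are hypothetical stand-ins for the perf.data environment
and is_perf_data_reorded_on_ppc().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the environment stored in perf.data. */
struct env {
    const char *arch;   /* architecture string written at record time */
};

/*
 * Decided at run time from the recorded data, so the same code gives
 * the same answer in a perf binary built for any host architecture.
 */
static bool recorded_on_ppc(const struct env *env)
{
    if (!env || !env->arch)
        return false;

    return !strcmp(env->arch, "ppc64") || !strcmp(env->arch, "ppc64le");
}
```

A strong per-arch function would only be linked into a powerpc build,
whereas a check like this one is available on every host.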



[PATCH] hw_breakpoint: Fix Oops at destroying hw_breakpoint event on powerpc

2016-03-02 Thread Ravi Bangoria
When destroying a hw_breakpoint event, the kernel ends up with an Oops.
Here is sample output from a 4.5.0-rc6 kernel.

  [  450.708568] Unable to handle kernel paging request for data at address 0x0c07
  [  450.708684] Faulting instruction address: 0xc00291d0
  [  450.708750] Oops: Kernel access of bad area, sig: 11 [#1]
  [  450.708798] SMP NR_CPUS=1024 NUMA pSeries
  [  450.708856] Modules linked in: stap_4c2bdcf3e1aee79b646bb9a844e600f7__4962(O) xt_CHECKSUM ...
  [  450.709539] CPU: 5 PID: 5106 Comm: perf_fuzzer Tainted: G   O 4.5.0-rc5+ #1
  [  450.709620] task: c000f8795c80 ti: c000e334 task.ti: c000e334
  [  450.709691] NIP: c00291d0 LR: c020b6b4 CTR: c020b6f0
  [  450.709760] REGS: c000e3343760 TRAP: 0300   Tainted: G   O (4.5.0-rc5+)
  [  450.709831] MSR: 80009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22008828  XER: 2000
  [  450.710001] CFAR: c0010708 DAR: 0c07 DSISR: 4200 SOFTE: 1
  GPR00: c020b6b4 c000e33439e0 c1350900 c0009efa7000
  GPR04: 0001 c0009efa7000  0001
  GPR08:    
  GPR12: c020b6f0 c7e02800 c0009efa5208 
  GPR16: 0001   c000f3ad7f10
  GPR20: c000f87964c8 0001 c000f8795c80 fffd
  GPR24:  c000f3ad7f08 c000f3ad7f68 c0009efa6800
  GPR28: c000f3ad7f00 c0009efa5000 c1259520 c0009efa7000
  [  450.710996] NIP [c00291d0] arch_unregister_hw_breakpoint+0x40/0x60
  [  450.711066] LR [c020b6b4] release_bp_slot+0x44/0x80
  [  450.77] Call Trace:
  [  450.711165] [c000e33439e0] [c09c1e38] mutex_lock+0x28/0x70 (unreliable)
  [  450.711257] [c000e3343a10] [c020b6b4] release_bp_slot+0x44/0x80
  [  450.711332] [c000e3343a40] [c02036c8] _free_event+0xd8/0x350
  [  450.711404] [c000e3343a70] [c0208260] perf_event_exit_task+0x2b0/0x4c0
  [  450.711490] [c000e3343b20] [c00b8ac8] do_exit+0x388/0xc60
  [  450.711563] [c000e3343be0] [c00b9484] do_group_exit+0x64/0x100
  [  450.711641] [c000e3343c20] [c00c9100] get_signal+0x220/0x770
  [  450.711716] [c000e3343d10] [c0017884] do_signal+0x54/0x2b0
  [  450.711793] [c000e3343e00] [c0017cac] do_notify_resume+0xbc/0xd0
  [  450.711865] [c000e3343e30] [c0009838] ret_from_except_lite+0x64/0x68
  [  450.711948] Instruction dump:
  [  450.711986] f8010010 f821ffd1 7c7f1b78 6000 6000 e93f01e8 2fa9 419e0018
  [  450.712107] e9290098 2fa9 419e000c 3940  38210030 e8010010 ebe1fff8
  [  450.712230] ---[ end trace 3cf087de955e9358 ]---

Call chain:

  hw_breakpoint_event_init()
    bp->destroy = bp_perf_event_destroy;

  do_exit()
    perf_event_exit_task()
      perf_event_exit_task_context()
        WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
        perf_event_exit_event()
          free_event()
            _free_event()
              bp_perf_event_destroy()   // event->destroy(event);
                release_bp_slot()
                  arch_unregister_hw_breakpoint()

perf_event_exit_task_context() sets child_ctx->task to TASK_TOMBSTONE,
which is (void *)-1. arch_unregister_hw_breakpoint() then tries to access
the 'thread' member of 'task', resulting in an Oops.

This patch adds one more condition before accessing data from 'task'.
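
To illustrate the added condition, here is a minimal user-space sketch of
checking a task pointer against a tombstone sentinel before dereferencing
it. The struct names (thread_state, task, context) and the helper
clear_last_hit() are hypothetical stand-ins, not the kernel's real types;
only the TASK_TOMBSTONE value mirrors what is described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct thread_state { void *last_hit_ubp; };
struct task { struct thread_state thread; };
struct context { struct task *task; };

/* Sentinel marking a context whose task has already exited. */
#define TASK_TOMBSTONE ((void *)-1L)

/*
 * Clear the per-task breakpoint pointer only when 'task' is a real pointer.
 * Returns 1 if the task was touched, 0 if the access was skipped.
 */
static int clear_last_hit(struct context *ctx)
{
	if (ctx && ctx->task && ctx->task != TASK_TOMBSTONE) {
		ctx->task->thread.last_hit_ubp = NULL;
		return 1;
	}
	return 0;
}
```

Without the third test, a context whose task is the sentinel would pass
the NULL checks and the store would go through an invalid address.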

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/hw_breakpoint.c | 3 ++-
 include/linux/perf_event.h  | 2 ++
 kernel/events/core.c| 2 --
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 05e804c..43d8496 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -110,7 +111,7 @@ void arch_unregister_hw_breakpoint(struct perf_event *bp)
 * and the single_step_dabr_instruction(), then cleanup the breakpoint
 * restoration variables to prevent dangling pointers.
 */
-   if (bp->ctx && bp->ctx->task)
+   if (bp->ctx && bp->ctx->task && bp->ctx->task != TASK_TOMBSTONE)
bp->ctx->task->thread.last_hit_ubp = NULL;
 }
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f5c5a3f..491c50e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1192,4 +1192,6 @@ _name##_show(struct device *dev,  
\
\
 static struct device_attribute format_

Re: [RFC 1/4] perf kvm: Enable 'record' on powerpc

2016-03-28 Thread Ravi Bangoria

Thanks Arnaldo for putting in the effort.

I've tested this patch on powerpc and it looks fine to me. Please find
my comments below.


On Friday 25 March 2016 02:45 AM, Arnaldo Carvalho de Melo wrote:

Em Tue, Mar 22, 2016 at 11:19:21PM -0300, Arnaldo Carvalho de Melo escreveu:

Em Tue, Mar 22, 2016 at 04:12:11PM -0300, Arnaldo Carvalho de Melo escreveu:

Em Wed, Feb 24, 2016 at 02:37:42PM +0530, Ravi Bangoria escreveu:

'perf kvm record' is not available on powerpc because 'perf' relies on
the 'cycles' event (a PMU event) to profile the guest. However, for
powerpc, this can't be used from the host because the PMUs are controlled
by the guest rather than the host.

There exists a tracepoint 'kvm_hv:kvm_guest_exit' in powerpc which is
hit whenever any of the threads exit the guest context. The guest
instruction pointer dumped along with this tracepoint data in the field
'pc', can be used as guest instruction pointer.

This patch changes default event as kvm_hv:kvm_guest_exit for recording
guest data in host on powerpc. As we are using host event to record guest
data, this approach will enable only --guest option of 'perf kvm'. Still
--host --guest together won't work.

It should, i.e. --host --guest should translate to:

-e cycles:H,kvm_hv:kvm_guest_exit

I.e. both collect cycles only in the host, and also the tracepoint that
will allow us to get the guest approximation for the unavailable cycles
event, no?

I'm putting the infrastructure work needed for this in the perf/cpumode
branch. More work will be put there soon.

So I took a different path and made perf_evsel__parse_sample set a new
perf_sample.cpumode field, this way we'll end up having just to set a
per-evsel ->post_parse_sample() callback for the event that replaces
"cycles" for PPC guests where we'll just set data->ip and data->cpumode,
the rest of the code remains unchanged.

The changes I made look useful in themselves, as, IIRC, more code was
removed than added.
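
The per-event callback idea described above can be sketched roughly as
follows. This is only an illustration: the structs, the enum values and
the function names (sample, evsel, parse_sample, munge_guest_sample) are
simplified, hypothetical stand-ins and not perf's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's sample and evsel types. */
struct sample {
	uint64_t ip;       /* instruction pointer reported for the sample */
	uint8_t  cpumode;  /* who was running: host, guest kernel, ... */
	uint64_t raw_pc;   /* 'pc' field carried in the tracepoint payload */
};

struct evsel {
	/* Optional hook run after the generic parser has filled the sample. */
	void (*post_parse_sample)(struct evsel *evsel, struct sample *sample);
};

enum { CPUMODE_HOST = 0, CPUMODE_GUEST_KERNEL = 1 };  /* illustrative values */

/* Hook for the event that replaces "cycles": report the guest pc instead. */
static void munge_guest_sample(struct evsel *evsel, struct sample *sample)
{
	(void)evsel;
	sample->ip = sample->raw_pc;
	sample->cpumode = CPUMODE_GUEST_KERNEL;
}

/* Generic parse step: fill defaults, then let the event adjust them. */
static void parse_sample(struct evsel *evsel, struct sample *sample,
			 uint64_t host_ip, uint64_t pc)
{
	sample->ip = host_ip;
	sample->cpumode = CPUMODE_HOST;
	sample->raw_pc = pc;
	if (evsel->post_parse_sample)
		evsel->post_parse_sample(evsel, sample);
}
```

Events without a hook keep the values set by the generic parser, so the
rest of the processing code does not need to know about the special case.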

I'll continue tomorrow and will test with the kvm:kvm_exit on x86_64 for
testing, that has:

Ok, so the infrastructure got merged already and from there the next
steps are in running with:

  perf kvm --guest record -a -e cycles:H,kvm:kvm_exit

And then, with the patch below applied, try:

  perf kvm --guestkallsyms kallsyms.guest --guestmodules modules.guest report -i perf.data.guest --munge-ppc-guest-sample kvm:kvm_exit



The initial proposal was to change the default event to "kvm_guest_exit"
for kvm recording/reporting on ppc. If I understand it correctly, your
patch creates a handler for reporting kvm events based on
"munge_ppc_guest_event" and the required tracepoint, i.e. we need to
specify the required tracepoint event name for both recording and
reporting.

There might be a small issue here. Scripts that depend on the generic
perf kvm record/report would need to be changed to keep them from failing
on powerpc. Alternatively (just a thought), could we create some kind of
alias to map the ppc-specific perf kvm commands to the generic perf kvm
ones? For example:

  perf kvm record -e "kvm_hv:kvm_guest_exit"
    mapped to  perf kvm record

and

  perf kvm report --munge-ppc-guest-sample kvm_hv:kvm_guest_exit
    mapped to  perf kvm report



Regards,
Ravi



Re: [RFC] perf probe: Fix module probe issue if no dwarf support

2016-04-26 Thread Ravi Bangoria



On Tuesday 26 April 2016 02:59 AM, Masami Hiramatsu wrote:

On Mon, 25 Apr 2016 16:08:28 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Perf is not able to register probe in kernel module when dwarf support
is not there (and so it goes for symtab). Perf passes full path of
module where only module name is required which is causing the problem.
This patch fixes this issue.

Before applying patch:

   $ dpkg -s libdw-dev
 dpkg-query: package 'libdw-dev' is not installed ...

   $ ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
 Added new event:
   probe:foo_show (on foo_show in /linux/samples/kobject/kobject-example.ko)

 You can now use it in all perf tools, such as:

   perf record -e probe:foo_show -aR sleep 1

   $ cat /sys/kernel/debug/tracing/kprobe_events
 p:probe/foo_show /linux/samples/kobject/kobject-example.ko:foo_show

After applying patch:

   $ ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
 Added new event:
   probe:foo_show (on foo_show in kobject_example)

 You can now use it in all perf tools, such as:

   perf record -e probe:foo_show -aR sleep 1

   $ cat /sys/kernel/debug/tracing/kprobe_events
 p:probe/foo_show kobject_example:foo_show


Looks good to me :)
However, it seems that this patch depends on your previous patch
("perf probe: Fix offline module name missmatch issue")
In that case, could you make these a series of patches?

Acked-by: Masami Hiramatsu <mhira...@kernel.org>


Thanks Masami,

I've sent v2 with changes you suggested. Please review it.

Regards,
Ravi



Re: [RFC] perf probe: Fix offline module name missmatch issue

2016-04-26 Thread Ravi Bangoria

Thanks Masami for reviewing.

Please find my replies to your comment.

On Tuesday 26 April 2016 02:54 AM, Masami Hiramatsu wrote:

Hi Ravi,

On Mon, 25 Apr 2016 16:08:27 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Perf can add a probe on kernel module which has not been loaded yet.
Current implementation finds module name from path. But if filename
is different from actual module name then perf fails to register
probe while loading module because of mismatch in names. For example,
samples/kobject/kobject-example.ko is loaded as kobject_example.

Ah! right, good catch!
Have some comment below;


diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8319fbb..05d0905 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -265,6 +265,65 @@ static bool kprobe_warn_out_range(const char *symbol, 
unsigned long address)
return true;
  }
  
+/*

+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is not
+ * exposed to user-space. Offset of 'name' has remained same from long
+ * time, so hardcoding it here.
+ */
+#ifdef __LP64__
+#define MOD_NAME_OFFSET 24
+#else
+#define MOD_NAME_OFFSET 12
+#endif
+
+/*
+ * @module can be module name of module file path. In case of path,
+ * inspect elf and find out what is actual module name.
+ * Caller has to free mod_name after using it.
+ */
+char *find_module_name(const char *module)

Could you make this function static, since there is no caller outside
this file?


Yes, there is no caller outside of this file. But in this patch,
find_module_name is defined outside of #ifdef HAVE_DWARF_SUPPORT while
it is called only from inside #ifdef HAVE_DWARF_SUPPORT.

If I make it static and there is no dwarf support, there will be a
compilation error about a function defined but not used.

And in the second patch ("perf probe: Fix module probe issue if no dwarf
support"), I'm calling it from outside of #ifdef HAVE_DWARF_SUPPORT.

So I have two options:
1. Merge both patches and make the definition static.
2. Make the function static in the second patch.

I've chosen the second approach and sent v2. But please let me know if
there is a better way to do it.

Regards,
Ravi



Re: [RFC] perf probe: Fix offline module name missmatch issue

2016-04-26 Thread Ravi Bangoria



On Tuesday 26 April 2016 02:45 PM, Wangnan (F) wrote:



On 2016/4/26 16:56, Ravi Bangoria wrote:

Thanks Masami for reviewing.

Please find my replies to your comment.

On Tuesday 26 April 2016 02:54 AM, Masami Hiramatsu wrote:

Hi Ravi,

On Mon, 25 Apr 2016 16:08:27 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Perf can add a probe on kernel module which has not been loaded yet.
Current implementation finds module name from path. But if filename
is different from actual module name then perf fails to register
probe while loading module because of mismatch in names. For example,
samples/kobject/kobject-example.ko is loaded as kobject_example.

Ah! right, good catch!
Have some comment below;

diff --git a/tools/perf/util/probe-event.c 
b/tools/perf/util/probe-event.c

index 8319fbb..05d0905 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -265,6 +265,65 @@ static bool kprobe_warn_out_range(const char 
*symbol, unsigned long address)

  return true;
  }
  +/*
+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is 
not
+ * exposed to user-space. Offset of 'name' has remained same from 
long

+ * time, so hardcoding it here.
+ */
+#ifdef __LP64__
+#define MOD_NAME_OFFSET 24
+#else
+#define MOD_NAME_OFFSET 12
+#endif
+
+/*
+ * @module can be module name of module file path. In case of path,
+ * inspect elf and find out what is actual module name.
+ * Caller has to free mod_name after using it.
+ */
+char *find_module_name(const char *module)

Could you make this function static, since there is no caller outside
this file?


Yes, there is no caller outside of this file. But in this patch,
find_module_name is defined outside of #ifdef HAVE_DWARF_SUPPORT while
it is called only from inside #ifdef HAVE_DWARF_SUPPORT.

If I make it static and there is no dwarf support, there will be a
compilation error about a function defined but not used.

And in the second patch ("perf probe: Fix module probe issue if no dwarf
support"), I'm calling it from outside of #ifdef HAVE_DWARF_SUPPORT.

So I have two options:
1. Merge both patches and make the definition static.
2. Make the function static in the second patch.

I've chosen the second approach and sent v2. But please let me know if
there is a better way to do it.



Try __maybe_unused directive?



Thanks Wangnan for the suggestion.

Actually, I tried to use __maybe_unused with the definition of
find_module_name, but it throws the following compilation error:

util/probe-event.c:289:1: error: expected ‘,’ or ‘;’ before ‘{’ token
 {
 ^
util/probe-event.c:288:14: error: ‘find_module_name’ declared ‘static’ but never defined [-Werror=unused-function]

 static char *find_module_name(const char *module) __maybe_unused
  ^
  CC   util/zlib.o
cc1: all warnings being treated as errors


I have to declare a prototype of the function with __maybe_unused before
its definition to resolve this error. Since this is temporary anyway and
would need to be removed in patch 2, I think there is no need to make
this change.
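
For reference, here is a small sketch of the two attribute placements
that GCC accepts, as opposed to the trailing placement on a definition
that produced the error above. The helper names (helper_prefix,
helper_proto) are made up for illustration, and the local
__maybe_unused define is only there so the snippet compiles on its own;
it assumes the macro expands to the GCC 'unused' attribute.

```c
/* Defined here only so the sketch is self-contained. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/* Form 1: attribute placed before the function name in the definition. */
static int __maybe_unused helper_prefix(int x)
{
	return x + 1;
}

/* Form 2: attribute on a separate prototype, then a plain definition. */
static int helper_proto(int x) __maybe_unused;
static int helper_proto(int x)
{
	return x * 2;
}
```

Either form should keep -Werror=unused-function quiet when a static
function has no caller in a given configuration.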

Regards,
Ravi



Re: [RFC] perf probe: Fix offline module name missmatch issue

2016-04-26 Thread Ravi Bangoria

Thanks Masami,

On Tuesday 26 April 2016 07:49 AM, Masami Hiramatsu wrote:

On Tue, 26 Apr 2016 06:24:38 +0900
Masami Hiramatsu  wrote:

+/*
+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is not
+ * exposed to user-space. Offset of 'name' has remained same from long
+ * time, so hardcoding it here.
+ */

BTW, is there no way to get the module name avoiding to access
this "hidden" data structure?
This looks very tricky way...


This is the same approach the kernel uses to find the module name when a
module is loaded. Please refer to this function for more detail:

kernel/module.c ::  static struct module *setup_load_info(...)
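
To show where the hardcoded offsets come from, here is a rough
user-space sketch. It assumes that 'struct module' starts with a state
enum, a list head and then the name array, in that order; the mock types
below (mock_module and friends), the name length of 56 and the helper
name_at_offset() are illustrative only and are not the kernel's
definitions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Mock of the assumed leading members of the kernel's 'struct module'. */
enum mock_module_state { MOCK_STATE_LIVE };
struct mock_list_head { struct mock_list_head *next, *prev; };
struct mock_module {
	enum mock_module_state state;
	struct mock_list_head list;
	char name[56];	/* illustrative length only */
};

/* Offset of 'name' in the mock layout, computed instead of hardcoded. */
static size_t mock_name_offset(void)
{
	return offsetof(struct mock_module, name);
}

/*
 * Copy the NUL-terminated string found at 'off' inside a section image of
 * 'len' bytes. Returns NULL if 'off' is out of range, the string is not
 * terminated inside the buffer, or allocation fails. Caller frees.
 */
static char *name_at_offset(const void *buf, size_t len, size_t off)
{
	const char *start, *end;
	size_t n;
	char *copy;

	if (!buf || off >= len)
		return NULL;
	start = (const char *)buf + off;
	end = memchr(start, '\0', len - off);
	if (!end)
		return NULL;
	n = (size_t)(end - start);
	copy = malloc(n + 1);
	if (copy)
		memcpy(copy, start, n + 1);
	return copy;
}
```

With 8-byte pointers the list head is aligned to offset 8 and the name
lands at 24; with 4-byte pointers it lands at 12. The bounds check is an
addition of this sketch, not something the patch does.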

Regards,
Ravi



[PATCH 2/2] perf probe: Fix offline module name missmatch issue

2016-04-26 Thread Ravi Bangoria
Perf can add a probe on kernel module which has not been loaded yet.
Current implementation finds module name from path. But if filename
is different from actual module name then perf fails to register
probe while loading module because of mismatch in names. For example,
samples/kobject/kobject-example.ko is loaded as kobject_example.
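
As a rough illustration of the mismatch, the path-based derivation can be
sketched as below. The function name name_from_path() is hypothetical;
the logic follows the strrchr()/strchr() approach that this patch
removes.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Derive a module name from a .ko path by dropping the directory part and
 * everything from the first '.' onwards. Caller frees the result.
 */
static char *name_from_path(const char *path)
{
	const char *base = strrchr(path, '/');
	char *name, *dot;

	base = base ? base + 1 : path;
	name = malloc(strlen(base) + 1);
	if (!name)
		return NULL;
	strcpy(name, base);
	dot = strchr(name, '.');
	if (dot)
		*dot = '\0';
	return name;
}
```

For /linux/samples/kobject/kobject-example.ko this yields
"kobject-example" (with a dash), which does not match the
"kobject_example" name reported by lsmod.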

Before applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show   (on foo_show in kobject-example)

You can now use it in all perf tools, such as:

perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject-example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
Module  Size  Used by
kobject_example16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.093 MB perf.data ]

  $./perf report --stdio -F overhead,comm,dso,sym
Error:
The perf.data.old file has no samples!

After applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show   (on foo_show in kobject_example)

You can now use it in all perf tools, such as:

perf record -e probe:foo_show -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject_example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
Module  Size  Used by
kobject_example16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]

  $ sudo ./perf report  --stdio -F overhead,comm,dso,sym
...
# Samples: 8  of event 'probe:foo_show'
# Event count (approx.): 8
#
# Overhead  Command  Shared Object  Symbol
#   ...  .  
#
   100.00%  cat  [kobject_example]  [k] foo_show

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index d58de20..26803e1 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -642,32 +642,23 @@ static int add_module_to_probe_trace_events(struct 
probe_trace_event *tevs,
int ntevs, const char *module)
 {
int i, ret = 0;
-   char *tmp;
+   char *mod_name = NULL;
 
if (!module)
return 0;
 
-   tmp = strrchr(module, '/');
-   if (tmp) {
-   /* This is a module path -- get the module name */
-   module = strdup(tmp + 1);
-   if (!module)
-   return -ENOMEM;
-   tmp = strchr(module, '.');
-   if (tmp)
-   *tmp = '\0';
-   tmp = (char *)module;   /* For free() */
-   }
+   mod_name = find_module_name(module);
 
for (i = 0; i < ntevs; i++) {
-   tevs[i].point.module = strdup(module);
+   tevs[i].point.module =
+   strdup(mod_name ? mod_name : module);
if (!tevs[i].point.module) {
ret = -ENOMEM;
break;
}
}
 
-   free(tmp);
+   free(mod_name);
return ret;
 }
 
-- 
1.9.1



[PATCH 1/2] perf probe: Fix module probe issue if no dwarf support

2016-04-26 Thread Ravi Bangoria
Perf is not able to register probe in kernel module when dwarf support
is not there (and so it goes for symtab). Perf passes full path of
module where only module name is required which is causing the problem.
This patch fixes this issue.
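
For illustration, the shape of the kprobe_events lines shown in this
message can be sketched with a small formatter. The helper
format_kprobe_def() is hypothetical and is not how perf builds the
string internally; it only mirrors the "p:GROUP/EVENT MODULE:SYMBOL"
form seen in the outputs below.

```c
#include <stdio.h>

/*
 * Write "p:GROUP/EVENT MODULE:SYMBOL" into buf.
 * Returns what snprintf() returns: the length that was (or would have
 * been) written, not counting the terminating NUL.
 */
static int format_kprobe_def(char *buf, size_t len, const char *group,
			     const char *event, const char *module,
			     const char *symbol)
{
	return snprintf(buf, len, "p:%s/%s %s:%s",
			group, event, module, symbol);
}
```

Passing the full .ko path as the module argument produces the
"/linux/.../kprobe_example.ko:kprobe_init" form seen before the patch,
while passing the module name produces "kprobe_example:kprobe_init".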

Before applying patch:

  $ dpkg -s libdw-dev
  dpkg-query: package 'libdw-dev' is not installed and no information is...

  $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
  Added new event:
probe:kprobe_init (on kprobe_init in 
/linux/samples/kprobes/kprobe_example.ko)

  You can now use it in all perf tools, such as:

  perf record -e probe:kprobe_init -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/kprobe_init /linux/samples/kprobes/kprobe_example.ko:kprobe_init

  $ sudo ./perf record -a -e probe:kprobe_init
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.105 MB perf.data ]

  $ sudo ./perf script  # No output here 

After applying patch:

  $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
  Added new event:
probe:kprobe_init(on kprobe_init in kprobe_example)

  You can now use it in all perf tools, such as:

  perf record -e probe:kprobe_init -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/kprobe_init kprobe_example:kprobe_init

  $ sudo ./perf record -a -e probe:kprobe_init
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ]

  $ sudo ./perf script
  insmod 13990 [002]  5961.216833: probe:kprobe_init: ...
  insmod 13995 [002]  5962.889384: probe:kprobe_init: ...

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 76 +--
 1 file changed, 73 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8319fbb..d58de20 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -265,6 +265,65 @@ static bool kprobe_warn_out_range(const char *symbol, 
unsigned long address)
return true;
 }
 
+/*
+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is not
+ * exposed to user-space. Offset of 'name' has remained same from long
+ * time, so hardcoding it here.
+ */
+#ifdef __LP64__
+#define MOD_NAME_OFFSET 24
+#else
+#define MOD_NAME_OFFSET 12
+#endif
+
+/*
+ * @module can be module name of module file path. In case of path,
+ * inspect elf and find out what is actual module name.
+ * Caller has to free mod_name after using it.
+ */
+static char *find_module_name(const char *module)
+{
+   int fd;
+   Elf *elf;
+   GElf_Ehdr ehdr;
+   GElf_Shdr shdr;
+   Elf_Data *data;
+   Elf_Scn *sec;
+   char *mod_name = NULL;
+
+   fd = open(module, O_RDONLY);
+   if (fd < 0)
+   return NULL;
+
+   elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+   if (elf == NULL)
+   goto elf_err;
+
+   if (gelf_getehdr(elf, &ehdr) == NULL)
+   goto ret_err;
+
+   sec = elf_section_by_name(elf, &ehdr, &shdr,
+   ".gnu.linkonce.this_module", NULL);
+   if (!sec)
+   goto ret_err;
+
+   data = elf_getdata(sec, NULL);
+   if (!data || !data->d_buf)
+   goto ret_err;
+
+   mod_name = strdup((char *)data->d_buf + MOD_NAME_OFFSET);
+
+ret_err:
+   elf_end(elf);
+elf_err:
+   close(fd);
+   return mod_name;
+}
+
 #ifdef HAVE_DWARF_SUPPORT
 
 static int kernel_get_module_dso(const char *module, struct dso **pdso)
@@ -2516,6 +2575,7 @@ static int find_probe_trace_events_from_map(struct 
perf_probe_event *pev,
struct probe_trace_point *tp;
int num_matched_functions;
int ret, i, j, skipped = 0;
+   char *mod_name;
 
map = get_target_map(pev->target, pev->uprobes);
if (!map) {
@@ -2600,9 +2660,19 @@ static int find_probe_trace_events_from_map(struct 
perf_probe_event *pev,
tp->realname = strdup_or_goto(sym->name, nomem_out);
 
tp->retprobe = pp->retprobe;
-   if (pev->target)
-   tev->point.module = strdup_or_goto(pev->target,
-  nomem_out);
+   if (pev->target) {
+   if (pev->uprobes) {
+   tev->point.module = strdup_or_goto(pev->target,
+  nomem_out);
+   } else {
+   mod_name = find_module_name(pev->target);
+  

Re: [PATCH 1/2] perf probe: Fix module probe issue if no dwarf support

2016-04-26 Thread Ravi Bangoria



On Tuesday 26 April 2016 08:04 PM, Arnaldo Carvalho de Melo wrote:

Em Tue, Apr 26, 2016 at 07:55:40PM +0530, Ravi Bangoria escreveu:

Perf is not able to register probe in kernel module when dwarf support
is not there (and so it goes for symtab). Perf passes full path of
module where only module name is required which is causing the problem.
This patch fixes this issue.

Is this v3? What has changed from v2?

- Arnaldo


Yes Arnaldo. But I changed it from [RFC] to [PATCH], so I didn't include
a version number. Here is the [RFC v2] link: https://lkml.org/lkml/2016/4/26/114

Changes w.r.t. [RFC v2]:
  - Swapped the patches in the series and moved the definition of
    find_module_name into the other patch.


Regards,
Ravi



Re: [RFC] perf probe: Fix offline module name missmatch issue

2016-04-26 Thread Ravi Bangoria



On Tuesday 26 April 2016 04:16 PM, Masami Hiramatsu wrote:

On Tue, 26 Apr 2016 14:26:48 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Thanks Masami for reviewing.

Please find my replies to your comment.

On Tuesday 26 April 2016 02:54 AM, Masami Hiramatsu wrote:

Hi Ravi,

On Mon, 25 Apr 2016 16:08:27 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Perf can add a probe on kernel module which has not been loaded yet.
Current implementation finds module name from path. But if filename
is different from actual module name then perf fails to register
probe while loading module because of mismatch in names. For example,
samples/kobject/kobject-example.ko is loaded as kobject_example.

Ah! right, good catch!
Have some comment below;


diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8319fbb..05d0905 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -265,6 +265,65 @@ static bool kprobe_warn_out_range(const char *symbol, 
unsigned long address)
return true;
   }
   
+/*

+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is not
+ * exposed to user-space. Offset of 'name' has remained same from long
+ * time, so hardcoding it here.
+ */
+#ifdef __LP64__
+#define MOD_NAME_OFFSET 24
+#else
+#define MOD_NAME_OFFSET 12
+#endif
+
+/*
+ * @module can be module name of module file path. In case of path,
+ * inspect elf and find out what is actual module name.
+ * Caller has to free mod_name after using it.
+ */
+char *find_module_name(const char *module)

Could you make this function static, since there is no caller outside
this file?

Yes, there is no caller outside of this file. But in this patch,
find_module_name is defined outside of #ifdef HAVE_DWARF_SUPPORT while
it is called only from inside #ifdef HAVE_DWARF_SUPPORT.

If I make it static and there is no dwarf support, there will be a
compilation error about a function defined but not used.

And in the second patch ("perf probe: Fix module probe issue if no dwarf
support"), I'm calling it from outside of #ifdef HAVE_DWARF_SUPPORT.

So I have two options:
1. Merge both patches and make the definition static.
2. Make the function static in the second patch.

I've chosen the second approach and sent v2. But please let me know if
there is a better way to do it.

Ah, I see.
In that case, you can swap the patch in the series and move find_module_name
in the other patch ;)


Thanks :)  Sent patch with changes. Please review it.

Regards,
Ravi



Re: [RFC 1/4] perf kvm: Enable 'record' on powerpc

2016-04-27 Thread Ravi Bangoria
0ce..ffe5bbb 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -367,4 +367,9 @@ typedef void (*print_binary_t)(enum binary_printer_ops,
 void print_binary(unsigned char *data, size_t len,
   size_t bytes_per_line, print_binary_t printer,
   void *extra);
+
+struct perf_evlist;
+
+int perf_kvm__setup_munge_ppc_guest_event(struct perf_evlist *evlist);
+
 #endif /* GIT_COMPAT_UTIL_H */




On Friday 25 March 2016 02:45 AM, Arnaldo Carvalho de Melo wrote:

Em Tue, Mar 22, 2016 at 11:19:21PM -0300, Arnaldo Carvalho de Melo escreveu:

Em Tue, Mar 22, 2016 at 04:12:11PM -0300, Arnaldo Carvalho de Melo escreveu:

Em Wed, Feb 24, 2016 at 02:37:42PM +0530, Ravi Bangoria escreveu:

'perf kvm record' is not available on powerpc because 'perf' relies on
the 'cycles' event (a PMU event) to profile the guest. However, for
powerpc, this can't be used from the host because the PMUs are controlled
by the guest rather than the host.

There exists a tracepoint 'kvm_hv:kvm_guest_exit' in powerpc which is
hit whenever any of the threads exit the guest context. The guest
instruction pointer dumped along with this tracepoint data in the field
'pc', can be used as guest instruction pointer.

This patch changes default event as kvm_hv:kvm_guest_exit for recording
guest data in host on powerpc. As we are using host event to record guest
data, this approach will enable only --guest option of 'perf kvm'. Still
--host --guest together won't work.

It should, i.e. --host --guest should translate to:

-e cycles:H,kvm_hv:kvm_guest_exit

I.e. both collect cycles only in the host, and also the tracepoint that
will allow us to get the guest approximation for the unavailable cycles
event, no?

I'm putting the infrastructure work needed for this in the perf/cpumode
branch. More work will be put there soon.

So I took a different path and made perf_evsel__parse_sample set a new
perf_sample.cpumode field, this way we'll end up having just to set a
per-evsel ->post_parse_sample() callback for the event that replaces
"cycles" for PPC guests where we'll just set data->ip and data->cpumode,
the rest of the code remains unchanged.

The changes I made look useful in themselves, as, IIRC, more code was
removed than added.

I'll continue tomorrow and will test with the kvm:kvm_exit on x86_64 for
testing, that has:

Ok, so the infrastructure got merged already and from there the next
steps are in running with:

  perf kvm --guest record -a -e cycles:H,kvm:kvm_exit

And then, with the patch below applied, try:

  perf kvm --guestkallsyms kallsyms.guest --guestmodules modules.guest report -i perf.data.guest --munge-ppc-guest-sample kvm:kvm_exit

I'm almost there, it is still not resolving to the kernel DSO, etc, so I
get:

Samples: 1K of event 'kvm:kvm_exit', Event count (approx.): 1924
Overhead  Command  Shared Object Symbol
   34.77%  :5343[unknown] [g] 0x81043158
   16.84%  :5345[unknown] [g] 0x813f3d5a
   16.84%  :5345[unknown] [g] 0x813f43ec
   13.83%  :5345[unknown] [g] 0x81043158
9.62%  :5343[unknown] [g] 0x8104301a
3.79%  :5345[unknown] [g] 0x8104301a
1.77%  :5345[unknown] [u] 0x003ae6c75dc9
0.52%  :5343[unknown] [g] 0x812a29b1
0.16%  :5343[unknown] [g] 0x8100bc00
0.10%  :5343[unknown] [g] 0x8104315a
0.10%  :5343[unknown] [g] 0x8106306f
0.10%  :5343[unknown] [g] 0x8153b7fc
0.10%  :5345[unknown] [g] 0x8106306f
0.05%  :5343[unknown] [g] 0x8100b720

[root@jouet ~]# cat /proc/*/task/5343/comm
qemu-system-x86
[root@jouet ~]#

The patch does several of the things you did per sample, but only right after
opening the perf.data file, and I'll break it down in multiple patches, this is
just a heads up, please review it if you have the time, in the end we should
have a mechanism useful not just for PPC and that affects just 'perf kvm' in
this specific case.

- Arnaldo

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index bff666458b28..b7b6527446f8 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1480,6 +1480,86 @@ perf_stat:
  }
  #endif /* HAVE_KVM_STAT_SUPPORT */

+#ifdef __powerpc__
+#define PPC_HV_DECREMENTER 2432
+#define PPC_HV_BIT 3
+#define PPC_PR_BIT 49
+#define PPC_MAX 63
+
+static bool perf_sample__is_hv_dec_trap(struct perf_sample *sample, struct 
perf_evsel *evsel)
+{
+   int trap = perf_evsel__intval(evsel, sample, "trap");
+   return trap == PPC_HV_DECREMENTER;
+}
+
+static void perf_kvm__munge_ppc_guest_sample(struct perf_evsel *evsel, struct 
perf_sample *sample)
+{
+   unsigned long msr, hv, pr;
+
+   if (!perf_sample__is_hv_dec_trap(sample, evsel))
+   return;
+
+   sample->ip = perf_evsel__intval

[RFC] perf probe: Fix offline module name missmatch issue

2016-04-25 Thread Ravi Bangoria
Perf can add a probe on kernel module which has not been loaded yet.
Current implementation finds module name from path. But if filename
is different from actual module name then perf fails to register
probe while loading module because of mismatch in names. For example,
samples/kobject/kobject-example.ko is loaded as kobject_example.

Before applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show   (on foo_show in kobject-example)

You can now use it in all perf tools, such as:

perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject-example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
Module  Size  Used by
kobject_example16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.093 MB perf.data ]

  $./perf report --stdio -F overhead,comm,dso,sym
Error:
The perf.data.old file has no samples!

After applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show   (on foo_show in kobject_example)

You can now use it in all perf tools, such as:

perf record -e probe:foo_show -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject_example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
Module  Size  Used by
kobject_example16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]

  $ sudo ./perf report  --stdio -F overhead,comm,dso,sym
...
# Samples: 8  of event 'probe:foo_show'
# Event count (approx.): 8
#
# Overhead  Command  Shared Object  Symbol
#   ...  .  
#
   100.00%  cat  [kobject_example]  [k] foo_show

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 78 +++
 tools/perf/util/probe-event.h |  2 ++
 2 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8319fbb..05d0905 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -265,6 +265,65 @@ static bool kprobe_warn_out_range(const char *symbol, unsigned long address)
 	return true;
 }
 
+/*
+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is not
+ * exposed to user-space. Offset of 'name' has remained same from long
+ * time, so hardcoding it here.
+ */
+#ifdef __LP64__
+#define MOD_NAME_OFFSET 24
+#else
+#define MOD_NAME_OFFSET 12
+#endif
+
+/*
+ * @module can be module name or module file path. In case of path,
+ * inspect elf and find out what is actual module name.
+ * Caller has to free mod_name after using it.
+ */
+char *find_module_name(const char *module)
+{
+	int fd;
+	Elf *elf;
+	GElf_Ehdr ehdr;
+	GElf_Shdr shdr;
+	Elf_Data *data;
+	Elf_Scn *sec;
+	char *mod_name = NULL;
+
+	fd = open(module, O_RDONLY);
+	if (fd < 0)
+		return NULL;
+
+	elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+	if (elf == NULL)
+		goto elf_err;
+
+	if (gelf_getehdr(elf, &ehdr) == NULL)
+		goto ret_err;
+
+	sec = elf_section_by_name(elf, &ehdr, &shdr,
+			".gnu.linkonce.this_module", NULL);
+	if (!sec)
+		goto ret_err;
+
+	data = elf_getdata(sec, NULL);
+	if (!data || !data->d_buf)
+		goto ret_err;
+
+	mod_name = strdup((char *)data->d_buf + MOD_NAME_OFFSET);
+
+ret_err:
+	elf_end(elf);
+elf_err:
+	close(fd);
+	return mod_name;
+}
+
 #ifdef HAVE_DWARF_SUPPORT
 
 static int kernel_get_module_dso(const char *module, struct dso **pdso)
@@ -583,32 +642,23 @@ static int add_module_to_probe_trace_events(struct probe_trace_event *tevs,
 					     int ntevs, const char *module)
 {
 	int i, ret = 0;
-	char *tmp;
+	char *mod_name;
 
 	if (!module)
 		return 0;
 
-	tmp = strrchr(module, '/');
-	if (tmp) {
-		/* This is a module path -- get the module name */
-		module = strdup(tmp + 1);
- 

[RFC] perf probe: Fix module probe issue if no dwarf support

2016-04-25 Thread Ravi Bangoria
perf is not able to register a probe in a kernel module when DWARF support
is not available (and so it falls back to symtab). perf passes the full path
of the module where only the module name is required, which causes the problem.
This patch fixes this issue.
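
For reference, the module part of a kprobe definition is a bare module name in front of the symbol, as in the kprobe_events lines quoted in this message. A minimal sketch of building such a line, assuming the documented "p:GROUP/EVENT [MOD:]SYMBOL" form; format_kprobe_def() is a hypothetical helper for illustration and is not part of perf:

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from perf): format a kprobe definition in the
 * "p:GROUP/EVENT [MOD:]SYMBOL" form used by
 * /sys/kernel/debug/tracing/kprobe_events.
 * 'module' may be NULL for a probe on the core kernel.
 * Returns the snprintf() result, i.e. the length that would be written.
 */
static int format_kprobe_def(char *buf, size_t size, const char *group,
			     const char *event, const char *module,
			     const char *symbol)
{
	if (module)
		return snprintf(buf, size, "p:%s/%s %s:%s",
				group, event, module, symbol);
	return snprintf(buf, size, "p:%s/%s %s", group, event, symbol);
}
```

With module "kobject_example" this produces the definition seen after the fix; passing the .ko path instead produces the kind of definition seen before it.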

Before applying patch:

  $ dpkg -s libdw-dev
dpkg-query: package 'libdw-dev' is not installed ...

  $ ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show (on foo_show in /linux/samples/kobject/kobject-example.ko)

You can now use it in all perf tools, such as:

  perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show /linux/samples/kobject/kobject-example.ko:foo_show

After applying patch:

  $ ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show (on foo_show in kobject_example)

You can now use it in all perf tools, such as:

  perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject_example:foo_show

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 05d0905..54e6a5a 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2566,6 +2566,7 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 	struct probe_trace_point *tp;
 	int num_matched_functions;
 	int ret, i, j, skipped = 0;
+	char *mod_name;
 
 	map = get_target_map(pev->target, pev->uprobes);
 	if (!map) {
@@ -2650,9 +2651,19 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 		tp->realname = strdup_or_goto(sym->name, nomem_out);
 
 		tp->retprobe = pp->retprobe;
-		if (pev->target)
-			tev->point.module = strdup_or_goto(pev->target,
-							   nomem_out);
+		if (pev->target) {
+			if (pev->uprobes) {
+				tev->point.module = strdup_or_goto(pev->target,
+								   nomem_out);
+			} else {
+				mod_name = find_module_name(pev->target);
+				tev->point.module =
+					strdup(mod_name ? mod_name : pev->target);
+				free(mod_name);
+				if (!tev->point.module)
+					goto nomem_out;
+			}
+		}
 		tev->uprobes = pev->uprobes;
 		tev->nargs = pev->nargs;
 		if (tev->nargs) {
-- 
2.1.4



[RFC v2 1/2] perf probe: Fix offline module name missmatch issue

2016-04-26 Thread Ravi Bangoria
Perf can add a probe on kernel module which has not been loaded yet.
Current implementation finds module name from path. But if filename
is different from actual module name then perf fails to register
probe while loading module because of mismatch in names. For example,
samples/kobject/kobject-example.ko is loaded as kobject_example.
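
The name difference comes from the build: kbuild replaces '-' with '_' when it derives the module name from the object file name. The sketch below only illustrates that mapping; guess_module_name() is a hypothetical helper, and it is a guess by construction, which is why the patch reads the name from the module's ELF data rather than from the path:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from perf): guess the in-kernel module name
 * from a module file path by taking the basename, dropping a trailing
 * ".ko" and mapping '-' to '_'.
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * allocation fails.
 */
static char *guess_module_name(const char *path)
{
	const char *base = strrchr(path, '/');
	char *name, *p;
	size_t len;

	base = base ? base + 1 : path;
	len = strlen(base);
	if (len > 3 && strcmp(base + len - 3, ".ko") == 0)
		len -= 3;

	name = malloc(len + 1);
	if (!name)
		return NULL;
	memcpy(name, base, len);
	name[len] = '\0';

	/* kbuild-style normalization: hyphens become underscores */
	for (p = name; *p; p++)
		if (*p == '-')
			*p = '_';
	return name;
}
```

For /linux/samples/kobject/kobject-example.ko this gives kobject_example, the name lsmod reports in the example below.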

Before applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show   (on foo_show in kobject-example)

You can now use it in all perf tools, such as:

perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject-example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
Module                  Size  Used by
kobject_example        16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.093 MB perf.data ]

  $ ./perf report --stdio -F overhead,comm,dso,sym
Error:
The perf.data.old file has no samples!

After applying patch:

  $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show   (on foo_show in kobject_example)

You can now use it in all perf tools, such as:

perf record -e probe:foo_show -aR sleep 1

  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject_example:foo_show

  $ insmod kobject-example.ko

  $ lsmod
Module                  Size  Used by
kobject_example        16384  0

  Generate read to /sys/kernel/kobject_example/foo while recording data
  with below command
  $ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]

  $ sudo ./perf report  --stdio -F overhead,comm,dso,sym
...
# Samples: 8  of event 'probe:foo_show'
# Event count (approx.): 8
#
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ............
#
   100.00%  cat      [kobject_example]  [k] foo_show

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 78 +++
 tools/perf/util/probe-event.h |  2 ++
 2 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8319fbb..5f1a9bf 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -265,6 +265,65 @@ static bool kprobe_warn_out_range(const char *symbol, unsigned long address)
 	return true;
 }
 
+/*
+ * NOTE:
+ * '.gnu.linkonce.this_module' section of kernel module elf directly
+ * maps to 'struct module' from linux/module.h. This section contains
+ * actual module name which will be used by kernel after loading it.
+ * But, we cannot use 'struct module' here since linux/module.h is not
+ * exposed to user-space. Offset of 'name' has remained same from long
+ * time, so hardcoding it here.
+ */
+#ifdef __LP64__
+#define MOD_NAME_OFFSET 24
+#else
+#define MOD_NAME_OFFSET 12
+#endif
+
+/*
+ * @module can be module name or module file path. In case of path,
+ * inspect elf and find out what is actual module name.
+ * Caller has to free mod_name after using it.
+ */
+char *find_module_name(const char *module)
+{
+	int fd;
+	Elf *elf;
+	GElf_Ehdr ehdr;
+	GElf_Shdr shdr;
+	Elf_Data *data;
+	Elf_Scn *sec;
+	char *mod_name = NULL;
+
+	fd = open(module, O_RDONLY);
+	if (fd < 0)
+		return NULL;
+
+	elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+	if (elf == NULL)
+		goto elf_err;
+
+	if (gelf_getehdr(elf, &ehdr) == NULL)
+		goto ret_err;
+
+	sec = elf_section_by_name(elf, &ehdr, &shdr,
+			".gnu.linkonce.this_module", NULL);
+	if (!sec)
+		goto ret_err;
+
+	data = elf_getdata(sec, NULL);
+	if (!data || !data->d_buf)
+		goto ret_err;
+
+	mod_name = strdup((char *)data->d_buf + MOD_NAME_OFFSET);
+
+ret_err:
+	elf_end(elf);
+elf_err:
+	close(fd);
+	return mod_name;
+}
+
 #ifdef HAVE_DWARF_SUPPORT
 
 static int kernel_get_module_dso(const char *module, struct dso **pdso)
@@ -583,32 +642,23 @@ static int add_module_to_probe_trace_events(struct probe_trace_event *tevs,
 					     int ntevs, const char *module)
 {
 	int i, ret = 0;
-	char *tmp;
+	char *mod_name;
 
 	if (!module)
 		return 0;
 
-	tmp = strrchr(module, '/');
-	if (tmp) {
-		/* This is a module path -- get the module name */
-		module = strdup(tmp + 1);
- 

[RFC v2 2/2] perf probe: Fix module probe issue if no dwarf support

2016-04-26 Thread Ravi Bangoria
perf is not able to register a probe in a kernel module when DWARF support
is not available (and so it falls back to symtab). perf passes the full path
of the module where only the module name is required, which causes the problem.
This patch fixes the issue.

Before applying patch:

  $ dpkg -s libdw-dev
dpkg-query: package 'libdw-dev' is not installed ...

  $ ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show (on foo_show in /linux/samples/kobject/kobject-example.ko)

You can now use it in all perf tools, such as:

  perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show /linux/samples/kobject/kobject-example.ko:foo_show

After applying patch:

  $ ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
  probe:foo_show (on foo_show in kobject_example)

You can now use it in all perf tools, such as:

  perf record -e probe:foo_show -aR sleep 1

  $ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject_example:foo_show

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - made find_module_name static

 tools/perf/util/probe-event.c | 19 +++
 tools/perf/util/probe-event.h |  2 --
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 5f1a9bf..c570a6c 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -285,7 +285,7 @@ static bool kprobe_warn_out_range(const char *symbol, unsigned long address)
  * inspect elf and find out what is actual module name.
  * Caller has to free mod_name after using it.
  */
-char *find_module_name(const char *module)
+static char *find_module_name(const char *module)
 {
 	int fd;
 	Elf *elf;
@@ -2566,6 +2566,7 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 	struct probe_trace_point *tp;
 	int num_matched_functions;
 	int ret, i, j, skipped = 0;
+	char *mod_name;
 
 	map = get_target_map(pev->target, pev->uprobes);
 	if (!map) {
@@ -2650,9 +2651,19 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 		tp->realname = strdup_or_goto(sym->name, nomem_out);
 
 		tp->retprobe = pp->retprobe;
-		if (pev->target)
-			tev->point.module = strdup_or_goto(pev->target,
-							   nomem_out);
+		if (pev->target) {
+			if (pev->uprobes) {
+				tev->point.module = strdup_or_goto(pev->target,
+								   nomem_out);
+			} else {
+				mod_name = find_module_name(pev->target);
+				tev->point.module =
+					strdup(mod_name ? mod_name : pev->target);
+				free(mod_name);
+				if (!tev->point.module)
+					goto nomem_out;
+			}
+		}
 		tev->uprobes = pev->uprobes;
 		tev->nargs = pev->nargs;
 		if (tev->nargs) {
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 0468fa3..e54e7b0 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -166,6 +166,4 @@ int e_snprintf(char *str, size_t size, const char *format, ...)
 int copy_to_probe_trace_arg(struct probe_trace_arg *tvar,
 			    struct perf_probe_arg *pvar);
 
-char *find_module_name(const char *module);
-
 #endif /*_PROBE_EVENT_H */
-- 
2.1.4



Re: [RFC 1/4] perf kvm: Enable 'record' on powerpc

2016-05-09 Thread Ravi Bangoria

Hi Arnaldo,

Thanks for the review.  Please find my comments below.

On Thursday 28 April 2016 03:17 AM, Arnaldo Carvalho de Melo wrote:

Em Wed, Apr 27, 2016 at 06:02:21PM +0530, Ravi Bangoria escreveu:

Hi Arnaldo,

I've worked on your patch. I'm sending this patch (diff) to check whether
this is the idea you want to proceed with. I cleaned up your patch,
removed the arch-specific compile-time directives and changed the code to
enable cross-arch reporting. I tested record on powerpc and report
on x86, and it works.

Please share your suggestions on this approach. Let me know if you have
some other idea to proceed with.

Here is the diff w.r.t perf/cpumode branch:

diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index bff6664..83ef6c6 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1480,6 +1480,60 @@ perf_stat:
  }
  #endif /* HAVE_KVM_STAT_SUPPORT */

+#define PPC_HV_DECREMENTER 2432
+#define PPC_HV_BIT 3
+#define PPC_PR_BIT 49
+#define PPC_MAX 63
+
+static bool perf_sample__is_hv_dec_trap(struct perf_sample *sample,
+					struct perf_evsel *evsel)
+{
+	int trap = perf_evsel__intval(evsel, sample, "trap");
+	return trap == PPC_HV_DECREMENTER;
+}
+
+static void perf_kvm__munge_ppc_guest_sample(struct perf_evsel *evsel,
+					     struct perf_sample *sample)
+{
+	unsigned long msr, hv, pr;
+
+	if (!perf_sample__is_hv_dec_trap(sample, evsel))
+		return;
+
+	sample->ip = perf_evsel__intval(evsel, sample, "pc");
+	sample->cpumode = PERF_RECORD_MISC_GUEST_KERNEL;
+
+	msr = perf_evsel__intval(evsel, sample, "msr");
+	hv = msr & ((unsigned long)1 << (PPC_MAX - PPC_HV_BIT));
+	pr = msr & ((unsigned long)1 << (PPC_MAX - PPC_PR_BIT));
+	if (!hv && pr)
+		sample->cpumode = PERF_RECORD_MISC_GUEST_USER;
+}
+
+static bool perf_evlist__recorded_on_ppc(const struct perf_evlist *evlist)
+{
+	if (evlist->env && evlist->env->arch) {
+		return !strcmp(evlist->env->arch, "ppc64") ||
+		       !strcmp(evlist->env->arch, "ppc64le");
+	}
+	return false;
+}
+
+int perf_kvm__setup_munge_ppc_guest_event(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	const char name[] = "kvm_hv:kvm_guest_exit";
+
+	if (!perf_evlist__recorded_on_ppc(evlist))
+		return 0;
+
+	evsel = perf_evlist__find_tracepoint_by_name(evlist, name);
+	if (evsel == NULL)
+		return -1;
+
+	evsel->munge_sample = perf_kvm__munge_ppc_guest_sample;
+
+	return 0;
+}
+
  static int __cmd_record(const char *file_name, int argc, const char **argv)
  {
  int rec_argc, i = 0, j;
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ab47273..7cb41f7 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -879,6 +879,12 @@ repeat:
  if (session == NULL)
  return -1;

+	if (perf_guest &&
+	    perf_kvm__setup_munge_ppc_guest_event(session->evlist)) {
+		pr_err("PPC event not present in %s file\n", input_name);
+		goto error;
+	}

This looks out of place, i.e. this reads: "For all cases where there is
a guest and we can't set up the ppc KVM guest related stuff, it's an
error"

I think this gets clearer as:

	if (perf_guest && perf_evlist__recorded_on_ppc(evlist) &&
	    perf_kvm__setup_munge_ppc_guest_event(session->evlist)) {
		pr_err("PPC event not present in %s file\n", input_name);
		goto error;
	}

Then we read this as "Hey, if this was recorded on ppc, try to set
things up for ppc",


Yes I'll change this.


  but then again, what is this KVM stuff doing in the
generic 'perf report' code?


Basically we are checking whether the data in the perf.data file was
recorded on ppc. This can be done only after opening the file and mapping
the header info into the evlist, and the evlist is initialized only in
builtin-record.c. So I don't see any possibility of moving this into
builtin-kvm.c. Kindly guide me on how we can do it.


What if this is a perf.data file generated on PPC but being read on PPC?
This will not make sense to munge it, right?


If you are asking about normal (non-KVM) perf record and report, it
works with this patch. Otherwise, can you please explain a little more?

But yes, we can change this code like this:

	if (perf_guest && perf_evlist__recorded_on_ppc(session->evlist))
		perf_kvm__setup_munge_ppc_guest_event(session->evlist);

and change definition of perf_kvm__setup_munge_ppc_guest_event as:

void perf_kvm__setup_munge_ppc_guest_event(struct perf_evlist *evlist)
{
	struct perf_evsel *evsel;
	const char name[] = "kvm_hv:kvm_guest_exit";

	evsel = perf_evlist__find_tracepoint_by_name(evlist, name);
	if (evsel == NULL)
		return;

	evsel->munge_sample = perf_kvm__munge_ppc_guest_sample;
}
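
The MSR test in perf_kvm__munge_ppc_guest_sample() relies on IBM bit numbering, where bit 0 is the most significant bit of the 64-bit register, so bit n is selected with 1 << (63 - n). A small standalone sketch of that check, reusing the PPC_* values from the diff; the helper names are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define PPC_HV_BIT	3	/* IBM numbering: bit 0 is the MSB */
#define PPC_PR_BIT	49
#define PPC_MAX		63

/* Hypothetical helper: test one MSR bit given in IBM (MSB = 0) numbering. */
static bool msr_bit_set(uint64_t msr, unsigned int ibm_bit)
{
	return (msr >> (PPC_MAX - ibm_bit)) & 1;
}

/* Same condition as in the diff: guest user mode means HV clear, PR set. */
static bool msr_is_guest_user(uint64_t msr)
{
	return !msr_bit_set(msr, PPC_HV_BIT) && msr_bit_set(msr, PPC_PR_BIT);
}
```

In conventional LSB-0 terms, HV is bit 60 and PR is bit 14, so an MSR of 1 << 14 is classified as guest user while (1 << 60) | (1 << 14) is not.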



This is with what I remember from this case, please bear with me.


Regards,
Ravi



[RFC] perf uprobe: Skip prologue if program compiled without optimization

2016-07-28 Thread Ravi Bangoria
Function prologue prepares stack and registers before executing function
logic. When target program is compiled without optimization, function
parameter information is only valid after prologue. When we probe entrypc
of the function, and try to record function parameter, it contains
garbage value.

For example,
  $ vim test.c
#include <stdio.h>

void foo(int i)
{
	printf("i: %d\n", i);
}

int main()
{
	foo(42);
	return 0;
}

  $ gcc -g test.c -o test
  $ objdump -dl test | less
foo():
/home/ravi/test.c:4
  400536:   55                      push   %rbp
  400537:   48 89 e5                mov    %rsp,%rbp
  40053a:   48 83 ec 10             sub    $0x10,%rsp
  40053e:   89 7d fc                mov    %edi,-0x4(%rbp)
/home/ravi/test.c:5
  400541:   8b 45 fc                mov    -0x4(%rbp),%eax
...
...
main():
/home/ravi/test.c:9
  400558:   55                      push   %rbp
  400559:   48 89 e5                mov    %rsp,%rbp
/home/ravi/test.c:10
  40055c:   bf 2a 00 00 00          mov    $0x2a,%edi
  400561:   e8 d0 ff ff ff          callq  400536 <foo>
/home/ravi/test.c:11

  $ ./perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
 p:probe_test/foo /home/ravi/test:0x0536 i=-12(%sp):s32

  $ ./perf record -e probe_test:foo ./test
  $ ./perf script
 test  5778 [001]  4918.562027: probe_test:foo: (400536) i=0

Here the debug info locates variable 'i' on the stack, but it is only
stored there at 0x40053e, whereas we are probing at 0x400536.

To resolve this issue, we need to probe on the first instruction after the
prologue. gdb and systemtap do the same thing. I've implemented
this patch based on approach systemtap has used.
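
The idea of skipping the prologue can be sketched on a simplified line table: starting from the row for the function entry, take the first later row inside the function whose source line differs. This is only an illustration under that assumption; struct line_row and find_postprologue_addr() are hypothetical and do not reproduce the heuristics used by systemtap or by this patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified row of a DWARF line table. */
struct line_row {
	uint64_t addr;
	int lineno;
};

/*
 * Find the first address after 'entrypc' whose source line differs from
 * the line of 'entrypc' and that still lies below 'highpc'.
 * Rows are assumed to be sorted by address.
 */
static bool find_postprologue_addr(const struct line_row *rows, size_t nr,
				   uint64_t entrypc, uint64_t highpc,
				   uint64_t *out)
{
	size_t i, entry = nr;

	/* Locate the row that describes the function entry. */
	for (i = 0; i < nr; i++) {
		if (rows[i].addr == entrypc) {
			entry = i;
			break;
		}
	}
	if (entry == nr)
		return false;

	/* The first row with a different line number ends the prologue. */
	for (i = entry + 1; i < nr && rows[i].addr < highpc; i++) {
		if (rows[i].lineno != rows[entry].lineno) {
			*out = rows[i].addr;
			return true;
		}
	}
	return false;
}
```

With rows taken from the objdump listing above, (0x400536, line 4) and (0x400541, line 5), the function entry 0x400536 maps to 0x400541, which matches the probe address after the patch.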

After applying patch:

  $ ./perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
p:probe_test/foo /home/ravi/test:0x0541 i=-4(%bp):s32

  $ ./perf record -e probe_test:foo ./test
  $ ./perf script
test  6300 [001]  5877.879327: probe_test:foo: (400541) i=42

There is no need to skip the prologue in the optimized case, since the debug
info is correct for each instruction with -O2 -g. For more details please visit:
https://bugzilla.redhat.com/show_bug.cgi?id=612253#c6

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-finder.c | 156 +
 1 file changed, 156 insertions(+)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f2d9ff0..a788b9c2 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -892,6 +892,161 @@ static int find_probe_point_lazy(Dwarf_Die *sp_die, struct probe_finder *pf)
 	return die_walk_lines(sp_die, probe_point_lazy_walker, pf);
 }
 
+static bool var_has_loclist(Dwarf_Die *die)
+{
+	Dwarf_Attribute loc;
+	int tag = dwarf_tag(die);
+
+	if (tag != DW_TAG_formal_parameter &&
+	    tag != DW_TAG_variable)
+		return false;
+
+	return (dwarf_attr_integrate(die, DW_AT_location, &loc) &&
+		dwarf_whatform(&loc) == DW_FORM_sec_offset);
+}
+
+/*
+ * For any object in given CU whose DW_AT_location is a location list,
+ * target program is compiled with optimization.
+ */
+static bool optimized_target(Dwarf_Die *die)
+{
+	Dwarf_Die tmp_die;
+
+	if (var_has_loclist(die))
+		return true;
+
+	if (!dwarf_child(die, &tmp_die) && optimized_target(&tmp_die))
+		return true;
+
+	if (!dwarf_siblingof(die, &tmp_die) && optimized_target(&tmp_die))
+		return true;
+
+	return false;
+}
+
+static bool get_entrypc_idx(Dwarf_Lines *lines, unsigned long nr_lines,
+			    Dwarf_Addr pf_addr, unsigned long *entrypc_idx)
+{
+	unsigned long i;
+	Dwarf_Addr addr;
+
+	for (i = 0; i < nr_lines; i++) {
+		if (dwarf_lineaddr(dwarf_onesrcline(lines, i), &addr))
+			return false;
+
+		if (addr == pf_addr) {
+			*entrypc_idx = i;
+			return true;
+		}
+	}
+	return false;
+}
+
+static bool get_postprologue_addr(unsigned long entrypc_idx,
+				  Dwarf_Lines *lines,
+				  unsigned long nr_lines,
+				  Dwarf_Addr highpc,
+				  Dwarf_Addr *postprologue_addr)
+{
+	unsigned long i;
+	int entrypc_lno, lno;
+	Dwarf_Line *line;
+	Dwarf_Addr addr;
+	bool p_end;
+
+	/* entrypc_lno is actual source line number */
+	line = dwarf_onesrcline(lines, entrypc_idx);
+	if (dwarf_lineno(line, &entrypc_lno))
+		return false;
+
+	for (i = entrypc_idx; i < nr_lines; i++) {
+   line = dwarf_onesrclin

[PATCH] perf uprobe: Skip prologue if program compiled without optimization

2016-08-01 Thread Ravi Bangoria
Function prologue prepares stack and registers before executing function
logic. When target program is compiled without optimization, function
parameter information is only valid after prologue. When we probe entrypc
of the function, and try to record function parameter, it contains
garbage value.

For example,
  $ vim test.c
#include <stdio.h>

void foo(int i)
{
	printf("i: %d\n", i);
}

int main()
{
	foo(42);
	return 0;
}

  $ gcc -g test.c -o test
  $ objdump -dl test | less
foo():
/home/ravi/test.c:4
  400536:   55                      push   %rbp
  400537:   48 89 e5                mov    %rsp,%rbp
  40053a:   48 83 ec 10             sub    $0x10,%rsp
  40053e:   89 7d fc                mov    %edi,-0x4(%rbp)
/home/ravi/test.c:5
  400541:   8b 45 fc                mov    -0x4(%rbp),%eax
...
...
main():
/home/ravi/test.c:9
  400558:   55                      push   %rbp
  400559:   48 89 e5                mov    %rsp,%rbp
/home/ravi/test.c:10
  40055c:   bf 2a 00 00 00          mov    $0x2a,%edi
  400561:   e8 d0 ff ff ff          callq  400536 <foo>
/home/ravi/test.c:11

  $ ./perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
 p:probe_test/foo /home/ravi/test:0x0536 i=-12(%sp):s32

  $ ./perf record -e probe_test:foo ./test
  $ ./perf script
 test  5778 [001]  4918.562027: probe_test:foo: (400536) i=0

Here the debug info locates variable 'i' on the stack, but it is only
stored there at 0x40053e, whereas we are probing at 0x400536.

To resolve this issue, we need to probe on the first instruction after the
prologue. gdb and systemtap do the same thing. I've implemented
this patch based on the approach systemtap uses.

After applying patch:

  $ ./perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
p:probe_test/foo /home/ravi/test:0x0541 i=-4(%bp):s32

  $ ./perf record -e probe_test:foo ./test
  $ ./perf script
test  6300 [001]  5877.879327: probe_test:foo: (400541) i=42

There is no need to skip the prologue in the optimized case, since the debug
info is correct for each instruction with -O2 -g. For more details please visit:
https://bugzilla.redhat.com/show_bug.cgi?id=612253#c6

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes wrt RFC:
  - Skip prologue only when function parameter is specified
  - Notify user about skipping prologue

 tools/perf/util/probe-finder.c | 164 +
 1 file changed, 164 insertions(+)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f2d9ff0..8efa7f2 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -892,6 +892,169 @@ static int find_probe_point_lazy(Dwarf_Die *sp_die, struct probe_finder *pf)
 	return die_walk_lines(sp_die, probe_point_lazy_walker, pf);
 }
 
+static bool var_has_loclist(Dwarf_Die *die)
+{
+	Dwarf_Attribute loc;
+	int tag = dwarf_tag(die);
+
+	if (tag != DW_TAG_formal_parameter &&
+	    tag != DW_TAG_variable)
+		return false;
+
+	return (dwarf_attr_integrate(die, DW_AT_location, &loc) &&
+		dwarf_whatform(&loc) == DW_FORM_sec_offset);
+}
+
+/*
+ * For any object in given CU whose DW_AT_location is a location list,
+ * target program is compiled with optimization.
+ */
+static bool optimized_target(Dwarf_Die *die)
+{
+	Dwarf_Die tmp_die;
+
+	if (var_has_loclist(die))
+		return true;
+
+	if (!dwarf_child(die, &tmp_die) && optimized_target(&tmp_die))
+		return true;
+
+	if (!dwarf_siblingof(die, &tmp_die) && optimized_target(&tmp_die))
+		return true;
+
+	return false;
+}
+
+static bool get_entrypc_idx(Dwarf_Lines *lines, unsigned long nr_lines,
+			    Dwarf_Addr pf_addr, unsigned long *entrypc_idx)
+{
+	unsigned long i;
+	Dwarf_Addr addr;
+
+	for (i = 0; i < nr_lines; i++) {
+		if (dwarf_lineaddr(dwarf_onesrcline(lines, i), &addr))
+			return false;
+
+		if (addr == pf_addr) {
+			*entrypc_idx = i;
+			return true;
+		}
+	}
+	return false;
+}
+
+static bool get_postprologue_addr(unsigned long entrypc_idx,
+				  Dwarf_Lines *lines,
+				  unsigned long nr_lines,
+				  Dwarf_Addr highpc,
+				  Dwarf_Addr *postprologue_addr)
+{
+	unsigned long i;
+	int entrypc_lno, lno;
+	Dwarf_Line *line;
+	Dwarf_Addr addr;
+	bool p_end;
+
+	/* entrypc_lno is actual source line number */
+	line = dwarf_onesrcline(lines, entrypc_idx);
+	if (dwarf_lineno(line, &entrypc_lno))
+   return fal

Re: [RFC] perf uprobe: Skip prologue if program compiled without optimization

2016-08-01 Thread Ravi Bangoria


On Saturday 30 July 2016 08:34 AM, Masami Hiramatsu wrote:

On Thu, 28 Jul 2016 20:01:51 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Function prologue prepares stack and registers before executing function
logic. When target program is compiled without optimization, function
parameter information is only valid after prologue. When we probe entrypc
of the function, and try to record function parameter, it contains
garbage value.

Right! :)


Thanks Masami for review.

I've sent patch with changes you suggested. Please review it.

-Ravi



[PATCH v2 2/2] perf uprobe: Skip prologue if program compiled without optimization

2016-08-03 Thread Ravi Bangoria
Function prologue prepares stack and registers before executing function
logic. When target program is compiled without optimization, function
parameter information is only valid after prologue. When we probe entrypc
of the function, and try to record function parameter, it contains
garbage value.

For example,
  $ vim test.c
#include <stdio.h>

void foo(int i)
{
	printf("i: %d\n", i);
}

int main()
{
	foo(42);
	return 0;
}

  $ gcc -g test.c -o test
  $ objdump -dl test | less
foo():
/home/ravi/test.c:4
  400536:   55                      push   %rbp
  400537:   48 89 e5                mov    %rsp,%rbp
  40053a:   48 83 ec 10             sub    $0x10,%rsp
  40053e:   89 7d fc                mov    %edi,-0x4(%rbp)
/home/ravi/test.c:5
  400541:   8b 45 fc                mov    -0x4(%rbp),%eax
...
...
main():
/home/ravi/test.c:9
  400558:   55                      push   %rbp
  400559:   48 89 e5                mov    %rsp,%rbp
/home/ravi/test.c:10
  40055c:   bf 2a 00 00 00          mov    $0x2a,%edi
  400561:   e8 d0 ff ff ff          callq  400536 <foo>
/home/ravi/test.c:11

  $ ./perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
 p:probe_test/foo /home/ravi/test:0x0536 i=-12(%sp):s32

  $ ./perf record -e probe_test:foo ./test
  $ ./perf script
 test  5778 [001]  4918.562027: probe_test:foo: (400536) i=0

Here the debug info locates variable 'i' on the stack, but it is only
stored there at 0x40053e, whereas we are probing at 0x400536.

To resolve this issue, we need to probe on the first instruction after the
prologue. gdb and systemtap do the same thing. I've implemented
this patch based on the approach systemtap uses.

After applying patch:

  $ ./perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
p:probe_test/foo /home/ravi/test:0x0541 i=-4(%bp):s32

  $ ./perf record -e probe_test:foo ./test
  $ ./perf script
test  6300 [001]  5877.879327: probe_test:foo: (400541) i=42

There is no need to skip the prologue in the optimized case, since the debug
info is correct for each instruction with -O2 -g. For more details please visit:
https://bugzilla.redhat.com/show_bug.cgi?id=612253#c6

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - Skipping prologue only when any ARG is either C variable, $params
or $vars.
  - Probe on line(:1) may not be always possible. Recommend only address
to force probe on function entry.

 tools/perf/util/probe-finder.c | 164 +
 1 file changed, 164 insertions(+)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f2d9ff0..b2bc77c 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -892,6 +892,169 @@ static int find_probe_point_lazy(Dwarf_Die *sp_die, struct probe_finder *pf)
 	return die_walk_lines(sp_die, probe_point_lazy_walker, pf);
 }
 
+static bool var_has_loclist(Dwarf_Die *die)
+{
+	Dwarf_Attribute loc;
+	int tag = dwarf_tag(die);
+
+	if (tag != DW_TAG_formal_parameter &&
+	    tag != DW_TAG_variable)
+		return false;
+
+	return (dwarf_attr_integrate(die, DW_AT_location, &loc) &&
+		dwarf_whatform(&loc) == DW_FORM_sec_offset);
+}
+
+/*
+ * For any object in given CU whose DW_AT_location is a location list,
+ * target program is compiled with optimization.
+ */
+static bool optimized_target(Dwarf_Die *die)
+{
+	Dwarf_Die tmp_die;
+
+	if (var_has_loclist(die))
+		return true;
+
+	if (!dwarf_child(die, &tmp_die) && optimized_target(&tmp_die))
+		return true;
+
+	if (!dwarf_siblingof(die, &tmp_die) && optimized_target(&tmp_die))
+		return true;
+
+	return false;
+}
+
+static bool get_entrypc_idx(Dwarf_Lines *lines, unsigned long nr_lines,
+			    Dwarf_Addr pf_addr, unsigned long *entrypc_idx)
+{
+	unsigned long i;
+	Dwarf_Addr addr;
+
+	for (i = 0; i < nr_lines; i++) {
+		if (dwarf_lineaddr(dwarf_onesrcline(lines, i), &addr))
+			return false;
+
+		if (addr == pf_addr) {
+			*entrypc_idx = i;
+			return true;
+		}
+	}
+	return false;
+}
+
+static bool get_postprologue_addr(unsigned long entrypc_idx,
+				  Dwarf_Lines *lines,
+				  unsigned long nr_lines,
+				  Dwarf_Addr highpc,
+				  Dwarf_Addr *postprologue_addr)
+{
+	unsigned long i;
+	int entrypc_lno, lno;
+	Dwarf_Line *line;
+	Dwarf_Addr addr;
+	bool p_end;
+
+	/* entrypc_lno is actual source line number */
+   line = dwarf_onesrcline

[PATCH 1/2] perf probe: Helper function to check if probe with variable

2016-08-03 Thread Ravi Bangoria
Introduce helper function instead of inline code and replace hardcoded
strings "$vars" and "$params" with their corresponding macros.

perf_probe_with_var is not declared as static since it will be called
from different file in subsequent patch.
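
As a rough standalone illustration of the helper's logic, assuming is_c_varname() accepts plain C identifiers and that PROBE_ARG_PARAMS and PROBE_ARG_VARS expand to "$params" and "$vars" (the strings they replace in this patch); the function names below are hypothetical:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Assumed behaviour of is_c_varname(): accept a plain C identifier. */
static bool looks_like_c_varname(const char *name)
{
	if (!isalpha((unsigned char)*name) && *name != '_')
		return false;
	while (*++name != '\0') {
		if (!isalnum((unsigned char)*name) && *name != '_')
			return false;
	}
	return true;
}

/* True if any argument is a C variable, "$params" or "$vars". */
static bool any_arg_needs_var(const char *const *args, int nargs)
{
	int i;

	for (i = 0; i < nargs; i++) {
		if (looks_like_c_varname(args[i]) ||
		    !strcmp(args[i], "$params") ||
		    !strcmp(args[i], "$vars"))
			return true;
	}
	return false;
}
```

An argument list made only of register or stack references such as "%ax" returns false, while a list containing a variable name such as "i" returns true.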

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 22 +++---
 tools/perf/util/probe-event.h |  2 ++
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 953dc1a..bc9317e 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1592,19 +1592,27 @@ out:
 	return ret;
 }
 
+/* Returns true if *any* ARG is either C variable, $params or $vars. */
+bool perf_probe_with_var(struct perf_probe_event *pev)
+{
+	int i = 0;
+
+	for (i = 0; i < pev->nargs; i++)
+		if (is_c_varname(pev->args[i].var)  ||
+		    !strcmp(pev->args[i].var, PROBE_ARG_PARAMS) ||
+		    !strcmp(pev->args[i].var, PROBE_ARG_VARS))
+			return true;
+	return false;
+}
+
 /* Return true if this perf_probe_event requires debuginfo */
 bool perf_probe_event_need_dwarf(struct perf_probe_event *pev)
 {
-	int i;
-
 	if (pev->point.file || pev->point.line || pev->point.lazy_line)
 		return true;
 
-	for (i = 0; i < pev->nargs; i++)
-		if (is_c_varname(pev->args[i].var) ||
-		    !strcmp(pev->args[i].var, "$params") ||
-		    !strcmp(pev->args[i].var, "$vars"))
-			return true;
+	if (perf_probe_with_var(pev))
+		return true;
 
 	return false;
 }
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index e18ea9f..4d1139b 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -128,6 +128,8 @@ char *synthesize_perf_probe_point(struct perf_probe_point *pp);
 int perf_probe_event__copy(struct perf_probe_event *dst,
   struct perf_probe_event *src);
 
+bool perf_probe_with_var(struct perf_probe_event *pev);
+
 /* Check the perf_probe_event needs debuginfo */
 bool perf_probe_event_need_dwarf(struct perf_probe_event *pev);
 
-- 
2.5.5



Re: [PATCH] perf uprobe: Skip prologue if program compiled without optimization

2016-08-03 Thread Ravi Bangoria

Thanks Masami,

On Tuesday 02 August 2016 08:22 PM, Masami Hiramatsu wrote:

On Mon,  1 Aug 2016 14:19:28 +0530
Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:


Function prologue prepares stack and registers before executing function
logic. When target program is compiled without optimization, function
parameter information is only valid after prologue. When we probe entrypc
of the function, and try to record function parameter, it contains
garbage value.
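
To make the idea concrete, here is a minimal sketch (not the code from the patch) of picking a post-prologue address from a line table: take the first row after the entry address whose source line differs from the entry line. The struct line_row type and the function name are hypothetical, and the rows are assumed to be sorted by address; a real implementation would read them from the debuginfo.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a simplified, hypothetical line table: address -> source line. */
struct line_row {
	uint64_t addr;
	int lineno;
};

/*
 * Return the first address after entrypc whose source line differs from
 * the line recorded at entrypc, staying below highpc.
 * Returns 0 when entrypc is not in the table or no such row exists, in
 * which case the caller would keep probing at entrypc.
 * Rows are assumed to be sorted by address.
 */
static uint64_t post_prologue_addr(const struct line_row *rows, size_t nr,
				   uint64_t entrypc, uint64_t highpc)
{
	size_t i;
	int entry_lno = -1;

	/* Find the row describing the function entry. */
	for (i = 0; i < nr; i++) {
		if (rows[i].addr == entrypc) {
			entry_lno = rows[i].lineno;
			break;
		}
	}
	if (entry_lno < 0)
		return 0;

	/* Walk forward until the source line changes. */
	for (i++; i < nr && rows[i].addr < highpc; i++) {
		if (rows[i].lineno != entry_lno)
			return rows[i].addr;
	}
	return 0;
}
```

With rows such as {0x400536, line 4} followed by {0x400541, line 5}, the sketch would pick 0x400541 for a function entered at 0x400536. It deliberately ignores corner cases (several rows for the same line, functions that are a single line long) that a real prologue search has to handle.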


[SNIP]

+
+   /* Only FUNC and FUNC@SRC are eligible. */
+   if (!pp->function || pp->line || pp->retprobe || pp->lazy_line ||
+   pp->offset || pp->abs_address)
+   return;
+
+   /* Not interested in func parameter? */
+   if (!pf->pev->nargs)
+   return;

Hmm, this is not enough, since perf-probe accepts registers and stacks.
At least you should check that all arguments are !is_c_varname(), !PROBE_ARG_VARS
and !PROBE_ARG_PARAMS here, instead of checking nargs.


+
+   pr_info("Target program is compiled without optimization. Skipping prologue.\n"
+   "Use %s:1 or absolute address 0x%lx to force probe on entry point.\n\n",

Hmm, is :1 always available? I think we should just recommend using only
the address.
(Moreover, pf->addr may not be the absolute address in a uprobe event; we'd
better say "the address 0x%x".)


Nice catch. :)

Sent v2. Please review it.

-Ravi




Re: [PATCH 2/2] perf ppc64le: Fix probe location when using DWARF

2016-08-11 Thread Ravi Bangoria



On Thursday 11 August 2016 05:20 PM, Arnaldo Carvalho de Melo wrote:

Em Thu, Aug 11, 2016 at 10:01:04AM +0530, Ravi Bangoria escreveu:


On Thursday 11 August 2016 05:24 AM, Anton Blanchard wrote:

Hi,


Powerpc has Global Entry Point and Local Entry Point for functions.
LEP catches call from both the GEP and the LEP. Symbol table of ELF
contains GEP and Offset from which we can calculate LEP, but debuginfo
does not have LEP info.

Currently, perf prioritize symbol table over dwarf to probe on LEP
for ppc64le. But when user tries to probe with function parameter,
we fall back to using dwarf(i.e. GEP) and when function called via
LEP, probe will never hit.
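
As an illustration of "GEP and Offset from which we can calculate LEP", here is a minimal sketch of the calculation, assuming the ELFv2 ABI encoding in which bits 5..7 of the symbol's st_other field hold the local entry offset. The macro and function names are local to this sketch and are not the ones used by perf.

```c
#include <stdint.h>

/*
 * Illustrative constants mirroring the ELFv2 ABI encoding: bits 5..7 of
 * st_other encode the distance between the two entry points.
 * The names are local to this sketch.
 */
#define SKETCH_STO_LOCAL_BIT	5
#define SKETCH_STO_LOCAL_MASK	(7 << SKETCH_STO_LOCAL_BIT)

/* Byte offset from the global entry point to the local entry point. */
static unsigned int local_entry_offset(unsigned char st_other)
{
	unsigned int field = (st_other & SKETCH_STO_LOCAL_MASK) >> SKETCH_STO_LOCAL_BIT;

	/* Field values 0 and 1 give offset 0; 2..6 give 4, 8, 16, 32, 64. */
	return ((1u << field) >> 2) << 2;
}

/* Local entry point address for a symbol whose value is the GEP. */
static uint64_t lep_from_gep(uint64_t gep, unsigned char st_other)
{
	return gep + local_entry_offset(st_other);
}
```

For a typical function with a two-instruction global entry sequence, the field value is 3 and the LEP is GEP + 8. DWARF carries no equivalent of st_other, which is why a DWARF-only lookup ends up at the GEP.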

This patch causes a build failure for me on ppc64le:

libperf.a(libperf-in.o): In function `arch__post_process_probe_trace_events':

tools/perf/arch/powerpc/util/sym-handling.c:109: undefined reference to `get_target_map'

Thanks Anton. Sorry, I should have caught that.

@Arnaldo, Can you please pick this up. I've prepared this on top of
acme/perf/core.


 From 89c977ae9c3ae35c78b16cddabcf2b01d3cf5cc8 Mon Sep 17 00:00:00 2001
From: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
Date: Wed, 10 Aug 2016 23:13:45 -0500
Subject: [PATCH] perf ppc64le: Fix build failure when no dwarf support

Fix perf build failure on ppc64le because of Commit 99e608b5954c ("perf
probe ppc64le: Fix probe location when using DWARF")

Can you please provide a better explanation? I had to look at the patch
to understand what it was fixing, and then the patch adds LIBELF_SUPPORT
ifdefs while the patch description talks about DWARF.


Yes. Explanation could have been better. Apologies for that.

arch__post_process_probe_trace_events() calls get_target_map() to prepare
symbol table. get_target_map() is defined inside util/probe-event.c.

probe-event.c will only get included in perf binary if CONFIG_LIBELF is set.
Hence arch__post_process_probe_trace_events() needs to be defined inside
#ifdef HAVE_LIBELF_SUPPORT to solve compilation error.
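
The pattern can be sketched as follows. This is only an illustration: SKETCH_HAVE_LIBELF stands in for HAVE_LIBELF_SUPPORT, both function names are made up, and the #else fallback is an assumption of the sketch (the patch itself only adds the #ifdef/#endif pair).

```c
/*
 * SKETCH_HAVE_LIBELF stands in for HAVE_LIBELF_SUPPORT; define it (for
 * example with -DSKETCH_HAVE_LIBELF) to simulate a build with libelf.
 */
#ifdef SKETCH_HAVE_LIBELF
/* Stand-in for a helper that only exists in libelf-enabled builds. */
static int get_target_map_sketch(void)
{
	return 1;
}

/* Arch hook: compiled only when the helper it calls is compiled too. */
static int arch_post_process_sketch(int ntevs)
{
	return get_target_map_sketch() ? ntevs : 0;
}
#else
/* Assumed fallback: without the dependency the hook does nothing. */
static int arch_post_process_sketch(int ntevs)
{
	(void)ntevs;
	return 0;
}
#endif
```

Because the caller and the callee sit behind the same feature macro, the linker never sees a reference to a symbol that was not built.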

Please let me know if you have any doubts.

Thanks,
Ravi


Anyway, Anton, does this fix the problem for you?

- Arnaldo


Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
  tools/perf/arch/powerpc/util/sym-handling.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index 8d4dc97..c27a51a 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -97,6 +97,7 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
 }
  }

+#ifdef HAVE_LIBELF_SUPPORT
  void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
int ntevs)
  {
@@ -118,5 +119,6 @@ void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
 }
 }
  }
+#endif

  #endif
--
2.7.4




Re: [PATCH v2 2/2] perf uprobe: Skip prologue if program compiled without optimization

2016-08-13 Thread Ravi Bangoria
Hi Masami, Arnaldo,

Any updates on this?

Thanks,
Ravi

On Wednesday 03 August 2016 02:28 PM, Ravi Bangoria wrote:
> Function prologue prepares stack and registers before executing function
> logic. When target program is compiled without optimization, function
> parameter information is only valid after prologue. When we probe entrypc
> of the function, and try to record function parameter, it contains
> garbage value.
>
> For example,
>   $ vim test.c
> #include <stdio.h>
>
> void foo(int i)
> {
>printf("i: %d\n", i);
> }
>
> int main()
> {
>   foo(42);
>   return 0;
> }
>
>   $ gcc -g test.c -o test
>   $ objdump -dl test | less
> foo():
> /home/ravi/test.c:4
>   400536:   55                      push   %rbp
>   400537:   48 89 e5                mov    %rsp,%rbp
>   40053a:   48 83 ec 10             sub    $0x10,%rsp
>   40053e:   89 7d fc                mov    %edi,-0x4(%rbp)
> /home/ravi/test.c:5
>   400541:   8b 45 fc                mov    -0x4(%rbp),%eax
> ...
> ...
> main():
> /home/ravi/test.c:9
>   400558:   55                      push   %rbp
>   400559:   48 89 e5                mov    %rsp,%rbp
> /home/ravi/test.c:10
>   40055c:   bf 2a 00 00 00          mov    $0x2a,%edi
>   400561:   e8 d0 ff ff ff          callq  400536 <foo>
> /home/ravi/test.c:11
>
>   $ ./perf probe -x ./test 'foo i'
>   $ cat /sys/kernel/debug/tracing/uprobe_events
>  p:probe_test/foo /home/ravi/test:0x0536 i=-12(%sp):s32
>
>   $ ./perf record -e probe_test:foo ./test
>   $ ./perf script
>  test  5778 [001]  4918.562027: probe_test:foo: (400536) i=0
>
> Here variable 'i' arrives in %edi and is only stored to its stack slot
> by the instruction at 0x40053e. But we are probing at 0x400536.
>
> To resolve this issue, we need to probe on the next instruction after
> the prologue. gdb and systemtap do the same thing. I've implemented
> this patch based on the approach systemtap uses.
>
> After applying patch:
>
>   $ ./perf probe -x ./test 'foo i'
>   $ cat /sys/kernel/debug/tracing/uprobe_events
> p:probe_test/foo /home/ravi/test:0x0541 i=-4(%bp):s32
>
>   $ ./perf record -e probe_test:foo ./test
>   $ ./perf script
> test  6300 [001]  5877.879327: probe_test:foo: (400541) i=42
>
> No need to skip prologue for optimized case since debug info is correct
> for each instructions for -O2 -g. For more details please visit:
> https://bugzilla.redhat.com/show_bug.cgi?id=612253#c6
>
> Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
> ---
> Changes in v2:
>   - Skipping prologue only when any ARG is either C variable, $params
> or $vars.
>   - Probe on line(:1) may not be always possible. Recommend only address
> to force probe on function entry.
>
>  tools/perf/util/probe-finder.c | 164 +
>  1 file changed, 164 insertions(+)
>
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index f2d9ff0..b2bc77c 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -892,6 +892,169 @@ static int find_probe_point_lazy(Dwarf_Die *sp_die, struct probe_finder *pf)
>   return die_walk_lines(sp_die, probe_point_lazy_walker, pf);
>  }
>
> +static bool var_has_loclist(Dwarf_Die *die)
> +{
> + Dwarf_Attribute loc;
> + int tag = dwarf_tag(die);
> +
> + if (tag != DW_TAG_formal_parameter &&
> + tag != DW_TAG_variable)
> + return false;
> +
> + return (dwarf_attr_integrate(die, DW_AT_location, &loc) &&
> + dwarf_whatform(&loc) == DW_FORM_sec_offset);
> +}
> +
> +/*
> + * For any object in given CU whose DW_AT_location is a location list,
> + * target program is compiled with optimization.
> + */
> +static bool optimized_target(Dwarf_Die *die)
> +{
> + Dwarf_Die tmp_die;
> +
> + if (var_has_loclist(die))
> + return true;
> +
> + if (!dwarf_child(die, &tmp_die) && optimized_target(&tmp_die))
> + return true;
> +
> + if (!dwarf_siblingof(die, &tmp_die) && optimized_target(&tmp_die))
> + return true;
> +
> + return false;
> +}
> +
> +static bool get_entrypc_idx(Dwarf_Lines *lines, unsigned long nr_lines,
> + Dwarf_Addr pf_addr, unsigned long *entrypc_idx)
> +{
> + unsigned long i;
> + Dwarf_Addr addr;
> +
> + for (i = 0; i < nr_lines; i++) {
> + if (dwarf_lineaddr(dwarf_onesrcline(lines, i),

Re: [PATCH 2/2] perf ppc64le: Fix probe location when using DWARF

2016-08-10 Thread Ravi Bangoria



On Thursday 11 August 2016 05:24 AM, Anton Blanchard wrote:

Hi,


Powerpc has Global Entry Point and Local Entry Point for functions.
LEP catches call from both the GEP and the LEP. Symbol table of ELF
contains GEP and Offset from which we can calculate LEP, but debuginfo
does not have LEP info.

Currently, perf prioritize symbol table over dwarf to probe on LEP
for ppc64le. But when user tries to probe with function parameter,
we fall back to using dwarf(i.e. GEP) and when function called via
LEP, probe will never hit.

This patch causes a build failure for me on ppc64le:

libperf.a(libperf-in.o): In function `arch__post_process_probe_trace_events':

tools/perf/arch/powerpc/util/sym-handling.c:109: undefined reference to `get_target_map'


Thanks Anton. Sorry, I should have caught that.

@Arnaldo, Can you please pick this up. I've prepared this on top of 
acme/perf/core.



From 89c977ae9c3ae35c78b16cddabcf2b01d3cf5cc8 Mon Sep 17 00:00:00 2001
From: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
Date: Wed, 10 Aug 2016 23:13:45 -0500
Subject: [PATCH] perf ppc64le: Fix build failure when no dwarf support

Fix perf build failure on ppc64le because of Commit 99e608b5954c ("perf
probe ppc64le: Fix probe location when using DWARF")

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/sym-handling.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index 8d4dc97..c27a51a 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -97,6 +97,7 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
}
 }

+#ifdef HAVE_LIBELF_SUPPORT
 void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
   int ntevs)
 {
@@ -118,5 +119,6 @@ void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
}
}
 }
+#endif

 #endif
--
2.7.4



Re: [PATCH v3 3/4] perf annotate: add powerpc support

2016-07-13 Thread Ravi Bangoria



On Wednesday 13 July 2016 01:09 PM, Michael Ellerman wrote:

Arnaldo Carvalho de Melo <a...@kernel.org> writes:


Em Tue, Jul 12, 2016 at 07:51:46AM +0530, Ravi Bangoria escreveu:

Hi Arnaldo,

On Friday 08 July 2016 02:01 PM, Michael Ellerman wrote:

Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> writes:


On Wednesday 06 July 2016 03:38 PM, Michael Ellerman wrote:

I've sent v4, which enables annotate for 'bctr' instructions.

For 'bctr' it shows a down arrow (indicating a jump), and for 'bctrl' it
shows a right arrow (indicating a call). No navigation options are provided,
though: pressing Enter on such an instruction shows a message like
"Invalid target".

Great thanks.

I've sent v4 series. Please review it.

If somebody else could do it and provide acks/reviewed by, that would
help,

Michael, can I get your comments as such?

It looks OK to me. But I don't know the code really, and I haven't had
time to test it personally.

Ravi, have you tested on a big endian machine?


Yes Michael, I've tested annotate on BE and LE both.

-Ravi



cheers





Re: [PATCH v4 0/3] perf annotate: Enable cross arch annotate

2016-07-13 Thread Ravi Bangoria

Arnaldo, Michael,

I've tested this patchset on ppc64 BE and LE both. Please review this.

-Ravi

On Friday 08 July 2016 10:10 AM, Ravi Bangoria wrote:

Perf can currently only support code navigation (branches and calls) in
annotate when run on the same architecture where perf.data was recorded.
But cross arch annotate is not supported.

This patchset enables cross arch annotate. Currently I've used x86
and arm instructions which are already available and adding support
for powerpc as well. Adding support for other arch will be easy.

I've created this patch on top of acme/perf/core. And tested it with
x86 and powerpc only.

Note for arm:
A few instructions were defined under #if __arm__, which I've used as the
table for arm. But I'm not sure whether the instructions defined outside of
that block also include arm instructions. Apart from that, 'call__parse()'
and 'move__parse()' contain #ifdef __arm__ directives. I've changed them
to if (!strcmp(norm_arch, arm)). I don't have an arm machine to test
these changes.

Example:

   Record on powerpc:
   $ ./perf record -a

   Report -> Annotate on x86:
   $ ./perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Changes in v4:
   - powerpc: Added support for branch instructions that includes 'ctr'
   - __maybe_unused was misplaced at few location. Corrected it.
   - Moved position of v3 last patch that define macro for each arch name

v3 link: https://lkml.org/lkml/2016/6/30/99

Naveen N. Rao (1):
   perf annotate: add powerpc support

Ravi Bangoria (2):
   perf: Define macro for normalized arch names
   perf annotate: Enable cross arch annotate

  tools/perf/arch/common.c   |  36 ++---
  tools/perf/arch/common.h   |  11 ++
  tools/perf/builtin-top.c   |   2 +-
  tools/perf/ui/browsers/annotate.c  |   3 +-
  tools/perf/ui/gtk/annotate.c   |   2 +-
  tools/perf/util/annotate.c | 273 ++---
  tools/perf/util/annotate.h |   6 +-
  tools/perf/util/unwind-libunwind.c |   4 +-
  8 files changed, 265 insertions(+), 72 deletions(-)

--
2.5.5





Re: [PATCH v3 3/4] perf annotate: add powerpc support

2016-07-11 Thread Ravi Bangoria

Hi Arnaldo,

On Friday 08 July 2016 02:01 PM, Michael Ellerman wrote:

Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> writes:


On Wednesday 06 July 2016 03:38 PM, Michael Ellerman wrote:

I've sent v4, which enables annotate for 'bctr' instructions.

For 'bctr' it shows a down arrow (indicating a jump), and for 'bctrl' it
shows a right arrow (indicating a call). No navigation options are provided,
though: pressing Enter on such an instruction shows a message like
"Invalid target".

Great thanks.


I've sent v4 series. Please review it.

-Ravi


It doesn't look like we have the opcode handy here? Could we get it somehow?
That would make this a *lot* more robust.

objdump prints machine code, but I don't know how difficult that would
be to parse to get opcode.

Normal objdump -d output includes the opcode, eg:

c000886c:   2c 2c 00 00 cmpdi   r12,0
  ^^^

The only thing you need to know is the endian and you can reconstruct
the raw instruction.

Then you can just decode the opcode, see how we do it in the kernel with
eg. instr_is_relative_branch().
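
A minimal sketch of that idea, assuming the four raw bytes have already been parsed out of the objdump line. The helper names are hypothetical; only the primary opcodes 16 (bc) and 18 (b), plus opcode 19 with extended opcode 16 (bclr) or 528 (bcctr), are treated as branches here, so this is not a complete decoder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rebuild a 32-bit instruction word from the four bytes objdump prints. */
static uint32_t insn_from_bytes(const uint8_t b[4], bool bytes_are_big_endian)
{
	if (bytes_are_big_endian)
		return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
		       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
	return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}

/* The primary opcode is the top six bits of a powerpc instruction word. */
static unsigned int primary_opcode(uint32_t insn)
{
	return insn >> 26;
}

/* Classify an instruction word as a branch using opcode fields only. */
static bool insn_is_branch(uint32_t insn)
{
	/* Extended opcode field used by primary opcode 19. */
	unsigned int xo = (insn >> 1) & 0x3ff;

	switch (primary_opcode(insn)) {
	case 16:	/* bc and its variants */
	case 18:	/* b, ba, bl, bla */
		return true;
	case 19:	/* bclr (xo 16) and bcctr (xo 528) families */
		return xo == 16 || xo == 528;
	default:
		return false;
	}
}
```

For the cmpdi line quoted above, the bytes 2c 2c 00 00 read as big endian give 0x2c2c0000, whose primary opcode is 11, so it is not classified as a branch. A bctr (0x4e800420) has primary opcode 19 and extended opcode 528, so it is.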

I'm sorry. I thought you wanted to show opcodes in perf
annotate. But you were asking to use the opcode instead of parsing
instructions.

Yeah.


This looks like rewriting the parsing code. I don't know whether there is any
library already available for this that we can use directly. I'm thinking
about this.

OK don't worry about it for now. We should get this merged for starters
and we can always improve it later.

cheers





Re: [PATCH v3 3/4] perf annotate: add powerpc support

2016-07-04 Thread Ravi Bangoria

Hi Michael,

On Friday 01 July 2016 02:13 PM, Ravi Bangoria wrote:

Thanks Michael for your suggestion.

On Thursday 30 June 2016 11:51 AM, Michael Ellerman wrote:

On Thu, 2016-06-30 at 11:44 +0530, Ravi Bangoria wrote:

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..b87eac7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -476,6 +481,125 @@ static int ins__cmp(const void *a, const void *b)

...

+
+static struct ins *ins__find_powerpc(const char *name)
+{
+int i;
+struct ins *ins;
+struct ins_ops *ops;
+static struct instructions_powerpc head;
+static bool list_initialized;
+
+/*
+ * - Interested only if instruction starts with 'b'.
+ * - Few start with 'b', but aren't branch instructions.
+ * - Let's also ignore instructions involving 'ctr' and
+ *   'tar' since target branch addresses for those can't
+ *   be determined statically.
+ */
+if (name[0] != 'b' ||
+!strncmp(name, "bcd", 3)   ||
+!strncmp(name, "brinc", 5) ||
+!strncmp(name, "bper", 4)  ||
+strstr(name, "ctr")||
+strstr(name, "tar"))
+return NULL;
It would be good if 'bctr' was at least recognised as a branch, even if we
can't determine the target. They are very common.


We cannot show an arrow for this since we don't know the target location.
Can you please suggest how you intend perf to display bctr?

bctr can be classified into two variants -- 'bctr' and 'bctrl'.

'bctr' will be considered a jump instruction, but jump__parse() won't
be able to find any target location and hence will set the target to
UINT64_MAX, which transforms 'bctr' into 'bctr UINT64_MAX'. This
looks misleading.

bctrl will be considered a call instruction, but call__parse() won't
be able to find any target function and hence won't show any
navigation arrow for this instruction. That is the same as filtering it
out beforehand.

It doesn't look like we have the opcode handy here? Could we get it somehow?
That would make this a *lot* more robust.


objdump prints machine code, but I don't know how difficult that would
be to parse to get opcode.


Perf uses --no-show-raw with objdump, and hence the objdump output does not
show opcodes. So a change in the current objdump output may require changes
in the current parsing logic. Additionally I would need to change the TUI as
well to show opcodes. This looks like quite a lot more work.

And this patchset is about enabling annotate for cross arch. So if you really
need opcodes with perf annotate, can we do it separately?

Please let me know your thoughts.

-Ravi



-Ravi


cheers







[PATCH v4 1/3] perf: Define macro for normalized arch names

2016-07-07 Thread Ravi Bangoria
Define a macro for each normalized arch name and use the macros instead
of string literals.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v4:
  - Moved position of patch

 tools/perf/arch/common.c   | 36 ++--
 tools/perf/arch/common.h   | 11 +++
 tools/perf/util/unwind-libunwind.c |  4 ++--
 3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index ee69668..feb2113 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -122,25 +122,25 @@ static int lookup_triplets(const char *const *triplets, const char *name)
 const char *normalize_arch(char *arch)
 {
if (!strcmp(arch, "x86_64"))
-   return "x86";
+   return NORM_X86;
if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-   return "x86";
+   return NORM_X86;
if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-   return "sparc";
+   return NORM_SPARC;
if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
-   return "arm64";
+   return NORM_ARM64;
if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-   return "arm";
+   return NORM_ARM;
if (!strncmp(arch, "s390", 4))
-   return "s390";
+   return NORM_S390;
if (!strncmp(arch, "parisc", 6))
-   return "parisc";
+   return NORM_PARISC;
if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-   return "powerpc";
+   return NORM_POWERPC;
if (!strncmp(arch, "mips", 4))
-   return "mips";
+   return NORM_MIPS;
if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-   return "sh";
+   return NORM_SH;
 
return arch;
 }
@@ -180,21 +180,21 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
zfree();
}
 
-   if (!strcmp(arch, "arm"))
+   if (!strcmp(arch, NORM_ARM))
path_list = arm_triplets;
-   else if (!strcmp(arch, "arm64"))
+   else if (!strcmp(arch, NORM_ARM64))
path_list = arm64_triplets;
-   else if (!strcmp(arch, "powerpc"))
+   else if (!strcmp(arch, NORM_POWERPC))
path_list = powerpc_triplets;
-   else if (!strcmp(arch, "sh"))
+   else if (!strcmp(arch, NORM_SH))
path_list = sh_triplets;
-   else if (!strcmp(arch, "s390"))
+   else if (!strcmp(arch, NORM_S390))
path_list = s390_triplets;
-   else if (!strcmp(arch, "sparc"))
+   else if (!strcmp(arch, NORM_SPARC))
path_list = sparc_triplets;
-   else if (!strcmp(arch, "x86"))
+   else if (!strcmp(arch, NORM_X86))
path_list = x86_triplets;
-   else if (!strcmp(arch, "mips"))
+   else if (!strcmp(arch, NORM_MIPS))
path_list = mips_triplets;
else {
ui__error("binutils for %s not supported.\n", arch);
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 6b01c73..14ca8ca 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -5,6 +5,17 @@
 
 extern const char *objdump_path;
 
+/* Macro for normalized arch names */
+#define NORM_X86   "x86"
+#define NORM_SPARC "sparc"
+#define NORM_ARM64 "arm64"
+#define NORM_ARM   "arm"
+#define NORM_S390  "s390"
+#define NORM_PARISC"parisc"
+#define NORM_POWERPC   "powerpc"
+#define NORM_MIPS  "mips"
+#define NORM_SH"sh"
+
 int perf_env__lookup_objdump(struct perf_env *env);
 const char *normalize_arch(char *arch);
 
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 6d542a4..6199102 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -40,10 +40,10 @@ int unwind__prepare_access(struct thread *thread, struct map *map,
 
arch = normalize_arch(thread->mg->machine->env->arch);
 
-   if (!strcmp(arch, "x86")) {
+   if (!strcmp(arch, NORM_X86)) {
if (dso_type != DSO__TYPE_64BIT)
ops = x86_32_unwind_libunwind_ops;
-   } else if (!strcmp(arch, "arm64") || !strcmp(arch, "arm")) {
+   } else if (!strcmp(arch, NORM_ARM64) || !strcmp(arch, NORM_ARM)) {
if (dso_type == DSO__TYPE_64BIT)
ops = arm64_unwind_libunwind_ops;
}
-- 
2.5.5



[PATCH v4 0/3] perf annotate: Enable cross arch annotate

2016-07-07 Thread Ravi Bangoria
Perf can currently only support code navigation (branches and calls) in
annotate when run on the same architecture where perf.data was recorded.
But cross arch annotate is not supported.

This patchset enables cross arch annotate. Currently I've used x86
and arm instructions which are already available and adding support
for powerpc as well. Adding support for other arch will be easy.

I've created this patch on top of acme/perf/core. And tested it with
x86 and powerpc only.

Note for arm:
A few instructions were defined under #if __arm__, which I've used as the
table for arm. But I'm not sure whether the instructions defined outside of
that block also include arm instructions. Apart from that, 'call__parse()'
and 'move__parse()' contain #ifdef __arm__ directives. I've changed them
to if (!strcmp(norm_arch, arm)). I don't have an arm machine to test
these changes.

Example:

  Record on powerpc:
  $ ./perf record -a

  Report -> Annotate on x86:
  $ ./perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Changes in v4:
  - powerpc: Added support for branch instructions that includes 'ctr'
  - __maybe_unused was misplaced at few location. Corrected it.
  - Moved position of v3 last patch that define macro for each arch name

v3 link: https://lkml.org/lkml/2016/6/30/99

Naveen N. Rao (1):
  perf annotate: add powerpc support

Ravi Bangoria (2):
  perf: Define macro for normalized arch names
  perf annotate: Enable cross arch annotate

 tools/perf/arch/common.c   |  36 ++---
 tools/perf/arch/common.h   |  11 ++
 tools/perf/builtin-top.c   |   2 +-
 tools/perf/ui/browsers/annotate.c  |   3 +-
 tools/perf/ui/gtk/annotate.c   |   2 +-
 tools/perf/util/annotate.c | 273 ++---
 tools/perf/util/annotate.h |   6 +-
 tools/perf/util/unwind-libunwind.c |   4 +-
 8 files changed, 265 insertions(+), 72 deletions(-)

--
2.5.5



Re: [PATCH v3 3/4] perf annotate: add powerpc support

2016-07-07 Thread Ravi Bangoria

Hi Michael,

On Wednesday 06 July 2016 03:38 PM, Michael Ellerman wrote:

Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> writes:


On Thursday 30 June 2016 11:51 AM, Michael Ellerman wrote:

On Thu, 2016-06-30 at 11:44 +0530, Ravi Bangoria wrote:

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..b87eac7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -476,6 +481,125 @@ static int ins__cmp(const void *a, const void *b)

...

+
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+   struct ins_ops *ops;
+   static struct instructions_powerpc head;
+   static bool list_initialized;
+
+   /*
+* - Interested only if instruction starts with 'b'.
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (name[0] != 'b' ||
+   !strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;

It would be good if 'bctr' was at least recognised as a branch, even if we
can't determine the target. They are very common.

We cannot show an arrow for this since we don't know the target location.
Can you please suggest how you intend perf to display bctr?

Yeah I understand you can't show an arrow.

I guess it could just be an unterminated arrow? But I'm not sure if
that's easy to do with the way the UI is constructed. eg. something
like:

 ld  r12,0(r12)
 mtctr   r12
 bctrl  -->
 ld  r3,-32704(r2)

But that's just an idea.


I've sent v4, which enables annotate for 'bctr' instructions.

For 'bctr' it shows a down arrow (indicating a jump), and for 'bctrl' it
shows a right arrow (indicating a call). No navigation options are provided,
though: pressing Enter on such an instruction shows a message like
"Invalid target".

Please review it.


bctr can be classified into two variants -- 'bctr' and 'bctrl'.

'bctr' will be considered a jump instruction, but jump__parse() won't
be able to find any target location and hence will set the target to
UINT64_MAX, which transforms 'bctr' into 'bctr UINT64_MAX'. This
looks misleading.

Agreed.


bctrl will be considered a call instruction, but call__parse() won't
be able to find any target function and hence won't show any
navigation arrow for this instruction. That is the same as filtering it
out beforehand.

OK.

Maybe what I'm asking for is an enhancement and can be done later.


It doesn't look like we have the opcode handy here? Could we get it somehow?
That would make this a *lot* more robust.

objdump prints machine code, but I don't know how difficult that would
be to parse to get opcode.

Normal objdump -d output includes the opcode, eg:

c000886c:   2c 2c 00 00 cmpdi   r12,0
 ^^^

The only thing you need to know is the endian and you can reconstruct
the raw instruction.

Then you can just decode the opcode, see how we do it in the kernel with
eg. instr_is_relative_branch().


I'm sorry. I thought you wanted to show opcodes in perf
annotate. But you were asking to use the opcode instead of parsing
instructions.

This looks like rewriting the parsing code. I don't know whether there is any
library already available for this that we can use directly. I'm thinking
about this.

- Ravi


cheers





[PATCH v4 2/3] perf annotate: Enable cross arch annotate

2016-07-07 Thread Ravi Bangoria
Change the current data structures and functions to enable cross arch
annotate.

The current implementation has no logic for recording on one arch and
annotating on another. Such remote annotation is only partially possible
with the current implementation, for x86 (and maybe arm as well).
To make remote annotation work properly, the instruction tables of all
architectures need to be included in the perf binary, and at annotate
time perf needs to use the instruction table of the arch where perf.data
was recorded.
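
The shape of that change can be sketched as below: the tables for every arch are always compiled in, and the lookup is keyed by the normalized arch string instead of a compile-time #ifdef. All names and the tiny mnemonic lists are made up for the sketch and do not match the real tables in annotate.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-arch instruction table, for illustration only. */
struct arch_ins_table {
	const char *norm_arch;		/* normalized arch name, e.g. "x86" */
	const char *const *branches;	/* NULL-terminated mnemonic list */
};

static const char *const x86_branches[] = { "call", "jmp", "je", NULL };
static const char *const arm_branches[] = { "b", "bl", "beq", NULL };

/* Every table is built into the binary, whatever the host arch is. */
static const struct arch_ins_table tables[] = {
	{ "x86", x86_branches },
	{ "arm", arm_branches },
};

/* Pick the table by the arch recorded in perf.data, not by the host. */
static const struct arch_ins_table *find_table(const char *norm_arch)
{
	size_t i;

	for (i = 0; i < sizeof(tables) / sizeof(tables[0]); i++)
		if (!strcmp(tables[i].norm_arch, norm_arch))
			return &tables[i];
	return NULL;
}

/* Return 1 if name is a known branch mnemonic for norm_arch, else 0. */
static int is_branch_mnemonic(const char *norm_arch, const char *name)
{
	const struct arch_ins_table *t = find_table(norm_arch);
	size_t i;

	if (!t)
		return 0;
	for (i = 0; t->branches[i]; i++)
		if (!strcmp(t->branches[i], name))
			return 1;
	return 0;
}
```

An unknown arch simply yields no table, so annotate output for it would still be shown, just without jump or call navigation.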

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v4:
  - __maybe_unused was misplaced at few location. Corrected it

 tools/perf/builtin-top.c  |   2 +-
 tools/perf/ui/browsers/annotate.c |   3 +-
 tools/perf/ui/gtk/annotate.c  |   2 +-
 tools/perf/util/annotate.c| 134 --
 tools/perf/util/annotate.h|   5 +-
 5 files changed, 93 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 07fc792..d4fd947 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -128,7 +128,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}
 
-   err = symbol__annotate(sym, map, 0);
+   err = symbol__annotate(sym, map, 0, NULL);
if (err == 0) {
 out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 29dc6d2..3a652a6f 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1050,7 +1050,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
  (nr_pcnt - 1);
}
 
-   if (symbol__annotate(sym, map, sizeof_bdl) < 0) {
+   if (symbol__annotate(sym, map, sizeof_bdl,
+perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__last_msg);
goto out_free_offsets;
}
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 9c7ff8d..d7150b3 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -166,7 +166,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;
 
-   if (symbol__annotate(sym, map, 0) < 0) {
+   if (symbol__annotate(sym, map, 0, perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__current);
return -1;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e9825fe..32889ce 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -20,12 +20,14 @@
 #include 
 #include 
 #include 
+#include 
+#include "../arch/common.h"
 
 const char *disassembler_style;
 const char *objdump_path;
 static regex_t  file_lineno;
 
-static struct ins *ins__find(const char *name);
+static struct ins *ins__find(const char *name, const char *norm_arch);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
 
 static void ins__delete(struct ins_operands *ops)
@@ -53,7 +55,7 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
 }
 
-static int call__parse(struct ins_operands *ops)
+static int call__parse(struct ins_operands *ops, const char *norm_arch)
 {
char *endptr, *tok, *name;
 
@@ -65,10 +67,8 @@ static int call__parse(struct ins_operands *ops)
 
name++;
 
-#ifdef __arm__
-   if (strchr(name, '+'))
+   if (!strcmp(norm_arch, NORM_ARM) && strchr(name, '+'))
return -1;
-#endif
 
tok = strchr(name, '>');
if (tok == NULL)
@@ -117,7 +117,8 @@ bool ins__is_call(const struct ins *ins)
return ins->ops == _ops;
 }
 
-static int jump__parse(struct ins_operands *ops)
+static int jump__parse(struct ins_operands *ops,
+  const char *norm_arch __maybe_unused)
 {
const char *s = strchr(ops->raw, '+');
 
@@ -172,7 +173,7 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
return 0;
 }
 
-static int lock__parse(struct ins_operands *ops)
+static int lock__parse(struct ins_operands *ops, const char *norm_arch)
 {
char *name;
 
@@ -183,7 +184,7 @@ static int lock__parse(struct ins_operands *ops)
if (disasm_line__parse(ops->raw, &name, &ops->locked.ops->raw) < 0)
goto out_free_ops;
 
-   ops->locked.ins = ins__find(name);
+   ops->locked.ins = ins__find(name, norm_arch);
free(name);
 
if (ops->locked.ins == NULL)
@@ -193,7 +194,7 @@ static int lock__parse(struct ins_operands *ops)
return 0;
 
if (ops->locked.ins->ops->parse &&
-   ops->locked.ins->ops->parse(ops->

[PATCH v4 3/3] perf annotate: add powerpc support

2016-07-07 Thread Ravi Bangoria
From: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

Powerpc has a long list of branch instructions, and hardcoding them in
a table appears to be error-prone. So, add a new function that finds an
instruction instead of creating a table. This function dynamically
builds the table (a list of 'struct ins') and, instead of creating an
object every time, first checks whether the list already contains an
object for that instruction.
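
The "look up first, create on a miss" idea described above can be
sketched as follows. This is only a minimal illustration and not the
code in this patch: 'struct ins_sketch', 'ins_cache' and
'ins_cache__find_or_add()' are hypothetical names, and a plain singly
linked list stands in for the kernel-style list_head used in the diff.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's 'struct ins'. */
struct ins_sketch {
	char *name;
	struct ins_sketch *next;
};

/* Head of the cache; it grows as new mnemonics are seen. */
static struct ins_sketch *ins_cache;

/* Return the cached entry for 'name', creating it on first use. */
static struct ins_sketch *ins_cache__find_or_add(const char *name)
{
	struct ins_sketch *pos;
	size_t len = strlen(name) + 1;

	/* First check whether an object for this mnemonic already exists. */
	for (pos = ins_cache; pos; pos = pos->next) {
		if (!strcmp(pos->name, name))
			return pos;
	}

	/* Miss: allocate a new entry and copy the name into it. */
	pos = calloc(1, sizeof(*pos));
	if (!pos)
		return NULL;

	pos->name = malloc(len);
	if (!pos->name) {
		free(pos);
		return NULL;
	}
	memcpy(pos->name, name, len);

	/* Link the new entry in so the next lookup finds it. */
	pos->next = ins_cache;
	ins_cache = pos;
	return pos;
}
```

With this shape, a second lookup of the same mnemonic returns the same
object instead of allocating a new one.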

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v4:
  - Added support for branch instructions that includes 'ctr'
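
A rough sketch of the display side of this change, assuming the same
format strings that jump__scnprintf() uses in the diff below. The
function name and the 'has_target' flag are hypothetical; the flag
stands in for the patch's "!ops->target.addr" check.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a jump for display. When no target is known (e.g. 'bctr'),
 * print only the padded mnemonic; otherwise append the target offset
 * in hex.
 */
static int jump_format_sketch(char *bf, size_t size, const char *name,
			      int has_target, uint64_t offset)
{
	if (!has_target)
		return snprintf(bf, size, "%-6.6s", name);

	return snprintf(bf, size, "%-6.6s %" PRIx64, name, offset);
}
```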

 tools/perf/util/annotate.c | 155 +++--
 tools/perf/util/annotate.h |   3 +-
 2 files changed, 150 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 32889ce..9de1271 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -55,10 +55,15 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
 }
 
-static int call__parse(struct ins_operands *ops, const char *norm_arch)
+static int call__parse(char *ins_name, struct ins_operands *ops,
+  const char *norm_arch)
 {
char *endptr, *tok, *name;
 
+   /* Special case for powerpc */
+   if (!strcmp(norm_arch, NORM_POWERPC) && strstr(ins_name, "ctr"))
+   return 0;
+
ops->target.addr = strtoull(ops->raw, &endptr, 16);
 
name = strchr(endptr, '<');
@@ -117,7 +122,7 @@ bool ins__is_call(const struct ins *ins)
return ins->ops == &call_ops;
 }
 
-static int jump__parse(struct ins_operands *ops,
+static int jump__parse(char *ins_name __maybe_unused, struct ins_operands *ops,
   const char *norm_arch __maybe_unused)
 {
const char *s = strchr(ops->raw, '+');
@@ -135,6 +140,13 @@ static int jump__parse(struct ins_operands *ops,
 static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
   struct ins_operands *ops)
 {
+   /*
+* Instructions that does not include target address in operand
+* like 'bctr' for powerpc.
+*/
+   if (!ops->target.addr)
+   return scnprintf(bf, size, "%-6.6s", ins->name);
+
return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
ops->target.offset);
 }
 
@@ -173,7 +185,8 @@ static int comment__symbol(char *raw, char *comment, u64 
*addrp, char **namep)
return 0;
 }
 
-static int lock__parse(struct ins_operands *ops, const char *norm_arch)
+static int lock__parse(char *ins_name, struct ins_operands *ops,
+  const char *norm_arch)
 {
char *name;
 
@@ -194,7 +207,8 @@ static int lock__parse(struct ins_operands *ops, const char 
*norm_arch)
return 0;
 
if (ops->locked.ins->ops->parse &&
-   ops->locked.ins->ops->parse(ops->locked.ops, norm_arch) < 0)
+   ops->locked.ins->ops->parse(ins_name,
+   ops->locked.ops, norm_arch) < 0)
goto out_free_ops;
 
return 0;
@@ -237,7 +251,8 @@ static struct ins_ops lock_ops = {
.scnprintf = lock__scnprintf,
 };
 
-static int mov__parse(struct ins_operands *ops, const char *norm_arch)
+static int mov__parse(char *ins_name __maybe_unused, struct ins_operands *ops,
+ const char *norm_arch)
 {
char *s = strchr(ops->raw, ','), *target, *comment, prev;
 
@@ -304,7 +319,7 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
 };
 
-static int dec__parse(struct ins_operands *ops,
+static int dec__parse(char *ins_name __maybe_unused, struct ins_operands *ops,
  const char *norm_arch __maybe_unused)
 {
char *target, *comment, *s, prev;
@@ -459,6 +474,11 @@ static struct ins instructions_arm[] = {
{ .name = "bne",   .ops  = &jump_ops, },
 };
 
+struct instructions_powerpc {
+   struct ins *ins;
+   struct list_head list;
+};
+
 static int ins__key_cmp(const void *name, const void *insp)
 {
const struct ins *ins = insp;
@@ -474,6 +494,125 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
 }
 
+static struct ins *list_add__ins_powerpc(struct instructions_powerpc *head,
+const char *name, struct ins_ops *ops)
+{
+   struct instructions_powerpc *ins_powerpc;
+   struct ins *ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins_powerpc = zalloc(sizeof(struct instructions_powerpc));
+   if (!ins_powerpc)
+   goto out_free_ins;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   goto out_free_ins_power;
+
+   ins->ops = ops;
+

[PATCH 2/2] perf ppc64le: Fix probe location when using DWARF

2016-08-09 Thread Ravi Bangoria
Powerpc has a Global Entry Point (GEP) and a Local Entry Point (LEP)
for functions. The LEP catches calls from both the GEP and the LEP. The
ELF symbol table contains the GEP and an offset from which the LEP can
be calculated, but debuginfo does not have LEP info.

Currently, perf prioritizes the symbol table over DWARF so that it
probes on the LEP for ppc64le. But when the user tries to probe with a
function parameter, we fall back to using DWARF (i.e. the GEP), and
when the function is called via the LEP, the probe will never hit.

For example:
  $ objdump -d vmlinux
...
do_sys_open():
c02eb4a0:   e8 00 4c 3c addis   r2,r12,232
c02eb4a4:   60 00 42 38 addi    r2,r2,96
c02eb4a8:   a6 02 08 7c mflr    r0
c02eb4ac:   d0 ff 41 fb std r26,-48(r1)

  $ sudo ./perf probe do_sys_open
  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_sys_open _text+3060904

  $ sudo ./perf probe 'do_sys_open filename:string'
  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_sys_open _text+3060896 filename_string=+0(%gpr4):string

In the second case, perf probed on the GEP. So when the function is
called via the LEP, the probe won't hit.

  $ sudo ./perf record -a -e probe:do_sys_open ls
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.195 MB perf.data ]

To resolve this issue, let's not prioritize the symbol table; let perf
decide what it wants to use. Perf already converts the GEP to the LEP
when it uses the symbol table. When perf uses debuginfo, let it find
the LEP offset from the symbol table. This way we fall back to probing
on the LEP in all cases.
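
For reference, the GEP to LEP conversion can be sketched like this. It
is a minimal illustration that assumes the ELFv2 ABI encoding of the
local entry offset in bits 7..5 of the symbol's st_other field; the
names 'lep_offset_sketch' and 'lep_address_sketch' are made up for this
example and are not part of the patch.

```c
#include <stdint.h>

/* Assumed ELFv2 layout: local-entry encoding lives in st_other bits 7..5. */
#define LEP_SHIFT_SKETCH 5
#define LEP_MASK_SKETCH  (7u << LEP_SHIFT_SKETCH)

/* Decode the byte offset from the GEP to the LEP. */
static unsigned int lep_offset_sketch(unsigned char st_other)
{
	unsigned int enc = (st_other & LEP_MASK_SKETCH) >> LEP_SHIFT_SKETCH;

	/*
	 * Encodings 0 and 1 mean there is no separate local entry;
	 * encodings 2..6 give offsets of 4..64 bytes.
	 */
	return ((1u << enc) >> 2) << 2;
}

/* Turn a GEP address (or offset from _text) into the matching LEP. */
static uint64_t lep_address_sketch(uint64_t gep, unsigned char st_other)
{
	return gep + lep_offset_sketch(st_other);
}
```

In the do_sys_open example above, the GEP is at _text+3060896 and the
LEP is two instructions (8 bytes) later, at _text+3060904.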

After patch:
  $ sudo ./perf probe 'do_sys_open filename:string'
  $ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_sys_open _text+3060904 filename_string=+0(%gpr4):string

  $ sudo ./perf record -a -e probe:do_sys_open ls
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.197 MB perf.data (11 samples) ]

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/sym-handling.c | 27 +
 tools/perf/util/probe-event.c   | 37 -
 tools/perf/util/probe-event.h   |  6 -
 3 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/sym-handling.c 
b/tools/perf/arch/powerpc/util/sym-handling.c
index c6d0f91..8d4dc97 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -54,10 +54,6 @@ int arch__compare_symbol_names(const char *namea, const char 
*nameb)
 #endif
 
 #if defined(_CALL_ELF) && _CALL_ELF == 2
-bool arch__prefers_symtab(void)
-{
-   return true;
-}
 
 #ifdef HAVE_LIBELF_SUPPORT
 void arch__sym_update(struct symbol *s, GElf_Sym *sym)
@@ -100,4 +96,27 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
tev->point.offset += lep_offset;
}
 }
+
+void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
+  int ntevs)
+{
+   struct probe_trace_event *tev;
+   struct map *map;
+   struct symbol *sym = NULL;
+   struct rb_node *tmp;
+   int i = 0;
+
+   map = get_target_map(pev->target, pev->uprobes);
+   if (!map || map__load(map, NULL) < 0)
+   return;
+
+   for (i = 0; i < ntevs; i++) {
+   tev = &pev->tevs[i];
+   map__for_each_symbol(map, sym, tmp) {
+   if (map->unmap_ip(map, sym->start) == 
tev->point.address)
+   arch__fix_tev_from_maps(pev, tev, map, sym);
+   }
+   }
+}
+
 #endif
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4e215e7..5efa535 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -178,7 +178,7 @@ static struct map *kernel_get_module_map(const char *module)
return NULL;
 }
 
-static struct map *get_target_map(const char *target, bool user)
+struct map *get_target_map(const char *target, bool user)
 {
/* Init maps of given executable or kernel */
if (user)
@@ -703,19 +703,32 @@ post_process_kernel_probe_trace_events(struct 
probe_trace_event *tevs,
return skipped;
 }
 
+void __weak
+arch__post_process_probe_trace_events(struct perf_probe_event *pev 
__maybe_unused,
+ int ntevs __maybe_unused)
+{
+}
+
 /* Post processing the probe events */
-static int post_process_probe_trace_events(struct probe_trace_event *tevs,
+static int post_process_probe_trace_events(struct perf_probe_event *pev,
+  struct probe_trace_event *tevs,
   int ntevs, const char *module,
   bool uprobe)
 {
-   if (uprobe)
-   return add_exec_to_probe_trace_eve

[PATCH 1/2] perf: Add function to post process kernel trace events

2016-08-09 Thread Ravi Bangoria
Instead of keeping the code inline, introduce a function to post
process kernel probe trace events.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 953dc1a..4e215e7 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -664,22 +664,14 @@ static int add_module_to_probe_trace_events(struct 
probe_trace_event *tevs,
return ret;
 }
 
-/* Post processing the probe events */
-static int post_process_probe_trace_events(struct probe_trace_event *tevs,
-  int ntevs, const char *module,
-  bool uprobe)
+static int
+post_process_kernel_probe_trace_events(struct probe_trace_event *tevs,
+  int ntevs)
 {
struct ref_reloc_sym *reloc_sym;
char *tmp;
int i, skipped = 0;
 
-   if (uprobe)
-   return add_exec_to_probe_trace_events(tevs, ntevs, module);
-
-   /* Note that currently ref_reloc_sym based probe is not for drivers */
-   if (module)
-   return add_module_to_probe_trace_events(tevs, ntevs, module);
-
reloc_sym = kernel_get_ref_reloc_sym();
if (!reloc_sym) {
pr_warning("Relocated base symbol is not found!\n");
@@ -711,6 +703,21 @@ static int post_process_probe_trace_events(struct 
probe_trace_event *tevs,
return skipped;
 }
 
+/* Post processing the probe events */
+static int post_process_probe_trace_events(struct probe_trace_event *tevs,
+  int ntevs, const char *module,
+  bool uprobe)
+{
+   if (uprobe)
+   return add_exec_to_probe_trace_events(tevs, ntevs, module);
+
+   if (module)
+   /* Currently ref_reloc_sym based probe is not for drivers */
+   return add_module_to_probe_trace_events(tevs, ntevs, module);
+
+   return post_process_kernel_probe_trace_events(tevs, ntevs);
+}
+
 /* Try to find perf_probe_event with debuginfo */
 static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
  struct probe_trace_event **tevs)
-- 
2.7.4



[PATCH v3 1/4] perf: Utility function to fetch arch

2016-06-30 Thread Ravi Bangoria
Add a utility function to fetch the arch using evsel (evsel->evlist->env->arch).
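
The accessor is a NULL-guarded pointer chain. A standalone sketch of
the same pattern, using hypothetical trimmed-down structures in place
of perf's real 'struct perf_evsel', 'struct perf_evlist' and
'struct perf_env':

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the perf structures. */
struct env_sketch    { char *arch; };
struct evlist_sketch { struct env_sketch *env; };
struct evsel_sketch  { struct evlist_sketch *evlist; };

/* Return the recorded arch string, or NULL if any link is missing. */
static char *evsel_sketch__env_arch(struct evsel_sketch *evsel)
{
	if (evsel && evsel->evlist && evsel->evlist->env)
		return evsel->evlist->env->arch;
	return NULL;
}
```

Callers have to be prepared for a NULL return, e.g. when the evsel is
not attached to an evlist.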

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Change in v3:
  - No changes

 tools/perf/util/evsel.c | 7 +++
 tools/perf/util/evsel.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1d8f2bb..0fea724 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2422,3 +2422,10 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, 
struct target *target,
 err, strerror_r(err, sbuf, sizeof(sbuf)),
 perf_evsel__name(evsel));
 }
+
+char *perf_evsel__env_arch(struct perf_evsel *evsel)
+{
+   if (evsel && evsel->evlist && evsel->evlist->env)
+   return evsel->evlist->env->arch;
+   return NULL;
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 828ddd1..86fed7a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -435,4 +435,6 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const 
char *, void *);
 int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 attr__fprintf_f attr__fprintf, void *priv);
 
+char *perf_evsel__env_arch(struct perf_evsel *evsel);
+
 #endif /* __PERF_EVSEL_H */
-- 
2.5.5



[PATCH v3 0/4] perf annotate: Enable cross arch annotate

2016-06-30 Thread Ravi Bangoria
Perf currently supports code navigation (branches and calls) in
annotate only when it is run on the same architecture where perf.data
was recorded. Cross arch annotate is not supported.

This patchset enables cross arch annotate. Currently I've used the x86
and arm instructions which are already available, and I'm adding
support for powerpc as well. Adding support for other archs will be easy.

I've created this patchset on top of acme/perf/core and tested it with
x86 and powerpc only.

Example:

  Record on powerpc:
  $ ./perf record -a

  Report -> Annotate on x86:
  $ ./perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Changes in v3:
  - Optimized patch that enables annotate on powerpc
  - Corrected one memory leak

v2 link: https://lkml.org/lkml/2016/6/29/278

Naveen N. Rao (1):
  perf annotate: add powerpc support

Ravi Bangoria (3):
  perf: Utility function to fetch arch
  perf annotate: Enable cross arch annotate
  perf: Define macro for normalized arch names

 tools/perf/arch/common.c   |  36 ++---
 tools/perf/arch/common.h   |  11 ++
 tools/perf/builtin-top.c   |   2 +-
 tools/perf/ui/browsers/annotate.c  |   3 +-
 tools/perf/ui/gtk/annotate.c   |   2 +-
 tools/perf/util/annotate.c | 260 ++---
 tools/perf/util/annotate.h |   5 +-
 tools/perf/util/evsel.c|   7 +
 tools/perf/util/evsel.h|   2 +
 tools/perf/util/unwind-libunwind.c |   4 +-
 10 files changed, 260 insertions(+), 72 deletions(-)

--
2.5.5



[PATCH v3 2/4] perf annotate: Enable cross arch annotate

2016-06-30 Thread Ravi Bangoria
Change the current data structures and functions to enable cross arch
annotate.

The current implementation does not contain logic for recording on one
arch and annotating on another. This remote annotate is partially
possible with the current implementation for x86 (or maybe arm as well)
only. But, to make remote annotation work properly, all architecture
instruction tables need to be included in the perf binary. And while
annotating, perf needs to look up the instruction table of the arch
where perf.data was recorded.
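
A minimal sketch of that table selection, under the assumption that
each arch has its own name-sorted table and that lookup is a bsearch()
keyed on the mnemonic. The table contents and all names here
('struct ins_entry', 'ins_find_sketch', ...) are illustrative only and
are not the tables from annotate.c.

```c
#include <stdlib.h>
#include <string.h>

struct ins_entry {
	const char *name;
};

/* Both tables must stay sorted by name for bsearch() to work. */
static const struct ins_entry ins_x86_sketch[] = {
	{ "call" }, { "jmp" }, { "jne" },
};

static const struct ins_entry ins_arm_sketch[] = {
	{ "b" }, { "bl" }, { "bne" },
};

/* bsearch() comparator: the key is a plain string, the element an entry. */
static int ins_entry__key_cmp(const void *name, const void *entry)
{
	return strcmp(name, ((const struct ins_entry *)entry)->name);
}

/* Pick the table for the recorded arch, then search it by mnemonic. */
static const struct ins_entry *ins_find_sketch(const char *name,
					       const char *norm_arch)
{
	const struct ins_entry *table;
	size_t nmemb;

	if (!norm_arch)
		return NULL;

	if (!strcmp(norm_arch, "x86")) {
		table = ins_x86_sketch;
		nmemb = sizeof(ins_x86_sketch) / sizeof(ins_x86_sketch[0]);
	} else if (!strcmp(norm_arch, "arm")) {
		table = ins_arm_sketch;
		nmemb = sizeof(ins_arm_sketch) / sizeof(ins_arm_sketch[0]);
	} else {
		return NULL;
	}

	return bsearch(name, table, nmemb, sizeof(*table), ins_entry__key_cmp);
}
```

The point is that the choice of table depends on the arch stored in
perf.data rather than on the arch perf was compiled for.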

For arm, a few instructions were defined under #if __arm__, which I've
used as the table for arm. But I'm not sure whether the instructions
defined outside of that block also include arm instructions. Apart from
that, 'call__parse()' and 'mov__parse()' contain #ifdef __arm__
directives. I've changed them to if (!strcmp(norm_arch, "arm")). But
I've not tested this either.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v3:
  - No changes

 tools/perf/builtin-top.c  |   2 +-
 tools/perf/ui/browsers/annotate.c |   3 +-
 tools/perf/ui/gtk/annotate.c  |   2 +-
 tools/perf/util/annotate.c| 136 --
 tools/perf/util/annotate.h|   5 +-
 5 files changed, 95 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 07fc792..d4fd947 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -128,7 +128,7 @@ static int perf_top__parse_source(struct perf_top *top, 
struct hist_entry *he)
return err;
}
 
-   err = symbol__annotate(sym, map, 0);
+   err = symbol__annotate(sym, map, 0, NULL);
if (err == 0) {
 out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 29dc6d2..3a652a6f 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1050,7 +1050,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map 
*map,
  (nr_pcnt - 1);
}
 
-   if (symbol__annotate(sym, map, sizeof_bdl) < 0) {
+   if (symbol__annotate(sym, map, sizeof_bdl,
+perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__last_msg);
goto out_free_offsets;
}
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 9c7ff8d..d7150b3 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -166,7 +166,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct 
map *map,
if (map->dso->annotate_warned)
return -1;
 
-   if (symbol__annotate(sym, map, 0) < 0) {
+   if (symbol__annotate(sym, map, 0, perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__current);
return -1;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index c385fec..36a5825 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -20,12 +20,14 @@
 #include 
 #include 
 #include 
+#include 
+#include "../arch/common.h"
 
 const char *disassembler_style;
 const char *objdump_path;
 static regex_t  file_lineno;
 
-static struct ins *ins__find(const char *name);
+static struct ins *ins__find(const char *name, const char *norm_arch);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
 
 static void ins__delete(struct ins_operands *ops)
@@ -53,7 +55,8 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
 }
 
-static int call__parse(struct ins_operands *ops)
+static int call__parse(struct ins_operands *ops,
+  __maybe_unused const char *norm_arch)
 {
char *endptr, *tok, *name;
 
@@ -65,10 +68,8 @@ static int call__parse(struct ins_operands *ops)
 
name++;
 
-#ifdef __arm__
-   if (strchr(name, '+'))
+   if (!strcmp(norm_arch, "arm") && strchr(name, '+'))
return -1;
-#endif
 
tok = strchr(name, '>');
if (tok == NULL)
@@ -117,7 +118,8 @@ bool ins__is_call(const struct ins *ins)
return ins->ops == &call_ops;
 }
 
-static int jump__parse(struct ins_operands *ops)
+static int jump__parse(struct ins_operands *ops,
+  __maybe_unused const char *norm_arch)
 {
const char *s = strchr(ops->raw, '+');
 
@@ -172,7 +174,7 @@ static int comment__symbol(char *raw, char *comment, u64 
*addrp, char **namep)
return 0;
 }
 
-static int lock__parse(struct ins_operands *ops)
+static int lock__parse(struct ins_operands *ops, const char *norm_arch)
 {
char *name;
 
@@ -183,7 +185,7 @@ static int lock__parse(struct ins_operands *ops)
if (disasm_line__parse(ops->raw, &name, &ops->locked.ops->raw) < 0)
goto out_free_ops;
 
-   ops->locked.ins = ins__find

[PATCH v3 3/4] perf annotate: add powerpc support

2016-06-30 Thread Ravi Bangoria
From: Naveen N. Rao <naveen.n@linux.vnet.ibm.com> 

Powerpc has a long list of branch instructions, and hardcoding them in
a table appears to be error-prone. So, add a new function that finds an
instruction instead of creating a table. This function dynamically
builds the table (a list of 'struct ins') and, instead of creating an
object every time, first checks whether the list already contains an
object for that instruction.
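
The mnemonic rules that ins__find_powerpc() applies in this version of
the patch can be summarized in a standalone sketch. The enum and the
function names are hypothetical; the rules themselves are taken from
the comments and checks in the diff in this mail.

```c
#include <string.h>

enum ppc_ins_kind { PPC_INS_NONE, PPC_INS_JUMP, PPC_INS_CALL, PPC_INS_RET };

/* 'bnl' (branch if not less than) and its variants are not calls. */
static int ppc_is_bnl_sketch(const char *name)
{
	return !strcmp(name, "bnl") || !strcmp(name, "bnl+") ||
	       !strcmp(name, "bnl-") || !strcmp(name, "bnla") ||
	       !strcmp(name, "bnla+") || !strcmp(name, "bnla-");
}

static enum ppc_ins_kind ppc_classify_sketch(const char *name)
{
	enum ppc_ins_kind kind = PPC_INS_JUMP;
	size_t i;

	/* Only 'b*' mnemonics, minus non-branches and ctr/tar forms. */
	if (name[0] != 'b' ||
	    !strncmp(name, "bcd", 3) || !strncmp(name, "brinc", 5) ||
	    !strncmp(name, "bper", 4) ||
	    strstr(name, "ctr") || strstr(name, "tar"))
		return PPC_INS_NONE;

	/* name[0] is 'b', so the string has at least one character. */
	i = strlen(name) - 1;

	/* Ignore an optional branch hint at the end of the mnemonic. */
	if (i > 0 && (name[i] == '+' || name[i] == '-'))
		i--;

	/* A trailing 'l' or 'la' updates LR, so treat it as a call. */
	if (name[i] == 'l' || (i > 0 && name[i] == 'a' && name[i - 1] == 'l')) {
		if (!ppc_is_bnl_sketch(name))
			kind = PPC_INS_CALL;
	}

	/* A trailing 'lr' is treated as a return. */
	if (i > 0 && name[i] == 'r' && name[i - 1] == 'l')
		kind = PPC_INS_RET;

	return kind;
}
```

For example, 'bne-' comes out as a jump, 'bl' as a call, 'blr' as a
return, and 'bctr' is ignored.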

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v3:
  - Optimized code
  - Corrected one memory leak

 tools/perf/util/annotate.c | 126 +
 1 file changed, 126 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..b87eac7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -461,6 +461,11 @@ static struct ins instructions_arm[] = {
{ .name = "bne",   .ops  = &jump_ops, },
 };
 
+struct instructions_powerpc {
+   struct ins *ins;
+   struct list_head list;
+};
+
 static int ins__key_cmp(const void *name, const void *insp)
 {
const struct ins *ins = insp;
@@ -476,6 +481,125 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
 }
 
+static struct ins *list_add__ins_powerpc(struct instructions_powerpc *head,
+const char *name, struct ins_ops *ops)
+{
+   struct instructions_powerpc *ins_powerpc;
+   struct ins *ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins_powerpc = zalloc(sizeof(struct instructions_powerpc));
+   if (!ins_powerpc)
+   goto out_free_ins;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   goto out_free_ins_power;
+
+   ins->ops = ops;
+   ins_powerpc->ins = ins;
+   list_add_tail(&(ins_powerpc->list), &(head->list));
+
+   return ins;
+
+out_free_ins_power:
+   zfree(&ins_powerpc);
+out_free_ins:
+   zfree(&ins);
+   return NULL;
+}
+
+static struct ins *list_search__ins_powerpc(struct instructions_powerpc *head,
+   const char *name)
+{
+   struct instructions_powerpc *pos;
+
+   list_for_each_entry(pos, &head->list, list) {
+   if (!strcmp(pos->ins->name, name))
+   return pos->ins;
+   }
+   return NULL;
+}
+
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+   struct ins_ops *ops;
+   static struct instructions_powerpc head;
+   static bool list_initialized;
+
+   /*
+* - Interested only if instruction starts with 'b'.
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (name[0] != 'b' ||
+   !strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;
+
+   if (!list_initialized) {
+   INIT_LIST_HEAD(&head.list);
+   list_initialized = true;
+   }
+
+   /*
+* Return if we already have object of 'struct ins' for this
+* instruction
+*/
+   ins = list_search__ins_powerpc(&head, name);
+   if (ins)
+   return ins;
+
+   ops = &jump_ops;
+
+   i = strlen(name) - 1;
+   if (i < 0)
+   return NULL;
+
+   /* ignore optional hints at the end of the instructions */
+   if (name[i] == '+' || name[i] == '-')
+   i--;
+
+   if (name[i] == 'l' || (name[i] == 'a' && name[i-1] == 'l')) {
+   /*
+* if the instruction ends up with 'l' or 'la', then
+* those are considered 'calls' since they update LR.
+* ... except for 'bnl' which is branch if not less than
+* and the absolute form of the same.
+*/
+   if (strcmp(name, "bnl") && strcmp(name, "bnl+") &&
+   strcmp(name, "bnl-") && strcmp(name, "bnla") &&
+   strcmp(name, "bnla+") && strcmp(name, "bnla-"))
+   ops = &call_ops;
+   }
+   if (name[i] == 'r' && name[i-1] == 'l')
+   /*
+* instructions ending with 'lr' are considered to be
+* return instructions
+*/
+   ops = &ret_ops;
+
+   /*
+* Add instruction to list so next tim

[PATCH v3 4/4] perf: Define macro for normalized arch names

2016-06-30 Thread Ravi Bangoria
Define a macro for each normalized arch name and use the macros instead
of arch name string literals.
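
A reduced sketch of how the macros are meant to be used, covering only
a few of the archs handled by normalize_arch(). The function name
'normalize_arch_sketch' and the explicit length check are additions for
this example. Note that the macros expand to string literals, so
callers still compare with strcmp() rather than by pointer.

```c
#include <string.h>

#define NORM_X86     "x86"
#define NORM_ARM64   "arm64"
#define NORM_ARM     "arm"
#define NORM_POWERPC "powerpc"

/* Map a uname-style machine string to its normalized arch name. */
static const char *normalize_arch_sketch(const char *arch)
{
	if (!strcmp(arch, "x86_64"))
		return NORM_X86;
	/* i386 .. i686; the length check keeps the index reads in bounds. */
	if (strlen(arch) >= 4 && arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
		return NORM_X86;
	/* arm64 must be checked before the generic "arm" prefix. */
	if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
		return NORM_ARM64;
	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
		return NORM_ARM;
	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
		return NORM_POWERPC;

	/* Unknown names are passed through unchanged. */
	return arch;
}
```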

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v3:
  - No changes

 tools/perf/arch/common.c   | 36 ++--
 tools/perf/arch/common.h   | 11 +++
 tools/perf/util/annotate.c | 10 +-
 tools/perf/util/unwind-libunwind.c |  4 ++--
 4 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index ee69668..feb2113 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -122,25 +122,25 @@ static int lookup_triplets(const char *const *triplets, 
const char *name)
 const char *normalize_arch(char *arch)
 {
if (!strcmp(arch, "x86_64"))
-   return "x86";
+   return NORM_X86;
if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-   return "x86";
+   return NORM_X86;
if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-   return "sparc";
+   return NORM_SPARC;
if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
-   return "arm64";
+   return NORM_ARM64;
if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-   return "arm";
+   return NORM_ARM;
if (!strncmp(arch, "s390", 4))
-   return "s390";
+   return NORM_S390;
if (!strncmp(arch, "parisc", 6))
-   return "parisc";
+   return NORM_PARISC;
if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-   return "powerpc";
+   return NORM_POWERPC;
if (!strncmp(arch, "mips", 4))
-   return "mips";
+   return NORM_MIPS;
if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-   return "sh";
+   return NORM_SH;
 
return arch;
 }
@@ -180,21 +180,21 @@ static int perf_env__lookup_binutils_path(struct perf_env 
*env,
zfree();
}
 
-   if (!strcmp(arch, "arm"))
+   if (!strcmp(arch, NORM_ARM))
path_list = arm_triplets;
-   else if (!strcmp(arch, "arm64"))
+   else if (!strcmp(arch, NORM_ARM64))
path_list = arm64_triplets;
-   else if (!strcmp(arch, "powerpc"))
+   else if (!strcmp(arch, NORM_POWERPC))
path_list = powerpc_triplets;
-   else if (!strcmp(arch, "sh"))
+   else if (!strcmp(arch, NORM_SH))
path_list = sh_triplets;
-   else if (!strcmp(arch, "s390"))
+   else if (!strcmp(arch, NORM_S390))
path_list = s390_triplets;
-   else if (!strcmp(arch, "sparc"))
+   else if (!strcmp(arch, NORM_SPARC))
path_list = sparc_triplets;
-   else if (!strcmp(arch, "x86"))
+   else if (!strcmp(arch, NORM_X86))
path_list = x86_triplets;
-   else if (!strcmp(arch, "mips"))
+   else if (!strcmp(arch, NORM_MIPS))
path_list = mips_triplets;
else {
ui__error("binutils for %s not supported.\n", arch);
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 6b01c73..14ca8ca 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -5,6 +5,17 @@
 
 extern const char *objdump_path;
 
+/* Macro for normalized arch names */
+#define NORM_X86   "x86"
+#define NORM_SPARC "sparc"
+#define NORM_ARM64 "arm64"
+#define NORM_ARM   "arm"
+#define NORM_S390  "s390"
+#define NORM_PARISC"parisc"
+#define NORM_POWERPC   "powerpc"
+#define NORM_MIPS  "mips"
+#define NORM_SH"sh"
+
 int perf_env__lookup_objdump(struct perf_env *env);
 const char *normalize_arch(char *arch);
 
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index b87eac7..fce60b4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -68,7 +68,7 @@ static int call__parse(struct ins_operands *ops,
 
name++;
 
-   if (!strcmp(norm_arch, "arm") && strchr(name, '+'))
+   if (!strcmp(norm_arch, NORM_ARM) && strchr(name, '+'))
return -1;
 
tok = strchr(name, '>');
@@ -255,7 +255,7 @@ static int mov__parse(struct ins_operands *ops,
 
target = ++s;
 
-   if (!strcmp(norm_arch, "arm"))
+   if (!strcmp(norm_arch, NORM_ARM))
comment = strchr(s, 

Re: [PATCH v3 3/4] perf annotate: add powerpc support

2016-07-01 Thread Ravi Bangoria

Thanks Michael for your suggestion.

On Thursday 30 June 2016 11:51 AM, Michael Ellerman wrote:

On Thu, 2016-06-30 at 11:44 +0530, Ravi Bangoria wrote:

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..b87eac7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -476,6 +481,125 @@ static int ins__cmp(const void *a, const void *b)

...

+
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+   struct ins_ops *ops;
+   static struct instructions_powerpc head;
+   static bool list_initialized;
+
+   /*
+* - Interested only if instruction starts with 'b'.
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (name[0] != 'b' ||
+   !strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;

It would be good if 'bctr' was at least recognised as a branch, even if we
can't determine the target. They are very common.


We can not show an arrow for this since we don't know the target location.
Can you please suggest how you intend perf to display bctr?

bctr can be classified into two variants -- 'bctr' and 'bctrl'.

'bctr' will be considered a jump instruction, but jump__parse() won't
be able to find any target location and hence it will set the target to
UINT64_MAX, which transforms 'bctr' into 'bctr UINT64_MAX'. This
looks misleading.

'bctrl' will be considered a call instruction, but call__parse() won't
be able to find any target function and hence it won't show any
navigation arrow for this instruction, which is the same as filtering
it out beforehand.


It doesn't look like we have the opcode handy here? Could we get it somehow?
That would make this a *lot* more robust.


objdump prints the machine code, but I don't know how difficult it would
be to parse that to get the opcode.

-Ravi


cheers





Re: [PATCH v3 3/4] perf annotate: add powerpc support

2016-07-01 Thread Ravi Bangoria

Hi Balbir,

On Friday 01 July 2016 06:18 PM, Balbir Singh wrote:

On Fri, 2016-07-01 at 14:13 +0530, Ravi Bangoria wrote:

Thanks Michael for your suggestion.
  
On Thursday 30 June 2016 11:51 AM, Michael Ellerman wrote:
  
On Thu, 2016-06-30 at 11:44 +0530, Ravi Bangoria wrote:
  
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c

index 36a5825..b87eac7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -476,6 +481,125 @@ static int ins__cmp(const void *a, const void *b)

...
  
+

+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+   struct ins_ops *ops;
+   static struct instructions_powerpc head;
+   static bool list_initialized;
+
+   /*
+* - Interested only if instruction starts with 'b'.
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (name[0] != 'b' ||
+   !strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   return NULL;

It would be good if 'bctr' was at least recognised as a branch, even if we
can't determine the target. They are very common.

We can not show arrow for this since we don't know the target location.
can you please suggest how you intends perf to display bctr?

bctr can be classified into two variants -- 'bctr' and 'bctrl'.

'bctr' will be considered as jump instruction but jump__parse() won't
be able to find any target location and hence it will set target to
UINT64_MAX which transform 'bctr' to 'bctr UINT64_MAX'. This
looks misleading.

bctrl will be considered as call instruction but call_parse() won't
be able to find any target function and hence it won't show any
navigation arrow for this instruction. Which is same as filter it
beforehand.


The target location and function are in the counter. Can't we add
this to instruction ops? Is it a major change to add it?


Of course we can add it.

What I mean is that we can not determine the target location statically
by parsing objdump output. For example, consider this snippet:

objdump output:

  c0143848:   lwarx   r8,0,r10
  c014384c:   addic   r8,r8,1
  c0143850:   stwcx.  r8,0,r10
  c0143854:   bne-    c0143848 <.rcu_idle_exit+0x58>

corresponding perf annotate output:

  58:  lwarx  r8,0,r10
 addic  r8,r8,1
 stwcx. r8,0,r10
 bne-   58

The TUI will show an arrow before the 'bne- 58' instruction, which
indicates that it is a jump instruction. When we focus on the 'bne- 58'
instruction, the arrow will span from that instruction to the
instruction at offset 58 (lwarx). Pressing Enter moves the focus to
the target.

In the case of 'bctr', we can not determine the target location
statically and hence we can not provide any navigation options. The
same goes for 'bctrl' as well.
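
To make the difference concrete, here is a small sketch of the kind of
operand parsing jump__parse() does, as described earlier in this
thread: the address is read from the start of the operand text and the
offset from whatever follows '+', with UINT64_MAX used when there is
no '+'. The struct and function names are made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct jump_target_sketch {
	uint64_t addr;
	uint64_t offset;
};

/* Parse the operand text objdump prints after a branch mnemonic. */
static void jump_parse_sketch(const char *raw, struct jump_target_sketch *t)
{
	const char *plus = strchr(raw, '+');

	/* Leading hex number is the absolute target, 0 if there is none. */
	t->addr = strtoull(raw, NULL, 16);

	/* The part after '+' is the offset inside the symbol. */
	t->offset = plus ? strtoull(plus + 1, NULL, 16) : UINT64_MAX;
}
```

For "c0143848 <.rcu_idle_exit+0x58>" this yields an offset of 0x58,
which is what lets the TUI draw the arrow. For 'bctr' the operand text
is empty, so there is no address and the offset stays at UINT64_MAX.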

Please correct me if I misunderstood anything.

-Ravi

  
Balbir Singh.






Re: [PATCH v2 3/4] perf annotate: add powerpc support

2016-06-29 Thread Ravi Bangoria

Thanks Naveen,

On Wednesday 29 June 2016 08:15 PM, Naveen N. Rao wrote:

On 2016/06/29 04:45PM, Ravi Bangoria wrote:

From: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

Powerpc has a long list of branch instructions, and hardcoding them in
a table appears to be error-prone. So, add a new function to find an
instruction instead of creating a table. This function dynamically
builds a list of 'struct ins' and, instead of creating an object every
time, first checks whether the list already contains an object for that
mnemonic.

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
   - Corrected a few memory leaks.
   - Created a dynamic list for powerpc to optimize memory consumption

  tools/perf/util/annotate.c | 121 +
  1 file changed, 121 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..812bfad 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -461,6 +461,11 @@ static struct ins instructions_arm[] = {
{ .name = "bne",   .ops  = &jump_ops, },
  };

+struct instructions_powerpc {
+   struct ins *ins;
+   struct list_head list;
+};
+
  static int ins__key_cmp(const void *name, const void *insp)
  {
const struct ins *ins = insp;
@@ -476,6 +481,120 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
  }

+static int list_add__ins_powerpc(struct instructions_powerpc *head,
+struct ins *ins)
+{
+   struct instructions_powerpc *ins_powerpc;
+
+   ins_powerpc = zalloc(sizeof(struct instructions_powerpc));
+   if (!ins_powerpc)
+   return -1;
+
+   ins_powerpc->ins = ins;
+   list_add_tail(&(ins_powerpc->list), &(head->list));
+
+   return 0;
+}
+
+static struct ins *list_search__ins_powerpc(struct instructions_powerpc *head,
+   const char *name)
+{
+   struct instructions_powerpc *pos;
+
+   list_for_each_entry(pos, &head->list, list) {
+   if (!strcmp(pos->ins->name, name))
+   return pos->ins;
+   }
+   return NULL;
+}
+
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+   static struct instructions_powerpc head;
+   static bool list_initialized;
+
+   if (!list_initialized) {
+   INIT_LIST_HEAD(&head.list);
+   list_initialized = true;
+   }
+
+   /*
+* Search if we already created object of 'struct ins'
+* for this instruction
+*/
+   ins = list_search__ins_powerpc(&head, name);
+   if (ins)
+   return ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   goto err;

You can move the above two inside the below if condition, so that you
only allocate memory if needed.

Or, what would be better would be to pass 'name' and the appropriate ops
pointer to the helper above (list_add__ins_powerpc) and have that
allocate 'struct ins' and insert into the list.


Yes I will think about this.


+
+   if (name[0] == 'b') {
+   /* branch instructions */
+   ins->ops = &jump_ops;
+
+   /*
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (!strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   goto err;

You are still leaking ins->name here.


Ah!! Sorry. I missed that we are using strdup here. Will correct it.

-Ravi



[PATCH v2 3/4] perf annotate: add powerpc support

2016-06-29 Thread Ravi Bangoria
From: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>

Powerpc has a long list of branch instructions, and hardcoding them in
a table appears to be error-prone. So, add a new function to find an
instruction instead of creating a table. This function dynamically
builds a list of 'struct ins' and, instead of creating an object every
time, first checks whether the list already contains an object for that
mnemonic.
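
The lookup-or-create idea can be sketched in isolation as follows. This is only an illustration under assumptions: 'struct ins_entry' and ins_cache__find() are hypothetical stand-ins for perf's 'struct ins' and the function added by this patch, and a plain singly linked list replaces the kernel-style list_head:

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's 'struct ins'; only the name is kept. */
struct ins_entry {
	char *name;
	struct ins_entry *next;
};

static struct ins_entry *ins_cache;	/* head of the cache list */

/* Return the cached entry for 'name', creating it on first use. */
static struct ins_entry *ins_cache__find(const char *name)
{
	struct ins_entry *e;

	assert(name != NULL);

	/* Reuse the object if this mnemonic was seen before. */
	for (e = ins_cache; e; e = e->next) {
		if (!strcmp(e->name, name))
			return e;
	}

	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;

	e->name = strdup(name);
	if (!e->name) {
		free(e);	/* the entry owns nothing else yet */
		return NULL;
	}

	/* Link the new entry in so that the next lookup finds it. */
	e->next = ins_cache;
	ins_cache = e;
	return e;
}
```

Two lookups of the same mnemonic return the same pointer. Any error path taken after strdup() has succeeded would have to free both the name and the entry; deciding whether a mnemonic is wanted before allocating avoids that cleanup altogether.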

Signed-off-by: Naveen N. Rao <naveen.n@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - Corrected a few memory leaks.
  - Created a dynamic list for powerpc to optimize memory consumption

 tools/perf/util/annotate.c | 121 +
 1 file changed, 121 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 36a5825..812bfad 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -461,6 +461,11 @@ static struct ins instructions_arm[] = {
{ .name = "bne",   .ops  = &jump_ops, },
 };
 
+struct instructions_powerpc {
+   struct ins *ins;
+   struct list_head list;
+};
+
 static int ins__key_cmp(const void *name, const void *insp)
 {
const struct ins *ins = insp;
@@ -476,6 +481,120 @@ static int ins__cmp(const void *a, const void *b)
return strcmp(ia->name, ib->name);
 }
 
+static int list_add__ins_powerpc(struct instructions_powerpc *head,
+struct ins *ins)
+{
+   struct instructions_powerpc *ins_powerpc;
+
+   ins_powerpc = zalloc(sizeof(struct instructions_powerpc));
+   if (!ins_powerpc)
+   return -1;
+
+   ins_powerpc->ins = ins;
+   list_add_tail(&(ins_powerpc->list), &(head->list));
+
+   return 0;
+}
+
+static struct ins *list_search__ins_powerpc(struct instructions_powerpc *head,
+   const char *name)
+{
+   struct instructions_powerpc *pos;
+
+   list_for_each_entry(pos, &head->list, list) {
+   if (!strcmp(pos->ins->name, name))
+   return pos->ins;
+   }
+   return NULL;
+}
+
+static struct ins *ins__find_powerpc(const char *name)
+{
+   int i;
+   struct ins *ins;
+   static struct instructions_powerpc head;
+   static bool list_initialized;
+
+   if (!list_initialized) {
+   INIT_LIST_HEAD(&head.list);
+   list_initialized = true;
+   }
+
+   /*
+* Search if we already created object of 'struct ins'
+* for this instruction
+*/
+   ins = list_search__ins_powerpc(&head, name);
+   if (ins)
+   return ins;
+
+   ins = zalloc(sizeof(struct ins));
+   if (!ins)
+   return NULL;
+
+   ins->name = strdup(name);
+   if (!ins->name)
+   goto err;
+
+   if (name[0] == 'b') {
+   /* branch instructions */
+   ins->ops = &jump_ops;
+
+   /*
+* - Few start with 'b', but aren't branch instructions.
+* - Let's also ignore instructions involving 'ctr' and
+*   'tar' since target branch addresses for those can't
+*   be determined statically.
+*/
+   if (!strncmp(name, "bcd", 3)   ||
+   !strncmp(name, "brinc", 5) ||
+   !strncmp(name, "bper", 4)  ||
+   strstr(name, "ctr")||
+   strstr(name, "tar"))
+   goto err;
+
+   i = strlen(name) - 1;
+   if (i < 0)
+   goto err;
+
+   /* ignore optional hints at the end of the instructions */
+   if (name[i] == '+' || name[i] == '-')
+   i--;
+
+   if (name[i] == 'l' || (name[i] == 'a' && name[i-1] == 'l')) {
+   /*
+* if the instruction ends up with 'l' or 'la', then
+* those are considered 'calls' since they update LR.
+* ... except for 'bnl' which is branch if not less than
+* and the absolute form of the same.
+*/
+   if (strcmp(name, "bnl") && strcmp(name, "bnl+") &&
+   strcmp(name, "bnl-") && strcmp(name, "bnla") &&
+   strcmp(name, "bnla+") && strcmp(name, "bnla-"))
+   ins->ops = &call_ops;
+   }
+   if (name[i] == 'r' && name[i-1] == 'l')
+   /*
+* instructions ending with 'lr' are considered to be
+* return instructions
+*/

[PATCH v2 2/4] perf annotate: Enable cross arch annotate

2016-06-29 Thread Ravi Bangoria
Change the current data structures and functions to enable cross-arch
annotate.

The current implementation has no logic for recording on one arch and
annotating on another. Such remote annotation is only partially possible
today, for x86 (and maybe arm). To make remote annotation work
properly, the instruction tables of all architectures need to be
included in the perf binary, and annotate has to look up the table of
the arch on which perf.data was recorded.

For arm, a few instructions were defined under #if __arm__, which I've
used as the table for arm. But I'm not sure whether the instructions
defined outside of that block also contain arm instructions. Apart from
that, 'call__parse()' and 'mov__parse()' contain #ifdef __arm__
directives. I've changed them to if (!strcmp(norm_arch, "arm")), but
I've not tested this either.
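
The move from compile-time #ifdef to a run-time choice can be pictured with the sketch below. It is a simplified illustration, not the code of this patch: the two tables are tiny hypothetical samples, and branch_table() and is_branch() are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much reduced tables; perf's real ones are larger. */
static const char *const x86_branches[] = { "jmp", "je", "jne", NULL };
static const char *const arm_branches[] = { "b", "beq", "bne", NULL };

/*
 * Pick the table for the arch on which perf.data was recorded.
 * 'norm_arch' is assumed to be a normalized name such as "x86" or "arm".
 * Returns NULL when no table exists for that arch.
 */
static const char *const *branch_table(const char *norm_arch)
{
	if (!norm_arch)
		return NULL;
	if (!strcmp(norm_arch, "x86"))
		return x86_branches;
	if (!strcmp(norm_arch, "arm"))
		return arm_branches;
	return NULL;
}

/* Return 1 when 'name' is listed as a branch for 'norm_arch', else 0. */
static int is_branch(const char *norm_arch, const char *name)
{
	const char *const *t = branch_table(norm_arch);

	assert(name != NULL);

	for (; t && *t; t++) {
		if (!strcmp(*t, name))
			return 1;
	}
	return 0;
}
```

Because every table is compiled in, a perf binary built on x86 can still answer is_branch("arm", "bne"). The NULL check matters in this sketch because a caller may have no arch string to pass.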

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - No changes

 tools/perf/builtin-top.c  |   2 +-
 tools/perf/ui/browsers/annotate.c |   3 +-
 tools/perf/ui/gtk/annotate.c  |   2 +-
 tools/perf/util/annotate.c| 136 --
 tools/perf/util/annotate.h|   5 +-
 5 files changed, 95 insertions(+), 53 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 07fc792..d4fd947 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -128,7 +128,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
return err;
}
 
-   err = symbol__annotate(sym, map, 0);
+   err = symbol__annotate(sym, map, 0, NULL);
if (err == 0) {
 out_assign:
top->sym_filter_entry = he;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 29dc6d2..3a652a6f 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -1050,7 +1050,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
  (nr_pcnt - 1);
}
 
-   if (symbol__annotate(sym, map, sizeof_bdl) < 0) {
+   if (symbol__annotate(sym, map, sizeof_bdl,
+perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__last_msg);
goto out_free_offsets;
}
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 9c7ff8d..d7150b3 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -166,7 +166,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
if (map->dso->annotate_warned)
return -1;
 
-   if (symbol__annotate(sym, map, 0) < 0) {
+   if (symbol__annotate(sym, map, 0, perf_evsel__env_arch(evsel)) < 0) {
ui__error("%s", ui_helpline__current);
return -1;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index c385fec..36a5825 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -20,12 +20,14 @@
 #include 
 #include 
 #include 
+#include 
+#include "../arch/common.h"
 
 const char *disassembler_style;
 const char *objdump_path;
 static regex_t  file_lineno;
 
-static struct ins *ins__find(const char *name);
+static struct ins *ins__find(const char *name, const char *norm_arch);
 static int disasm_line__parse(char *line, char **namep, char **rawp);
 
 static void ins__delete(struct ins_operands *ops)
@@ -53,7 +55,8 @@ int ins__scnprintf(struct ins *ins, char *bf, size_t size,
return ins__raw_scnprintf(ins, bf, size, ops);
 }
 
-static int call__parse(struct ins_operands *ops)
+static int call__parse(struct ins_operands *ops,
+  __maybe_unused const char *norm_arch)
 {
char *endptr, *tok, *name;
 
@@ -65,10 +68,8 @@ static int call__parse(struct ins_operands *ops)
 
name++;
 
-#ifdef __arm__
-   if (strchr(name, '+'))
+   if (!strcmp(norm_arch, "arm") && strchr(name, '+'))
return -1;
-#endif
 
tok = strchr(name, '>');
if (tok == NULL)
@@ -117,7 +118,8 @@ bool ins__is_call(const struct ins *ins)
return ins->ops == &call_ops;
 }
 
-static int jump__parse(struct ins_operands *ops)
+static int jump__parse(struct ins_operands *ops,
+  __maybe_unused const char *norm_arch)
 {
const char *s = strchr(ops->raw, '+');
 
@@ -172,7 +174,7 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
return 0;
 }
 
-static int lock__parse(struct ins_operands *ops)
+static int lock__parse(struct ins_operands *ops, const char *norm_arch)
 {
char *name;
 
@@ -183,7 +185,7 @@ static int lock__parse(struct ins_operands *ops)
if (disasm_line__parse(ops->raw, &name, &ops->locked.ops->raw) < 0)
goto out_free_ops;
 
-   

[PATCH v2 0/4] perf annotate: Enable cross arch annotate

2016-06-29 Thread Ravi Bangoria
Perf currently supports code navigation (branches and calls) in
annotate only when it runs on the same architecture on which perf.data
was recorded. Cross-arch annotate is not supported.

This patchset enables cross-arch annotate. Currently I've used the x86
and arm instructions which are already available, and I'm adding
support for powerpc as well. Adding support for other archs will be easy.

I've created this patchset on top of acme/perf/core and tested it with
x86 and powerpc only.

Example:

  Record on powerpc:
  $ ./perf record -a

  Report -> Annotate on x86:
  $ ./perf report -i perf.data.powerpc --vmlinux vmlinux.powerpc

Changes in v2:
  - Corrected a few memory leaks.
  - Created a dynamic list for powerpc to optimize memory consumption

Naveen N. Rao (1):
  perf annotate: add powerpc support

Ravi Bangoria (3):
  perf: Utility function to fetch arch
  perf annotate: Enable cross arch annotate
  perf: Define macro for arch names

 tools/perf/arch/common.c   |  36 +++---
 tools/perf/arch/common.h   |  11 ++
 tools/perf/builtin-top.c   |   2 +-
 tools/perf/ui/browsers/annotate.c  |   3 +-
 tools/perf/ui/gtk/annotate.c   |   2 +-
 tools/perf/util/annotate.c | 255 ++---
 tools/perf/util/annotate.h |   5 +-
 tools/perf/util/evsel.c|   7 +
 tools/perf/util/evsel.h|   2 +
 tools/perf/util/unwind-libunwind.c |   4 +-
 10 files changed, 255 insertions(+), 72 deletions(-)

--
2.5.5



[PATCH v2 4/4] perf annotate: Define macro for arch names

2016-06-29 Thread Ravi Bangoria
Define a macro for each arch name and use these macros instead of
arch name string literals.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - No changes

 tools/perf/arch/common.c   | 36 ++--
 tools/perf/arch/common.h   | 11 +++
 tools/perf/util/annotate.c | 10 +-
 tools/perf/util/unwind-libunwind.c |  4 ++--
 4 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index ee69668..feb2113 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -122,25 +122,25 @@ static int lookup_triplets(const char *const *triplets, const char *name)
 const char *normalize_arch(char *arch)
 {
if (!strcmp(arch, "x86_64"))
-   return "x86";
+   return NORM_X86;
if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-   return "x86";
+   return NORM_X86;
if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-   return "sparc";
+   return NORM_SPARC;
if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
-   return "arm64";
+   return NORM_ARM64;
if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-   return "arm";
+   return NORM_ARM;
if (!strncmp(arch, "s390", 4))
-   return "s390";
+   return NORM_S390;
if (!strncmp(arch, "parisc", 6))
-   return "parisc";
+   return NORM_PARISC;
if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-   return "powerpc";
+   return NORM_POWERPC;
if (!strncmp(arch, "mips", 4))
-   return "mips";
+   return NORM_MIPS;
if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-   return "sh";
+   return NORM_SH;
 
return arch;
 }
@@ -180,21 +180,21 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
zfree();
}
 
-   if (!strcmp(arch, "arm"))
+   if (!strcmp(arch, NORM_ARM))
path_list = arm_triplets;
-   else if (!strcmp(arch, "arm64"))
+   else if (!strcmp(arch, NORM_ARM64))
path_list = arm64_triplets;
-   else if (!strcmp(arch, "powerpc"))
+   else if (!strcmp(arch, NORM_POWERPC))
path_list = powerpc_triplets;
-   else if (!strcmp(arch, "sh"))
+   else if (!strcmp(arch, NORM_SH))
path_list = sh_triplets;
-   else if (!strcmp(arch, "s390"))
+   else if (!strcmp(arch, NORM_S390))
path_list = s390_triplets;
-   else if (!strcmp(arch, "sparc"))
+   else if (!strcmp(arch, NORM_SPARC))
path_list = sparc_triplets;
-   else if (!strcmp(arch, "x86"))
+   else if (!strcmp(arch, NORM_X86))
path_list = x86_triplets;
-   else if (!strcmp(arch, "mips"))
+   else if (!strcmp(arch, NORM_MIPS))
path_list = mips_triplets;
else {
ui__error("binutils for %s not supported.\n", arch);
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 6b01c73..14ca8ca 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -5,6 +5,17 @@
 
 extern const char *objdump_path;
 
+/* Macro for normalized arch names */
+#define NORM_X86       "x86"
+#define NORM_SPARC     "sparc"
+#define NORM_ARM64     "arm64"
+#define NORM_ARM       "arm"
+#define NORM_S390      "s390"
+#define NORM_PARISC    "parisc"
+#define NORM_POWERPC   "powerpc"
+#define NORM_MIPS      "mips"
+#define NORM_SH        "sh"
+
 int perf_env__lookup_objdump(struct perf_env *env);
 const char *normalize_arch(char *arch);
 
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 812bfad..8c27486 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -68,7 +68,7 @@ static int call__parse(struct ins_operands *ops,
 
name++;
 
-   if (!strcmp(norm_arch, "arm") && strchr(name, '+'))
+   if (!strcmp(norm_arch, NORM_ARM) && strchr(name, '+'))
return -1;
 
tok = strchr(name, '>');
@@ -255,7 +255,7 @@ static int mov__parse(struct ins_operands *ops,
 
target = ++s;
 
-   if (!strcmp(norm_arch, "arm"))
+   if (!strcmp(norm_arch, NORM_ARM))
comment = strchr(s, ';');
else
  

[PATCH v2 1/4] perf: Utility function to fetch arch

2016-06-29 Thread Ravi Bangoria
Add a utility function to fetch the arch through evsel
(evsel->evlist->env->arch).

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - No changes

 tools/perf/util/evsel.c | 7 +++
 tools/perf/util/evsel.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1d8f2bb..0fea724 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2422,3 +2422,10 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 err, strerror_r(err, sbuf, sizeof(sbuf)),
 perf_evsel__name(evsel));
 }
+
+char *perf_evsel__env_arch(struct perf_evsel *evsel)
+{
+   if (evsel && evsel->evlist && evsel->evlist->env)
+   return evsel->evlist->env->arch;
+   return NULL;
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 828ddd1..86fed7a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -435,4 +435,6 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);
 int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 attr__fprintf_f attr__fprintf, void *priv);
 
+char *perf_evsel__env_arch(struct perf_evsel *evsel);
+
 #endif /* __PERF_EVSEL_H */
-- 
2.5.5



[PATCH 3/5] perf/sdt/x86: Move OP parser to tools/perf/arch/x86/

2017-02-02 Thread Ravi Bangoria
An SDT marker argument is in N@OP format. N is the size of the argument
and OP is the actual assembly operand. OP is an arch-dependent component
and hence its parsing logic should also be placed under tools/perf/arch/.
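
To make the N@OP split concrete, here is a minimal sketch of the arch-independent half: it separates N from OP and derives a uprobe type suffix from N. The function name sdt_arg_split() is hypothetical and this is not the parser from probe-file.c; it assumes that a negative N marks a signed argument and that only sizes 1, 2, 4 and 8 are accepted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "N@OP" and write the type suffix for N
 * ("u8", "s32", ...) into 'type'. Returns a pointer to OP inside
 * 'arg', or NULL when 'arg' is malformed.
 */
static const char *sdt_arg_split(const char *arg, char *type, size_t size)
{
	const char *at;
	char *end;
	long n;

	assert(arg != NULL && type != NULL);

	at = strchr(arg, '@');
	if (!at || at == arg)
		return NULL;		/* no '@', or nothing before it */

	n = strtol(arg, &end, 10);
	if (end != at)
		return NULL;		/* N is not a plain number */

	/* Assumption: negative N means signed; |N| is the size in bytes. */
	switch (labs(n)) {
	case 1: case 2: case 4: case 8:
		snprintf(type, size, "%c%ld", n < 0 ? 's' : 'u', labs(n) * 8);
		return at + 1;		/* OP is left for arch code */
	default:
		return NULL;
	}
}
```

For "-4@8" the sketch yields the type "s32" and the operand "8"; for "1@37(28)" it yields "u8" and "37(28)". Turning the operand into uprobe syntax is the arch-specific part that this patch moves under tools/perf/arch/.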

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/x86/util/perf_regs.c |  93 -
 tools/perf/util/perf_regs.c  |   9 ++-
 tools/perf/util/perf_regs.h  |   7 +-
 tools/perf/util/probe-file.c | 127 +--
 4 files changed, 134 insertions(+), 102 deletions(-)

diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index d8a8dcf..34fcb0d 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -3,6 +3,7 @@
 #include "../../perf.h"
 #include "../../util/util.h"
 #include "../../util/perf_regs.h"
+#include "../../util/debug.h"
 
 const struct sample_reg sample_reg_masks[] = {
SMPL_REG(AX, PERF_REG_X86_AX),
@@ -87,7 +88,16 @@ static const struct sdt_name_reg sdt_reg_renamings[] = {
SDT_NAME_REG_END,
 };
 
-int sdt_rename_register(char **pdesc, char *old_name)
+bool arch_sdt_probe_arg_supp(void)
+{
+   return true;
+}
+
+/*
+ * The table sdt_reg_renamings is used for adjusting gcc/gas-generated
+ * registers before filling the uprobe tracer interface.
+ */
+static int sdt_rename_register(char **pdesc, char *old_name)
 {
const struct sdt_name_reg *rnames = sdt_reg_renamings;
char *new_desc, *old_desc = *pdesc;
@@ -129,3 +139,84 @@ int sdt_rename_register(char **pdesc, char *old_name)
 
return 0;
 }
+
+/*
+ * x86 specific implementation
+ * return value:
+ * <0 : error
+ *  0 : success
+ * >0 : skip
+ */
+int arch_sdt_probe_parse_op(char **desc, const char **prefix)
+{
+   char *tmp;
+   int ret = 0;
+
+   /*
+* The uprobe tracer format does not support all the addressing
+* modes (notably: in x86 the scaled mode); so, we detect ','
+* characters, if there is just one, there is no use converting
+* the sdt arg into a uprobe one.
+*
+* Also it does not support constants; if we find one in the
+* current argument, let's skip the argument.
+*/
+   if (strchr(*desc, ',') || strchr(*desc, '$')) {
+   pr_debug4("Skipping unsupported SDT argument; %s\n", *desc);
+   return 1;
+   }
+
+   /*
+* If the argument addressing mode is indirect, we must check
+* a few things...
+*/
+   tmp = strchr(*desc, '(');
+   if (tmp) {
+   int j;
+
+   /*
+* ...if the addressing mode is indirect with a
+* positive offset (ex.: "1608(%ax)"), we need to add
+* a '+' prefix so as to be compliant with uprobe
+* format.
+*/
+   if ((*desc)[0] != '+' && (*desc)[0] != '-')
+   *prefix = ((*desc)[0] == '(') ? "+0" : "+";
+
+   /*
+* ...or if the addressing mode is indirect with a symbol
+* as offset, the argument will not be supported by
+* the uprobe tracer format; so, let's skip this one.
+*/
+   for (j = 0; j < tmp - *desc; j++) {
+   if ((*desc)[j] != '+' && (*desc)[j] != '-' &&
+   !isdigit((*desc)[j])) {
+   pr_debug4("Skipping unsupported SDT argument; "
+   "%s\n", *desc);
+   return 1;
+   }
+   }
+   }
+
+   /*
+* The uprobe parser does not support all gas register names;
+* so, we have to replace them (ex. for x86_64: %rax -> %ax);
+* the loop below looks for the register names (starting with
+* a '%') and tries to perform the needed renamings.
+*/
+   tmp = strchr(*desc, '%');
+   while (tmp) {
+   size_t offset = tmp - *desc;
+
+   ret = sdt_rename_register(desc, *desc + offset);
+   if (ret < 0)
+   return ret;
+
+   /*
+* The *desc pointer might have changed; so, let's not
+* try to reuse tmp for next lookup
+*/
+   tmp = strchr(*desc + offset + 1, '%');
+   }
+   return 0;
+}
diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
index a37e593..f2b3d0d 100644
--- a/tools/perf/util/perf_regs.c
+++ b/tools/perf/util/perf_regs.c
@@ -6,8 +6,13 @@ const struct sample_reg __weak sample_reg_masks[] = {
SMPL_REG_END
 };
 
-int __weak sdt_rename_register(char **pdesc __maybe_unused,
-   char *ol

[PATCH v2] perf/probe: Change MAX_CMDLEN

2017-02-06 Thread Ravi Bangoria
There are many SDT markers on powerpc whose uprobe definition goes
beyond the current MAX_CMDLEN, especially when the target filename is
long and the SDT marker has a long list of arguments. For example, the
definition of the SDT marker

  method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28)

from file

  /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/\
 ppc64/server/libjvm.so

is

  p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\
1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\
arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\
arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\
arg9=+37(%gpr28):u8

Perf probe fails with a segfault for such markers. As the uprobe_events
file accepts definitions of up to 4094 characters (4096 - 2 for '\n' and
'\0'), increase the value of MAX_CMDLEN to 4094.
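
Independent of the limit chosen, a fixed-size command buffer is safest when truncation is detected instead of ignored. The sketch below shows that pattern under assumptions: build_probe_cmd() is a hypothetical helper, not a function from probe-file.c, and the format string only imitates the start of the definition shown above.

```c
#include <assert.h>
#include <stdio.h>

/* 4096 - 2 ('\n' + '\0'), as in the patch */
#define MAX_CMDLEN 4094

/*
 * Hypothetical helper: format the start of a uprobe definition into
 * 'buf' and report an error instead of silently truncating.
 * Returns the length written, or -1 if the definition does not fit.
 */
static int build_probe_cmd(char *buf, size_t size, const char *group,
			   const char *event, const char *file,
			   unsigned long addr)
{
	int len;

	assert(buf != NULL && size > 0);

	len = snprintf(buf, size, "p:%s/%s %s:0x%lx", group, event, file, addr);
	if (len < 0 || (size_t)len >= size)
		return -1;	/* encoding error or truncated output */
	return len;
}
```

A caller would declare 'char cmd[MAX_CMDLEN];' and treat a return value of -1 as "definition too long", which turns an overlong marker into a reportable error.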

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - Set MAX_CMDLEN to 4094 instead of 512

 tools/perf/util/probe-event.c | 1 -
 tools/perf/util/probe-file.c  | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 6a6f44d..e6e3244 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -47,7 +47,6 @@
 #include "probe-file.h"
 #include "session.h"
 
-#define MAX_CMDLEN 256
 #define PERFPROBE_GROUP "probe"
 
 bool probe_event_dry_run;  /* Dry run flag */
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 38eca3c..fdabe7e 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -29,7 +29,8 @@
 #include "session.h"
 #include "perf_regs.h"
 
-#define MAX_CMDLEN 256
+/* 4096 - 2 ('\n' + '\0') */
+#define MAX_CMDLEN 4094
 
 static void print_open_warning(int err, bool uprobe)
 {
-- 
2.9.3



Re: [PATCH 3/5] perf/sdt/x86: Move OP parser to tools/perf/arch/x86/

2017-02-06 Thread Ravi Bangoria
Thanks Masami for the review.

On Tuesday 07 February 2017 08:41 AM, Masami Hiramatsu wrote:
> On Thu,  2 Feb 2017 16:41:41 +0530
> Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:
>
>> SDT marker argument is in N@OP format. N is the size of argument and
>> OP is the actual assembly operand. OP is arch dependent component and
>> hence it's parsing logic also should be placed under tools/perf/arch/.
>>
> Ok, I have one question. Is there any possibility that N is different
> size of OP? e.g. 8@dil, in this case we will record whole rdi.
> is that OK?

Looking at the list of markers on my x86 Fedora 25 box: yes, it's
possible for the register used in OP to be larger than the size
specified by N. For example, -4@68(%rbx). But I don't see any argument
which specifies a larger size in N than the size of the register in OP,
as in your example.

Ravi



[PATCH 0/5] perf/sdt: Argument support for x86 and powerpc

2017-02-02 Thread Ravi Bangoria
The v5 patchset for SDT marker argument support for x86 [1] has a
couple of issues. For example, it still has x86-specific code in
general code. It lacks support for rNN (with size postfix b/w/d),
%rsp, %esp, %sil etc. registers, and such SDT markers fail at
'perf probe'. It also fails to convert arguments that have no offset
but still surround the register with parentheses; for example,
8@(%rdi) is converted to +(%di):u64, which is rejected by
uprobe_events. It also causes failure at 'perf probe' for all SDT
events on all archs except x86. With this patchset, I've solved these
issues. (patch 2,3)

Also, existing perf shows a misleading message when the user tries to
record an SDT event without probing it. I've prepared a patch for
that. (patch 1)

Apart from that, I've also added logic to support arguments of
SDT markers on powerpc. (patch 4)

There are cases where the uprobe definition of an SDT event goes beyond
the current limit MAX_CMDLEN (256), and in such cases perf fails with
a segfault. I've solved this issue. (patch 5)

Note: This patchset is prepared on top of Alexis' v5 series.[1]

[1] http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1292251.html

Ravi Bangoria (5):
  perf/sdt: Show proper hint
  perf/sdt/x86: Add renaming logic for rNN and other registers
  perf/sdt/x86: Move OP parser to tools/perf/arch/x86/
  perf/sdt/powerpc: Add argument support
  perf/probe: Change MAX_CMDLEN

 tools/lib/api/fs/tracing_path.c  |  16 +++-
 tools/perf/arch/powerpc/util/perf_regs.c | 115 ++
 tools/perf/arch/x86/util/perf_regs.c | 137 ---
 tools/perf/util/perf_regs.c  |   9 +-
 tools/perf/util/perf_regs.h  |   7 +-
 tools/perf/util/probe-event.c|   1 -
 tools/perf/util/probe-file.c | 129 -
 7 files changed, 294 insertions(+), 120 deletions(-)

-- 
2.9.3



[PATCH 4/5] perf/sdt/powerpc: Add argument support

2017-02-02 Thread Ravi Bangoria
SDT marker argument is in N@OP format. Here OP is arch dependent
component. Add powerpc logic to parse OP and convert it to uprobe
compatible format.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/perf_regs.c | 115 +++
 1 file changed, 115 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index a3c3e1c..bbd6f91 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -1,5 +1,10 @@
+#include 
+#include 
+
 #include "../../perf.h"
+#include "../../util/util.h"
 #include "../../util/perf_regs.h"
+#include "../../util/debug.h"
 
 const struct sample_reg sample_reg_masks[] = {
SMPL_REG(r0, PERF_REG_POWERPC_R0),
@@ -47,3 +52,113 @@ const struct sample_reg sample_reg_masks[] = {
SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
SMPL_REG_END
 };
+
+bool arch_sdt_probe_arg_supp(void)
+{
+   return true;
+}
+
+static regex_t regex1, regex2;
+
+static int init_op_regex(void)
+{
+   static int initialized;
+
+   if (initialized)
+   return 0;
+
+   /* REG or %rREG */
+   if (regcomp(&regex1, "^(%r)?([1-2]?[0-9]|3[0-1])$", REG_EXTENDED))
+   goto error;
+
+   /* -NUM(REG) or NUM(REG) or -NUM(%rREG) or NUM(%rREG) */
+   if (regcomp(&regex2, "^(\\-)?([0-9]+)\\((%r)?([1-2]?[0-9]|3[0-1])\\)$",
+   REG_EXTENDED))
+   goto free_regex1;
+
+   initialized = 1;
+   return 0;
+
+free_regex1:
+   regfree(&regex1);
+error:
+   pr_debug4("Regex compilation error.\n");
+   return -1;
+}
+
+/*
+ * Parse OP and convert it into uprobe format, which is, +/-NUM(%gprREG).
+ * Possible variants of OP are:
+ * Format        Example
+ * --------------------
+ * NUM(REG)      48(18)
+ * -NUM(REG)     -48(18)
+ * NUM(%rREG)    48(%r18)
+ * -NUM(%rREG)   -48(%r18)
+ * REG           18
+ * %rREG         %r18
+ * iNUM          i0
+ * i-NUM         i-1
+ *
+ * SDT marker arguments on powerpc use the %rREG form with the -mregnames
+ * flag and the REG form with -mno-regnames. Here REG is a general purpose
+ * register in the range 0 to 31.
+ *
+ * return value of the function:
+ * <0 : error
+ *  0 : success
+ * >0 : skip
+ */
+int arch_sdt_probe_parse_op(char **desc, const char **prefix)
+{
+   char *tmp = NULL;
+   size_t new_len;
+   regmatch_t rm[5];
+
+   /* Constant argument. Uprobe does not support it */
+   if (*desc[0] == 'i') {
+   pr_debug4("Skipping unsupported SDT argument: %s\n", *desc);
+   return 1;
+   }
+
+   if (init_op_regex() < 0)
+   return -1;
+
+   if (!regexec(&regex1, *desc, 3, rm, 0)) {
+   /* REG or %rREG --> %gprREG */
+   new_len = 5;
+   new_len += (int)(rm[2].rm_eo - rm[2].rm_so);
+
+   tmp = zalloc(new_len);
+   if (!tmp)
+   return -1;
+
+   scnprintf(tmp, new_len, "%%gpr%.*s",
+   (int)(rm[2].rm_eo - rm[2].rm_so), *desc + rm[2].rm_so);
+   } else if (!regexec(&regex2, *desc, 5, rm, 0)) {
+   /*
+* -NUM(REG) or NUM(REG) or -NUM(%rREG) or NUM(%rREG) -->
+*  +/-NUM(%gprREG)
+*/
+   *prefix = (rm[1].rm_so == -1) ? "+" : "-";
+
+   new_len = 7;
+   new_len += (int)(rm[2].rm_eo - rm[2].rm_so);
+   new_len += (int)(rm[4].rm_eo - rm[4].rm_so);
+
+   tmp = zalloc(new_len);
+   if (!tmp)
+   return -1;
+
+   scnprintf(tmp, new_len, "%.*s(%%gpr%.*s)",
+   (int)(rm[2].rm_eo - rm[2].rm_so), *desc + rm[2].rm_so,
+   (int)(rm[4].rm_eo - rm[4].rm_so), *desc + rm[4].rm_so);
+   } else {
+   pr_debug4("Skipping unsupported SDT argument: %s\n", *desc);
+   return 1;
+   }
+
+   free(*desc);
+   *desc = tmp;
+   return 0;
+}
-- 
2.9.3



[PATCH 2/5] perf/sdt/x86: Add renaming logic for rNN and other registers

2017-02-02 Thread Ravi Bangoria
'perf probe' fails for SDT markers whose arguments use rNN (with
postfix b/w/d), %rsp, %esp, %sil etc. registers. Add renaming
logic for these registers.
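
The table-driven renaming can be sketched as a whole-name lookup. The table below is a shortened, hypothetical sample of such renamings, and rename_reg() is an illustrative name rather than the function in perf_regs.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened table: gas register name -> uprobe name. */
struct reg_rename {
	const char *sdt_name;
	const char *uprobe_name;
};

static const struct reg_rename renames[] = {
	{ "%rsp",  "%sp"  },
	{ "%esp",  "%sp"  },
	{ "%sil",  "%si"  },
	{ "%r8d",  "%r8"  },
	{ "%r10b", "%r10" },
	{ NULL,    NULL   },
};

/*
 * Return the uprobe name for 'reg', or 'reg' itself when no renaming
 * is listed. The whole name has to match, size suffix included.
 */
static const char *rename_reg(const char *reg)
{
	const struct reg_rename *r;

	assert(reg != NULL);

	for (r = renames; r->sdt_name; r++) {
		if (!strcmp(r->sdt_name, reg))
			return r->uprobe_name;
	}
	return reg;
}
```

Listing every suffixed form as its own entry keeps the lookup a plain string comparison, with no separate step that strips a trailing 'b', 'w' or 'd'.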

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/arch/x86/util/perf_regs.c | 44 ++--
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 09a7f55..d8a8dcf 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -48,10 +48,42 @@ static const struct sdt_name_reg sdt_reg_renamings[] = {
SDT_NAME_REG(rdx, dx),
SDT_NAME_REG(esi, si),
SDT_NAME_REG(rsi, si),
+   SDT_NAME_REG(sil, si),
SDT_NAME_REG(edi, di),
SDT_NAME_REG(rdi, di),
+   SDT_NAME_REG(dil, di),
SDT_NAME_REG(ebp, bp),
SDT_NAME_REG(rbp, bp),
+   SDT_NAME_REG(bpl, bp),
+   SDT_NAME_REG(rsp, sp),
+   SDT_NAME_REG(esp, sp),
+   SDT_NAME_REG(spl, sp),
+
+   /* rNN registers */
+   SDT_NAME_REG(r8b,  r8),
+   SDT_NAME_REG(r8w,  r8),
+   SDT_NAME_REG(r8d,  r8),
+   SDT_NAME_REG(r9b,  r9),
+   SDT_NAME_REG(r9w,  r9),
+   SDT_NAME_REG(r9d,  r9),
+   SDT_NAME_REG(r10b, r10),
+   SDT_NAME_REG(r10w, r10),
+   SDT_NAME_REG(r10d, r10),
+   SDT_NAME_REG(r11b, r11),
+   SDT_NAME_REG(r11w, r11),
+   SDT_NAME_REG(r11d, r11),
+   SDT_NAME_REG(r12b, r12),
+   SDT_NAME_REG(r12w, r12),
+   SDT_NAME_REG(r12d, r12),
+   SDT_NAME_REG(r13b, r13),
+   SDT_NAME_REG(r13w, r13),
+   SDT_NAME_REG(r13d, r13),
+   SDT_NAME_REG(r14b, r14),
+   SDT_NAME_REG(r14w, r14),
+   SDT_NAME_REG(r14d, r14),
+   SDT_NAME_REG(r15b, r15),
+   SDT_NAME_REG(r15w, r15),
+   SDT_NAME_REG(r15d, r15),
SDT_NAME_REG_END,
 };
 
@@ -88,18 +120,6 @@ int sdt_rename_register(char **pdesc, char *old_name)
 
/* Copy the chars after the register name (if need be) */
offset = prefix_len + sdt_len;
-   if (offset < old_desc_len) {
-   /*
-* The orginal register name can be suffixed by 'b',
-* 'w' or 'd' to indicate its size; so, we need to
-* skip this char if we met one.
-*/
-   char sfx = old_desc[offset];
-
-   if (sfx == 'b' || sfx == 'w'  || sfx == 'd')
-   offset++;
-   }
-
if (offset < old_desc_len)
memcpy(new_desc + prefix_len + uprobe_len,
old_desc + offset, old_desc_len - offset);
-- 
2.9.3
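
As a rough illustration of the renaming this patch extends, the sketch
below maps an SDT register token to the name written to uprobe_events.
It is a simplified stand-in, not the perf code: the table holds only a
few of the entries from the patch, the names reg_renames and
sdt_to_uprobe_reg are hypothetical, and it assumes the register token
has already been isolated so an exact comparison can be used (the
patched function matches by prefix inside the full argument string).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the patch's sdt_reg_renamings[] table. */
struct reg_rename {
	const char *sdt_name;    /* name as it appears in the SDT note */
	const char *uprobe_name; /* name written to uprobe_events */
};

/* A small subset of the entries added by the patch. */
static const struct reg_rename reg_renames[] = {
	{ "%rsp", "%sp" }, { "%esp", "%sp" }, { "%spl", "%sp" },
	{ "%rsi", "%si" }, { "%esi", "%si" }, { "%sil", "%si" },
	{ "%r9b", "%r9" }, { "%r9w", "%r9" }, { "%r9d", "%r9" },
	{ NULL, NULL },
};

/*
 * Return the uprobe name for an SDT register token, or the token
 * itself when no renaming is listed (for example "%r9").
 */
static const char *sdt_to_uprobe_reg(const char *sdt_reg)
{
	const struct reg_rename *r;

	assert(sdt_reg != NULL);
	for (r = reg_renames; r->sdt_name != NULL; r++) {
		if (strcmp(sdt_reg, r->sdt_name) == 0)
			return r->uprobe_name;
	}
	return sdt_reg;
}
```

Listing every suffixed name explicitly, as the patch does, lets the
separate 'b'/'w'/'d' suffix-skipping branch be dropped from
sdt_rename_register().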



[PATCH 1/5] perf/sdt: Show proper hint

2017-02-02 Thread Ravi Bangoria
All events from 'perf list', except SDT events, can be directly
recorded with 'perf record'. But the flow is a little different
for SDT events: the user has to probe an SDT event before recording
it. Perf shows a misleading message when the user tries to
record an SDT event without probing it. Show a proper hint there.

Before patch:
  $ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
 \___ unknown tracepoint

Error:  File /sys/kernel/debug/tracing/events/sdt_glib/idle__add ...
Hint:   Perhaps this kernel misses some CONFIG_ setting to enable...
...

After patch:
  $ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
 \___ unknown tracepoint

Error:  File /sys/kernel/debug/tracing/events/sdt_glib/idle__add ...
Hint:   SDT event has to be probed before recording it.

Suggested-by: Ingo Molnar <mi...@redhat.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/lib/api/fs/tracing_path.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c
index 251b7c3..a0e85df 100644
--- a/tools/lib/api/fs/tracing_path.c
+++ b/tools/lib/api/fs/tracing_path.c
@@ -99,10 +99,18 @@ static int strerror_open(int err, char *buf, size_t size, const char *filename)
 * - jirka
 */
if (debugfs__configured() || tracefs__configured()) {
-   snprintf(buf, size,
-"Error:\tFile %s/%s not found.\n"
-"Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
-tracing_events_path, filename);
+   /* sdt markers */
+   if (!strncmp(filename, "sdt_", 4)) {
+   snprintf(buf, size,
+   "Error:\tFile %s/%s not found.\n"
+   "Hint:\tSDT event has to be probed before recording it.\n",
+   tracing_events_path, filename);
+   } else {
+   snprintf(buf, size,
+"Error:\tFile %s/%s not found.\n"
+"Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
+tracing_events_path, filename);
+   }
break;
}
snprintf(buf, size, "%s",
-- 
2.9.3



[PATCH 5/5] perf/probe: Change MAX_CMDLEN

2017-02-02 Thread Ravi Bangoria
There are many SDT markers on powerpc whose uprobe definition goes
beyond the current MAX_CMDLEN, especially when the target filename is
long and the SDT marker has a long list of arguments. For example, the
definition of the SDT marker

  method__compile__end: 8@17 8@9 8@10 -4@8 8@7 -4@6 8@5 -4@4 1@37(28)

from file

  /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-2.b14.fc22.ppc64/jre/lib/\
 ppc64/server/libjvm.so

is

  p:sdt_hotspot/method__compile__end /usr/lib/jvm/java-1.8.0-openjdk-\
1.8.0.91-2.b14.fc22.ppc64/jre/lib/ppc64/server/libjvm.so:0x4c4e00\
arg1=%gpr17:u64 arg2=%gpr9:u64 arg3=%gpr10:u64 arg4=%gpr8:s32\
arg5=%gpr7:u64 arg6=%gpr6:s32 arg7=%gpr5:u64 arg8=%gpr4:s32\
arg9=+37(%gpr28):u8

'perf probe' segfaults for such markers. As the uprobe_events file
accepts definitions longer than 256 characters, increase the value of
MAX_CMDLEN to 512.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c | 1 -
 tools/perf/util/probe-file.c  | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 2c1bca2..5f3256f 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -47,7 +47,6 @@
 #include "probe-file.h"
 #include "session.h"
 
-#define MAX_CMDLEN 256
 #define PERFPROBE_GROUP "probe"
 
 bool probe_event_dry_run;  /* Dry run flag */
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 38eca3c..1580e26 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -29,7 +29,7 @@
 #include "session.h"
 #include "perf_regs.h"
 
-#define MAX_CMDLEN 256
+#define MAX_CMDLEN 512
 
 static void print_open_warning(int err, bool uprobe)
 {
-- 
2.9.3
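
Independent of the buffer size, an over-long definition can be detected
when the command string is built. The sketch below is only an
illustration under assumptions: build_uprobe_cmd and its parameter list
are hypothetical and are not perf's internal interface. It relies on
snprintf() returning the length that would have been written, so a
result that is not smaller than the buffer size means truncation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MAX_CMDLEN 512 /* value chosen by the patch */

/*
 * Format a uprobe definition into buf. Returns 0 on success, -E2BIG if
 * the definition does not fit, and -EINVAL on a formatting error.
 * The parameter list is illustrative only.
 */
static int build_uprobe_cmd(char *buf, size_t size, const char *group,
			    const char *event, const char *path,
			    unsigned long offset, const char *args)
{
	int len;

	assert(buf != NULL && size > 0);
	len = snprintf(buf, size, "p:%s/%s %s:0x%lx %s",
		       group, event, path, offset, args);
	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= size)
		return -E2BIG; /* truncated: do not use a partial command */
	return 0;
}
```

A caller would pass a buffer of MAX_CMDLEN bytes and treat -E2BIG as
"definition too long" instead of writing a cut-off command.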



[PATCH v2] perf/sdt: Show proper hint

2017-02-03 Thread Ravi Bangoria
All events from 'perf list', except SDT events, can be directly recorded
with 'perf record'. But the flow is a little different for SDT events.
A probe point for an SDT event needs to be created using 'perf probe'
before recording it using 'perf record'. Perf shows a misleading hint
when the user tries to record an SDT event without creating a probe
point. Show a proper hint there.

Before patch:
  $ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
 \___ unknown tracepoint

Error: File /sys/kernel/debug/tracing/events/sdt_glib/idle__add not found.
Hint:  Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
...

After patch:
  $ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
 \___ unknown tracepoint

Error: File /sys/kernel/debug/tracing/events/sdt_glib/idle__add not found.
Hint:  SDT event cannot be directly recorded on. Please use 'perf probe sdt_glib:idle__add' before recording it.
...

  $ perf probe sdt_glib:idle__add
Added new event:
  sdt_glib:idle__add   (on %idle__add in /usr/lib64/libglib-2.0.so.0.5000.2)

You can now use it in all perf tools, such as:

perf record -e sdt_glib:idle__add -aR sleep 1

  $ perf record -a -e sdt_glib:idle__add
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.175 MB perf.data ]

Suggested-by: Ingo Molnar <mi...@redhat.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
Changes in v2:
  - More precise hint

 tools/lib/api/fs/tracing_path.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c
index 251b7c3..aaafc99 100644
--- a/tools/lib/api/fs/tracing_path.c
+++ b/tools/lib/api/fs/tracing_path.c
@@ -86,9 +86,13 @@ void put_tracing_file(char *file)
free(file);
 }
 
-static int strerror_open(int err, char *buf, size_t size, const char *filename)
+int tracing_path__strerror_open_tp(int err, char *buf, size_t size,
+  const char *sys, const char *name)
 {
char sbuf[128];
+   char filename[PATH_MAX];
+
+   snprintf(filename, PATH_MAX, "%s/%s", sys, name ?: "*");
 
switch (err) {
case ENOENT:
@@ -99,10 +103,18 @@ static int strerror_open(int err, char *buf, size_t size, const char *filename)
 * - jirka
 */
if (debugfs__configured() || tracefs__configured()) {
-   snprintf(buf, size,
-"Error:\tFile %s/%s not found.\n"
-"Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
-tracing_events_path, filename);
+   /* sdt markers */
+   if (!strncmp(filename, "sdt_", 4)) {
+   snprintf(buf, size,
+   "Error:\tFile %s/%s not found.\n"
+   "Hint:\tSDT event cannot be directly recorded on. Please use 'perf probe %s:%s' before recording it.\n",
+   tracing_events_path, filename, sys, name);
+   } else {
+   snprintf(buf, size,
+"Error:\tFile %s/%s not found.\n"
+"Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
+tracing_events_path, filename);
+   }
break;
}
snprintf(buf, size, "%s",
@@ -125,12 +137,3 @@ static int strerror_open(int err, char *buf, size_t size, const char *filename)
 
return 0;
 }
-
-int tracing_path__strerror_open_tp(int err, char *buf, size_t size, const char *sys, const char *name)
-{
-   char path[PATH_MAX];
-
-   snprintf(path, PATH_MAX, "%s/%s", sys, name ?: "*");
-
-   return strerror_open(err, buf, size, path);
-}
-- 
2.9.3
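
The hint selection in this patch comes down to a prefix test on the
event group. A minimal standalone sketch of that decision follows; it
assumes the caller passes the group and event name separately, and the
function name format_missing_event_hint is hypothetical (the patch does
this inside tracing_path__strerror_open_tp()).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a hint for a missing tracepoint directory into buf.
 * Returns 1 if the SDT-specific hint was chosen, 0 otherwise.
 */
static int format_missing_event_hint(char *buf, size_t size,
				     const char *sys, const char *name)
{
	assert(buf != NULL && sys != NULL && name != NULL);

	/* Groups created for SDT markers start with "sdt_". */
	if (strncmp(sys, "sdt_", 4) == 0) {
		snprintf(buf, size,
			 "Hint:\tSDT event cannot be directly recorded on. "
			 "Please use 'perf probe %s:%s' before recording it.\n",
			 sys, name);
		return 1;
	}
	snprintf(buf, size,
		 "Hint:\tPerhaps this kernel misses some CONFIG_ setting "
		 "to enable this feature?.\n");
	return 0;
}
```

Called with ("sdt_glib", "idle__add") this produces the 'perf probe
sdt_glib:idle__add' hint shown in the "After patch" output above; any
other group falls back to the CONFIG_ hint.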



Re: [PATCH v5 0/2] perf probe: add sdt probes arguments into the uprobe cmd string

2017-01-23 Thread Ravi Bangoria


On Wednesday 14 December 2016 01:06 PM, Ingo Molnar wrote:
> * Alexis Berlemont  wrote:
>
>> Hi Masami,
>>
>> Many thanks for your mail.
>>
>> Here is another patch set which tries to fix the points you mentioned:
>>
>> * Skip the arguments containing a constant ($123); 
>> * Review the code in charge of the register renaming (search for '%'
>>   and parse it);
>> * Minor changes (print the argument in case of error, skipping, check
>>   the sdt arg type index);
>>
>> Many thanks,
>>
>> Alexis.
>>
>> Alexis Berlemont (2):
>>   perf sdt: add scanning of sdt probles arguments
>>   perf probe: add sdt probes arguments into the uprobe cmd string
> I'd like to hijack this thread to report an SDT oddity - one of my boxen reports
> lots of SDT tracepoints in 'perf list':
>
>   mem:[/len][:access]  [Hardware breakpoint]
>
>   sdt_libc:lll_lock_wait_private [SDT event]
>   sdt_libc:longjmp   [SDT event]
>   sdt_libc:longjmp_target[SDT event]
>   sdt_libc:memory_arena_new  [SDT event]
>   sdt_libc:memory_arena_retry[SDT event]
>   sdt_libc:memory_arena_reuse[SDT event]
>   sdt_libc:memory_arena_reuse_free_list  [SDT event]
>   sdt_libc:memory_arena_reuse_wait   [SDT event]
>   sdt_libc:memory_calloc_retry   [SDT event]
>   sdt_libc:memory_heap_free  [SDT event]
>   ...
>
> But none of them work:
>
>   Error:  No permissions to read /sys/kernel/debug/tracing/events/sdt_libc/longjmp
>   Hint:   Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
>
>   ...
>
>   Error:  File /sys/kernel/debug/tracing/events/sdt_libc/longjmp not found.
>   Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
>
> What kind of patches are required for SDT probes to work?

Hi Ingo,

Works for me on my x86 Fedora 25 box. Maybe some permission issue?

@Alexis, are you planning to progress on it? :) I would like to prepare
a patch for powerpc.

Thanks,
Ravi

> Thanks,
>
>   Ingo
>



Re: [PATCH v5 0/2] perf probe: add sdt probes arguments into the uprobe cmd string

2017-01-24 Thread Ravi Bangoria


On Tuesday 24 January 2017 01:52 PM, Ingo Molnar wrote:
> * Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:
>
>>
>> On Wednesday 14 December 2016 01:06 PM, Ingo Molnar wrote:
>>> * Alexis Berlemont <alexis.berlem...@gmail.com> wrote:
>>>
>>>> Hi Masami,
>>>>
>>>> Many thanks for your mail.
>>>>
>>>> Here is another patch set which tries to fix the points you mentioned:
>>>>
>>>> * Skip the arguments containing a constant ($123); 
>>>> * Review the code in charge of the register renaming (search for '%'
>>>>   and parse it);
>>>> * Minor changes (print the argument in case of error, skipping, check
>>>>   the sdt arg type index);
>>>>
>>>> Many thanks,
>>>>
>>>> Alexis.
>>>>
>>>> Alexis Berlemont (2):
>>>>   perf sdt: add scanning of sdt probles arguments
>>>>   perf probe: add sdt probes arguments into the uprobe cmd string
>>> I'd like to hijack this thread to report an SDT oddity - one of my boxen reports
>>> lots of SDT tracepoints in 'perf list':
>>>
>>>   mem:[/len][:access]  [Hardware breakpoint]
>>>
>>>   sdt_libc:lll_lock_wait_private [SDT event]
>>>   sdt_libc:longjmp   [SDT event]
>>>   sdt_libc:longjmp_target[SDT event]
>>>   sdt_libc:memory_arena_new  [SDT event]
>>>   sdt_libc:memory_arena_retry[SDT event]
>>>   sdt_libc:memory_arena_reuse[SDT event]
>>>   sdt_libc:memory_arena_reuse_free_list  [SDT event]
>>>   sdt_libc:memory_arena_reuse_wait   [SDT event]
>>>   sdt_libc:memory_calloc_retry   [SDT event]
>>>   sdt_libc:memory_heap_free  [SDT event]
>>>   ...
>>>
>>> But none of them work:
>>>
>>>   Error:  No permissions to read /sys/kernel/debug/tracing/events/sdt_libc/longjmp
>>>   Hint:   Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
>>>
>>>   ...
>>>
>>>   Error:  File /sys/kernel/debug/tracing/events/sdt_libc/longjmp not found.
>>>   Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
>>>
>>> What kind of patches are required for SDT probes to work?
>> Hi Ingo,
>>
>> I suppose you are trying to record SDT events without probing it.
>> In that case, first put a probe on an event and then try to record
>> it. For example,
>
> Well, I was mainly complaining about the misleading messages and flow of the
> tooling here. Could you please improve the messages so that if I use it like the
> way I reported it results in me trying the right approach?

Right, the message is misleading. I will prepare a patch for this.

Also, the flow for SDT markers is a little odd: a probe has to be put
first and then recorded, while other events can be recorded directly.
There was a patch by Hemant about directly recording SDT marker events.
I don't see any updates on that:

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1138183.html

-Ravi

> Thanks,
>
>   Ingo
>



Re: [PATCH v5 2/2] perf probe: add sdt probes arguments into the uprobe cmd string

2017-01-24 Thread Ravi Bangoria
Hi Alexis,

On Wednesday 14 December 2016 05:37 AM, Alexis Berlemont wrote:
> An sdt probe can be associated with arguments but they were not passed
> to the user probe tracing interface (uprobe_events); this patch adapts
> the sdt argument descriptors according to the uprobe input format.
>
> As the uprobe parser does not support scaled address mode, perf will
> skip arguments which cannot be adapted to the uprobe format.
>
> Here are the results:
>
> $ perf buildid-cache -v --add test_sdt
> $ perf probe -x test_sdt sdt_libfoo:table_frob
> $ perf probe -x test_sdt sdt_libfoo:table_diddle
> $ perf record -e sdt_libfoo:table_frob -e sdt_libfoo:table_diddle test_sdt
> $ perf script
> test_sdt  ...   666.255678:   sdt_libfoo:table_frob: (4004d7) arg0=0 arg1=0
> test_sdt  ...   666.255683: sdt_libfoo:table_diddle: (40051a) arg0=0 arg1=0
> test_sdt  ...   666.255686:   sdt_libfoo:table_frob: (4004d7) arg0=1 arg1=2
> test_sdt  ...   666.255689: sdt_libfoo:table_diddle: (40051a) arg0=3 arg1=4
> test_sdt  ...   666.255692:   sdt_libfoo:table_frob: (4004d7) arg0=2 arg1=4
> test_sdt  ...   666.255694: sdt_libfoo:table_diddle: (40051a) arg0=6 arg1=8
>
> Signed-off-by: Alexis Berlemont 
> ---
>  tools/perf/arch/x86/util/perf_regs.c |  83 +
>  tools/perf/util/perf_regs.c  |   6 ++
>  tools/perf/util/perf_regs.h  |   6 ++
>  tools/perf/util/probe-file.c | 170 
> ++-
>  4 files changed, 261 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
> index c5db14f..09a7f55 100644
> --- a/tools/perf/arch/x86/util/perf_regs.c
> +++ b/tools/perf/arch/x86/util/perf_regs.c
> @@ -1,4 +1,7 @@
> +#include 
> +
>  #include "../../perf.h"
> +#include "../../util/util.h"
>  #include "../../util/perf_regs.h"
>
>  const struct sample_reg sample_reg_masks[] = {
> @@ -26,3 +29,83 @@ const struct sample_reg sample_reg_masks[] = {
>  #endif
>   SMPL_REG_END
>  };
> +
> +struct sdt_name_reg {
> + const char *sdt_name;
> + const char *uprobe_name;
> +};
> +#define SDT_NAME_REG(n, m) {.sdt_name = "%" #n, .uprobe_name = "%" #m}
> +#define SDT_NAME_REG_END {.sdt_name = NULL, .uprobe_name = NULL}
> +
> +static const struct sdt_name_reg sdt_reg_renamings[] = {
> + SDT_NAME_REG(eax, ax),
> + SDT_NAME_REG(rax, ax),
> + SDT_NAME_REG(ebx, bx),
> + SDT_NAME_REG(rbx, bx),
> + SDT_NAME_REG(ecx, cx),
> + SDT_NAME_REG(rcx, cx),
> + SDT_NAME_REG(edx, dx),
> + SDT_NAME_REG(rdx, dx),
> + SDT_NAME_REG(esi, si),
> + SDT_NAME_REG(rsi, si),
> + SDT_NAME_REG(edi, di),
> + SDT_NAME_REG(rdi, di),
> + SDT_NAME_REG(ebp, bp),
> + SDT_NAME_REG(rbp, bp),
> + SDT_NAME_REG_END,
> +};

I see many markers use %rsp. Such markers fail at 'perf probe'.
Please add renaming entries for %rsp and %esp. For example:

$ readelf -n /usr/lib64/libpython3.5m.so.1.0
  ...
  Name: function__entry
  Arguments: 8@224(%rsp) 8@232(%rsp) -4@240(%rsp) 8@%rbx

$ sudo ./perf probe sdt_python:function__entry
  Failed to write event: Invalid argument
  Please upgrade your kernel to at least 3.14 to have access to feature +224(%rsp):u64
Error: Failed to add events.


This code does not handle rNN registers with a postfix ('b', 'w', 'd').
Such markers fail at 'perf probe'. For example:

$ readelf -n /usr/lib64/libperl.so.5.24.0
  ...
  Name: sub__return
  Arguments: 8@%rax 8@%r8 4@%r9d 8@%rsi

$ sudo ./perf probe -v sdt_perl:sub__entry
  ...
  Opening /sys/kernel/debug/tracing//uprobe_events write=1
  Writing event: p:sdt_perl/sub__entry /usr/lib64/libperl.so.5.24.0:0xbb780 arg1=%ax:u64 arg2=%r8:u64 arg3=%r9d:u32 arg4=%si:u64
  Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)

 Can we add them like:

/* rNN registers */
SDT_NAME_REG(r8b,  r8),
SDT_NAME_REG(r8w,  r8),
SDT_NAME_REG(r8d,  r8),
SDT_NAME_REG(r9b,  r9),
...
SDT_NAME_REG(r14d, r14),
SDT_NAME_REG(r15b, r15),
SDT_NAME_REG(r15w, r15),
SDT_NAME_REG(r15d, r15),

and ...

> +
> +int sdt_rename_register(char **pdesc, char *old_name)
> +{
> + const struct sdt_name_reg *rnames = sdt_reg_renamings;
> + char *new_desc, *old_desc = *pdesc;
> + size_t prefix_len, sdt_len, uprobe_len, old_desc_len, offset;
> + int ret = -1;
> +
> + while (ret != 0 && rnames->sdt_name != NULL) {
> + sdt_len = strlen(rnames->sdt_name);
> + ret = strncmp(old_name, rnames->sdt_name, sdt_len);
> + rnames += !!ret;
> + }
> +
> + if (rnames->sdt_name == NULL)
> + return 0;
> +
> + sdt_len = strlen(rnames->sdt_name);
> + uprobe_len = strlen(rnames->uprobe_name);
> + old_desc_len = strlen(old_desc) + 1;
> +
> + new_desc = zalloc(old_desc_len + uprobe_len - sdt_len);
> + if (new_desc == NULL)
> + return -1;
> +
> + /* 

Re: [PATCH v5 0/2] perf probe: add sdt probes arguments into the uprobe cmd string

2017-01-23 Thread Ravi Bangoria


On Wednesday 14 December 2016 01:06 PM, Ingo Molnar wrote:
> * Alexis Berlemont  wrote:
>
>> Hi Masami,
>>
>> Many thanks for your mail.
>>
>> Here is another patch set which tries to fix the points you mentioned:
>>
>> * Skip the arguments containing a constant ($123); 
>> * Review the code in charge of the register renaming (search for '%'
>>   and parse it);
>> * Minor changes (print the argument in case of error, skipping, check
>>   the sdt arg type index);
>>
>> Many thanks,
>>
>> Alexis.
>>
>> Alexis Berlemont (2):
>>   perf sdt: add scanning of sdt probles arguments
>>   perf probe: add sdt probes arguments into the uprobe cmd string
> I'd like to hijack this thread to report an SDT oddity - one of my boxen reports
> lots of SDT tracepoints in 'perf list':
>
>   mem:[/len][:access]  [Hardware breakpoint]
>
>   sdt_libc:lll_lock_wait_private [SDT event]
>   sdt_libc:longjmp   [SDT event]
>   sdt_libc:longjmp_target[SDT event]
>   sdt_libc:memory_arena_new  [SDT event]
>   sdt_libc:memory_arena_retry[SDT event]
>   sdt_libc:memory_arena_reuse[SDT event]
>   sdt_libc:memory_arena_reuse_free_list  [SDT event]
>   sdt_libc:memory_arena_reuse_wait   [SDT event]
>   sdt_libc:memory_calloc_retry   [SDT event]
>   sdt_libc:memory_heap_free  [SDT event]
>   ...
>
> But none of them work:
>
>   Error:  No permissions to read /sys/kernel/debug/tracing/events/sdt_libc/longjmp
>   Hint:   Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
>
>   ...
>
>   Error:  File /sys/kernel/debug/tracing/events/sdt_libc/longjmp not found.
>   Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
>
> What kind of patches are required for SDT probes to work?

Hi Ingo,

I suppose you are trying to record SDT events without probing them.
In that case, first put a probe on an event and then try to record
it. For example,

$ ./perf list | grep sdt_
  sdt_glib:main__after_prepare   [SDT event]
  sdt_glib:main__before_dispatch [SDT event]
  ...

$ ./perf record -a -e sdt_glib:main__after_prepare
  event syntax error: 'sdt_glib:main__after_prepare'
   \___ unknown tracepoint

  Error:  File /sys/kernel/debug/tracing/events/sdt_glib/main__after_prepare not found.
  Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
  ...

$ ./perf probe sdt_glib:main__after_prepare
  Added new events:
sdt_glib:main__after_prepare (on %main__after_prepare in /usr/lib64/libglib-2.0.so.0.5000.2)
sdt_glib:main__after_prepare_1 (on %main__after_prepare in /usr/lib64/libglib-2.0.so.0.5000.2)

  You can now use it in all perf tools, such as:

perf record -e sdt_glib:main__after_prepare_1 -aR sleep 1

$ ./perf record -a -e sdt_glib:main__after_prepare
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.191 MB perf.data ]

-Ravi



Re: [RFC] perf/sdt: Directly record SDT event with 'perf record'

2017-02-20 Thread Ravi Bangoria
Thanks Ingo,

On Monday 20 February 2017 02:12 PM, Ingo Molnar wrote:
> * Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com> wrote:
>
>> What should be the behavior of the tool? Should it record only one
>> 'sdt_libpthread:mutex_entry' which exists in uprobe_events? Or it
>> should record all the SDT events from libpthread? We can choose either
>> of two but both the cases are ambiguous.
> They are not ambiguous really if coded right: just pick one of the outcomes and
> maybe print a warning to inform the user that something weird is going on because
> not all markers are enabled?
>
> As a user I'd expect 'perf record' to enable all markers and print a warning that
> the markers were in a partial state. This would result in consistent behaviour.

Yes, makes sense.

> Does it make sense to only enable some of the markers that alias on the same name?
> If not then maybe disallow that in perf probe - or change perf probe to do the
> same thing as perf record.

'perf probe' is doing that correctly. It fetches all events with given name from
probe-cache and creates entries for them in uprobe_events.

The problem is the 2-step process of adding probes and then recording,
allowing users to select individual markers to record on.

>
> I.e. this is IMHO an artificial problem that users should not be exposed to and
> which can be solved by tooling.
>
> In particular if it's possible to enable only a part of the markers then perf
> record not continuing would be a failure mode: if for example a previous perf
> record session segfaulted (or ran out of RAM or was killed in the wrong moment or
> whatever) then it would not be possible to (easily) clean up the mess.

Agreed. We need to make this more robust.

>
>> Not allowing 'perf probe' for SDT event will solve all such issues.
>> Also it will make user interface simple and consistent. Other current
>> tooling (systemtap, for instance) also do not allow probing individual
>> markers when there are multiple markers with the same name.
> In any case if others agree with your change in UI flow too then it's fine by me,
> but please make it robust, i.e. if perf record sees partially enabled probes it
> should still continue.

@Masami, can you please provide your thoughts as well.

Thanks,
Ravi



[PATCH v3 2/2] perf/sdt: Directly record SDT events with 'perf record'

2017-02-23 Thread Ravi Bangoria
From: Hemant Kumar <hem...@linux.vnet.ibm.com>

Add support for directly recording SDT events which are present in
the probe cache. Without this patch, we could probe into SDT events
using 'perf probe' and 'perf record'. With this patch, we can probe
the SDT events directly using 'perf record'.

For example :

  $ perf list sdt
sdt_libpthread:mutex_entry [SDT event]
sdt_libc:setjmp[SDT event]
...

  $ perf record -a -e sdt_libc:setjmp
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.286 MB perf.data (1065 samples) ]

  $ perf script
 bash   803 [002]  6492.190311: sdt_libc:setjmp: (7f1d503b56a1)
login   488 [001]  6496.791583: sdt_libc:setjmp: (7ff3013d56a1)
  fprintd 11038 [003]  6496.808032: sdt_libc:setjmp: (7fdedf5936a1)

Recording on SDT events with same provider and marker names is also
supported:

  $ readelf -n /usr/lib64/libpthread-2.24.so | grep -A2 Provider
  Provider: libpthread
  Name: mutex_entry
  Location: 0x9ddb, Base: 0x000139cc, ...
--
  Provider: libpthread
  Name: mutex_entry
  Location: 0xbcbb, Base: 0x000139cc, ...

  $ perf record -a -e sdt_libpthread:mutex_entry
Info: Recording on 2 occurrences of sdt_libpthread:mutex_entry
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.197 MB perf.data (31 samples) ]

  $ perf script
  in:imjournal 442 [000]  6625.701833:   sdt_libpthread:mutex_entry: (7fb1a1940ddb)
 rs:main Q:Reg 443 [001]  6625.701889: sdt_libpthread:mutex_entry_1: (7fb1a1942cbb)


After invoking 'perf record', behind the scenes, it checks whether
the event specified is an SDT event by using the flag '%' or string
'sdt_'. After that, it first checks whether event already exists
with that *name* in uprobe_events file. If found, it records that
particular event. Otherwise, it does a lookup of the probe cache to
find out the SDT event. If it's not present, it throws an error.
If found, it again tries to find existing events from uprobe_events
file, but this time it uses *address* and *filename* for comparison.
Finally it writes new events into the uprobe_events file and starts
recording. It also maintains a list of the event names that were
written to uprobe_events file for this session. After finishing the
record session, it removes the events from the uprobe_events file
using the maintained name list.

Signed-off-by: Hemant Kumar <hem...@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/lib/api/fs/tracing_path.c |  17 +-
 tools/perf/builtin-probe.c  |  21 ++-
 tools/perf/builtin-record.c |  23 +++
 tools/perf/perf.h   |   2 +
 tools/perf/util/parse-events.c  |  56 +-
 tools/perf/util/parse-events.h  |   2 +
 tools/perf/util/probe-event.c   |  44 -
 tools/perf/util/probe-event.h   |   4 +
 tools/perf/util/probe-file.c| 376 
 tools/perf/util/probe-file.h|  27 +++
 10 files changed, 552 insertions(+), 20 deletions(-)

diff --git a/tools/lib/api/fs/tracing_path.c b/tools/lib/api/fs/tracing_path.c
index 3e606b9..fa52e67 100644
--- a/tools/lib/api/fs/tracing_path.c
+++ b/tools/lib/api/fs/tracing_path.c
@@ -103,19 +103,10 @@ int tracing_path__strerror_open_tp(int err, char *buf, size_t size,
 * - jirka
 */
if (debugfs__configured() || tracefs__configured()) {
-   /* sdt markers */
-   if (!strncmp(filename, "sdt_", 4)) {
-   snprintf(buf, size,
-   "Error:\tFile %s/%s not found.\n"
-   "Hint:\tSDT event cannot be directly recorded on.\n"
-   "\tPlease first use 'perf probe %s:%s' before recording it.\n",
-   tracing_events_path, filename, sys, name);
-   } else {
-   snprintf(buf, size,
-"Error:\tFile %s/%s not found.\n"
-"Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
-tracing_events_path, filename);
-   }
+   snprintf(buf, size,
+"Error:\tFile %s/%s not found.\n"
+"Hint:\tPerhaps this kernel misses some CONFIG_ setting to enable this feature?.\n",
+tracing_events_path, filename);
break;
}
snprintf(buf, size, "%s",
diff --git a/tools/perf/builtin-pro
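
The lookup order described in the commit message above (match by name
in uprobe_events first, then look up the probe cache and skip markers
already present at the same file and address) can be sketched as
follows. This is a simplified model under assumptions, not the code
from the patch: struct sdt_entry, the in-memory arrays standing in for
uprobe_events and the probe cache, and all function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of one uprobe_events or probe-cache entry. */
struct sdt_entry {
	const char *name;   /* "group:event" */
	const char *file;   /* target binary */
	unsigned long addr; /* marker address */
};

static const struct sdt_entry *find_by_name(const struct sdt_entry *tab,
					    size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return &tab[i];
	return NULL;
}

static const struct sdt_entry *find_by_location(const struct sdt_entry *tab,
						size_t n, const char *file,
						unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].addr == addr && strcmp(tab[i].file, file) == 0)
			return &tab[i];
	return NULL;
}

/*
 * Return how many new entries would have to be written to uprobe_events
 * to record on `name`: 0 if an event with that name already exists,
 * -1 if the probe cache does not know the marker at all.
 */
static int count_entries_to_add(const struct sdt_entry *uprobes, size_t nr_uprobes,
				const struct sdt_entry *cache, size_t nr_cache,
				const char *name)
{
	int to_add = 0, found = 0;

	assert(name != NULL);

	/* Step 1: an event with this exact name exists, record on it. */
	if (find_by_name(uprobes, nr_uprobes, name))
		return 0;

	/* Step 2: otherwise look the marker up in the probe cache. */
	for (size_t i = 0; i < nr_cache; i++) {
		if (strcmp(cache[i].name, name) != 0)
			continue;
		found = 1;
		/* Step 3: reuse entries already present at the same location. */
		if (!find_by_location(uprobes, nr_uprobes,
				      cache[i].file, cache[i].addr))
			to_add++;
	}
	return found ? to_add : -1;
}
```

With two cached 'sdt_libpthread:mutex_entry' markers and only
'mutex_entry_1' (address 0xbcbb) present in uprobe_events, the sketch
reports one entry to add, for the marker at 0x9ddb.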

[PATCH v3 1/2] perf/sdt: Introduce util func is_sdt_event()

2017-02-23 Thread Ravi Bangoria
No functional changes.

Signed-off-by: Ravi Bangoria <ravi.bango...@linux.vnet.ibm.com>
---
 tools/perf/util/probe-event.c |  9 +
 tools/perf/util/util.c| 12 
 tools/perf/util/util.h|  2 ++
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 28fb62c..2b1409f 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1339,14 +1339,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
if (!arg)
return -EINVAL;
 
-   /*
-* If the probe point starts with '%',
-* or starts with "sdt_" and has a ':' but no '=',
-* then it should be a SDT/cached probe point.
-*/
-   if (arg[0] == '%' ||
-   (!strncmp(arg, "sdt_", 4) &&
-!!strchr(arg, ':') && !strchr(arg, '='))) {
+   if (is_sdt_event(arg)) {
pev->sdt = true;
if (arg[0] == '%')
arg++;
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index d8b45ce..b827428 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -802,3 +802,15 @@ int unit_number__scnprintf(char *buf, size_t size, u64 n)
 
return scnprintf(buf, size, "%" PRIu64 "%c", n, unit[i]);
 }
+
+/*
+ * If the probe point starts with '%',
+ * or starts with "sdt_" and has a ':' but no '=',
+ * then it should be a SDT/cached probe point.
+ */
+bool is_sdt_event(char *str)
+{
+   return (str[0] == '%' ||
+   (!strncmp(str, "sdt_", 4) &&
+!!strchr(str, ':') && !strchr(str, '=')));
+}
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index c74708d..ee599dc 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -364,4 +364,6 @@ int is_printable_array(char *p, unsigned int len);
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
 
 int unit_number__scnprintf(char *buf, size_t size, u64 n);
+
+bool is_sdt_event(char *str);
 #endif /* GIT_COMPAT_UTIL_H */
-- 
2.9.3
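
To make the rule in the comment concrete, here is a standalone copy of
the predicate that can be compiled outside the perf tree. The only
deviations from the patch are assumptions made for this sketch: the
parameter is const-qualified and a NULL check is added via assert().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * True if str starts with '%', or starts with "sdt_" and contains a
 * ':' but no '='.
 */
static bool is_sdt_event(const char *str)
{
	assert(str != NULL);
	return str[0] == '%' ||
	       (!strncmp(str, "sdt_", 4) &&
		strchr(str, ':') != NULL && strchr(str, '=') == NULL);
}
```

Under this rule "%sdt_libc:setjmp" and "sdt_libc:setjmp" are treated as
SDT events, while "sdt_libc" (no ':'), "sdt_libc:setjmp=alias"
(contains '=') and "sched:sched_switch" are not.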



[PATCH v3 0/2] perf/sdt: Directly record SDT events with 'perf record'

2017-02-23 Thread Ravi Bangoria
 with 0xbcbb address, but also it
has created new entry for event with 0x9ddb address. It also maintains
list of entries created for particular record session, and uses that
list to remove entries at the end of session.

Finally, if the tool somehow fails to clean events from uprobe_events at
the end of the session, the user has to clean the events manually with
'perf probe -d'. But perf will give a warning in such a case. For example,

  $ perf record -a -e sdt_libpthread:mutex_entry
Warning: Recording on 2 occurrences of sdt_libpthread:mutex_entry
/** Fails with segfault **/

  $ cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_libpthread/mutex_entry /usr/lib64/libpthread-2.24.so:0x9ddb
p:sdt_libpthread/mutex_entry_1 /usr/lib64/libpthread-2.24.so:0xbcbb

The next time the user tries to record, it will show a warning:

  $ perf record -a -e sdt_libpthread:mutex_entry
Matching event(s) from uprobe_events:
   sdt_libpthread:mutex_entry  0x9ddb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d ' to delete event(s).

Warning: Found 2 events from probe-cache with name 'sdt_libpthread:mutex_entry'.
 Since probe point already exists with this name, recording only 1 event.
Hint: Please use 'perf probe -d sdt_libpthread:mutex_entry*' to allow record on all events.

But no such warning for 'sdt_libpthread:mutex_entry_1'.

  $ perf record -a -e sdt_libpthread:mutex_entry_1
Matching event(s) from uprobe_events:
  sdt_libpthread:mutex_entry_1  0xbcbb@/usr/lib64/libpthread-2.24.so
Use 'perf probe -d ' to delete event(s).


[1] https://lkml.org/lkml/2017/2/7/59
[2] https://lkml.org/lkml/2016/5/3/810
[3] https://lkml.org/lkml/2016/5/2/689


Hemant Kumar (1):
  perf/sdt: Directly record SDT events with 'perf record'

Ravi Bangoria (1):
  perf/sdt: Introduce util func is_sdt_event()

 tools/lib/api/fs/tracing_path.c |  17 +-
 tools/perf/builtin-probe.c  |  21 ++-
 tools/perf/builtin-record.c |  23 +++
 tools/perf/perf.h   |   2 +
 tools/perf/util/parse-events.c  |  56 +-
 tools/perf/util/parse-events.h  |   2 +
 tools/perf/util/probe-event.c   |  53 +-
 tools/perf/util/probe-event.h   |   4 +
 tools/perf/util/probe-file.c| 376 
 tools/perf/util/probe-file.h|  27 +++
 tools/perf/util/util.c  |  12 ++
 tools/perf/util/util.h  |   2 +
 12 files changed, 567 insertions(+), 28 deletions(-)

-- 
2.9.3



Re: [PATCH 4/4] perf annotate: Introduce source_code to collect actual code

2017-02-23 Thread Ravi Bangoria
Hi Taeung,

On Wednesday 22 February 2017 03:38 PM, Taeung Song wrote:
> + INIT_LIST_HEAD(>src->code);
> +
> + while (!feof(file)) {
> + int nr;
> + char *c, *parsed_line;
> + struct source_code *code;
> +
> + if (getline(, , file) < 0) {
> + symbol__free_source_code(sym);
> + break;
> + }
> +
> + if (++nr < first_linenr)

Please initialize variable nr. I got a compilation error:

util/annotate.c: In function ‘symbol__tty_annotate’:
util/annotate.c:1674:6: error: ‘nr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   if (++nr < first_linenr)
  ^
util/annotate.c:1665:7: note: ‘nr’ was declared here
   int nr;
   ^

Ravi
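
For reference, a minimal sketch of the kind of loop in question with the
counter initialized. It is not the code from the patch: the function
name count_lines_from is hypothetical, and it only counts the lines
that would be kept. It assumes POSIX getline() is available. Note that
the counter is also declared outside the loop so it is not reset on
every iteration.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the lines of `file` whose 1-based line number is at least
 * first_linenr, skipping the earlier ones.
 */
static int count_lines_from(FILE *file, int first_linenr)
{
	char *line = NULL;
	size_t line_len = 0;
	int nr = 0; /* initialized, and declared outside the loop */
	int kept = 0;

	assert(file != NULL);
	while (getline(&line, &line_len, file) >= 0) {
		if (++nr < first_linenr)
			continue;
		kept++;
	}
	free(line);
	return kept;
}
```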


