Re: [PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump

2024-05-09 Thread Athira Rajeev



> On 7 May 2024, at 3:05 PM, Christophe Leroy  
> wrote:
> 
> 
> 
> On 06/05/2024 at 14:19, Athira Rajeev wrote:
>> Add support to capture and parse raw instruction in objdump.
> 
> What's the purpose of using 'objdump' for reading raw instructions ? 
> Can't they be read directly without invoking 'objdump' ? It looks odd to 
> me to use objdump to provide readable text and then parse it back.

Hi Christophe,

Thanks for your review comments.

The current implementation of data type profiling on x86 uses the "objdump"
tool to get the disassembled code. The objdump output lines are then parsed
to get the instruction name and register fields. The initial patchset I
posted to enable data type profiling on powerpc took the same approach: get
the disassembled code from objdump and parse the disassembled lines. In V2,
we are introducing a change for powerpc to use the "raw instruction" and
fetch the opcode and register fields from it.

As I tried to explain below, perf currently runs objdump with the
"--no-show-raw-insn" option, which doesn't capture the raw instruction. To
capture it, the V2 patchset switches to the default option "--show-raw-insn"
and gets the raw instruction [ for powerpc ] along with the human-readable
annotation [ which is used by other archs ]. Since the perf tool already has
an objdump implementation in place, I went in the direction of enhancing it
to use "--show-raw-insn" for powerpc.

But as you mentioned, we can read the raw instruction directly without using
the "objdump" tool. perf already has support for reading object code: the dso
open/read utilities and helper functions are present in "util/dso.c", and the
"dso__data_read_offset" function reads data from a dso file offset. We can
use these functions, and I can make changes to read the binary instruction
directly without using objdump.
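
A minimal sketch of the direct-read idea, under stated assumptions: it uses
plain stdio rather than perf's dso helpers, and the function name
"read_raw_insn" and its parameters are hypothetical, not part of perf. It
reads one 4-byte powerpc instruction at a file offset and assembles the word
according to the object's byte order.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one 4-byte instruction word at @offset.
 * @little_endian selects how the bytes are assembled into the word.
 * Returns 0 on success, -1 on seek/read failure.
 */
static int read_raw_insn(FILE *fp, long offset, int little_endian,
			 uint32_t *insn)
{
	unsigned char b[4];

	if (fseek(fp, offset, SEEK_SET) != 0)
		return -1;
	if (fread(b, 1, sizeof(b), fp) != sizeof(b))
		return -1;

	if (little_endian)
		*insn = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
			((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
	else
		*insn = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
			((uint32_t)b[2] << 8) | (uint32_t)b[3];
	return 0;
}
```

In perf itself, the read would presumably go through the dso helpers named
above instead of fseek/fread; only the byte assembly step carries over.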

Namhyung, Arnaldo, Christophe,
I am looking for your valuable feedback on this approach. Please let me know
if it looks fine.


Thanks
Athira
> 
>> Currently, the perf tool infrastructure uses the "--no-show-raw-insn"
>> option with "objdump" while disassembling. An example from powerpc with
>> this option for an instruction address is:
> 
> Yes and that makes sense because the purpose of objdump is to provide 
> human readable annotations, not to perform automated analysis. Am I 
> missing something ?
> 
>> 
>> Snippet from:
>> objdump  --start-address= --stop-address=  -d 
>> --no-show-raw-insn -C 
>> 
>> c10224b4: lwz r10,0(r9)
>> 
>> This line "lwz r10,0(r9)" is parsed to extract instruction name,
>> registers names and offset. Also to find whether there is a memory
>> reference in the operands, "memory_ref_char" field of objdump is used.
>> For x86, "(" is used as memory_ref_char to tackle instructions of the
>> form "mov  (%rax), %rcx".
>> 
>> On powerpc, instructions that use "(" are not the only memory
>> instructions. For example, the above instruction can also be of extended
>> form (X form) "lwzx r10,0,r19". In order to easily identify the
>> instruction category and extract the source/target registers, the patch
>> adds support for using the raw instruction. With the raw instruction,
>> macros are added to extract the opcode and register fields.
>> 
>> "struct ins_operands" and "struct ins" are updated to carry the opcode
>> and the raw instruction binary code (raw_insn). The function
>> "disasm_line__parse" is updated to fill the raw instruction hex value and
>> opcode into the newly added fields. There are no changes to the existing
>> code paths that parse the disassembled code. Architectures using the
>> instruction name and the present approach are not altered. Since this
>> approach targets powerpc, the macro implementation is added only for
>> powerpc as of now.
>> 
>> Example:
>> representation using --show-raw-insn in objdump gives result:
>> 
>> 38 01 81 e8 ld  r4,312(r1)
>> 
>> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
>> this translates to instruction form: "ld RT,DS(RA)" and binary code
>> as:
>> +----+------+------+---------------+----+
>> | 58 |  RT  |  RA  |      DS       |    |
>> +----+------+------+---------------+----+
>> 0    6      11     16              30   31
>> 
>> Function "disasm_line__parse" is updated to capture:
>> 
>> line:38 01 81 e8 ld  r4,312(r1)
>> opcode and raw instruction "38 01 81 e8"
>> Raw instruction is used later to extract the reg/offset fields.
>> 
>> Signed-off-by: Athira Rajeev 
>> ---



[PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump

2024-05-06 Thread Athira Rajeev
Add support to capture and parse raw instruction in objdump.
Currently, the perf tool infrastructure uses the "--no-show-raw-insn" option
with "objdump" while disassembling. An example from powerpc with this option
for an instruction address is:

Snippet from:
objdump  --start-address= --stop-address=  -d 
--no-show-raw-insn -C 

c10224b4:   lwz r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. Also, to find whether there is a memory
reference in the operands, the "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to handle instructions of the
form "mov  (%rax), %rcx".

On powerpc, instructions that use "(" are not the only memory
instructions. For example, the above instruction can also be of extended
form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
category and extract the source/target registers, the patch adds support
for using the raw instruction. With the raw instruction, macros are added
to extract the opcode and register fields.

"struct ins_operands" and "struct ins" are updated to carry the opcode and
the raw instruction binary code (raw_insn). The function
"disasm_line__parse" is updated to fill the raw instruction hex value and
opcode into the newly added fields. There are no changes to the existing
code paths that parse the disassembled code. Architectures using the
instruction name and the present approach are not altered. Since this
approach targets powerpc, the macro implementation is added only for
powerpc as of now.

Example:
representation using --show-raw-insn in objdump gives result:

38 01 81 e8 ld  r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
+----+------+------+---------------+----+
| 58 |  RT  |  RA  |      DS       |    |
+----+------+------+---------------+----+
0    6      11     16              30   31

Function "disasm_line__parse" is updated to capture:

line:38 01 81 e8 ld  r4,312(r1)
opcode and raw instruction "38 01 81 e8"
Raw instruction is used later to extract the reg/offset fields.
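
As an illustration of the example above, here is a small sketch that turns
the four hex bytes printed by objdump into an instruction word and extracts
the fields. The helper name "parse_raw_bytes_le" is hypothetical, and the
sketch assumes a little-endian target, where the bytes appear in memory
order; the field macros mirror the ones this patch adds.

```c
#include <stdint.h>
#include <stdio.h>

/* Field extraction, mirroring the macros added by this patch. */
#define PPC_OP(insn)	(((insn) >> 26) & 0x3f)
#define PPC_RT(insn)	(((insn) >> 21) & 0x1f)
#define PPC_RA(insn)	(((insn) >> 16) & 0x1f)

/*
 * Hypothetical helper: convert the four hex bytes objdump prints on a
 * little-endian powerpc target ("38 01 81 e8") into the instruction word.
 * Returns 0 on success, -1 if four hex bytes cannot be parsed.
 */
static int parse_raw_bytes_le(const char *line, uint32_t *insn)
{
	unsigned int b0, b1, b2, b3;

	if (sscanf(line, "%2x %2x %2x %2x", &b0, &b1, &b2, &b3) != 4)
		return -1;
	*insn = b0 | (b1 << 8) | (b2 << 16) | ((uint32_t)b3 << 24);
	return 0;
}
```

For "38 01 81 e8" this gives the word 0xe8810138, from which the primary
opcode is 58, RT is 4, RA is 1 and the low 16 bits (with the two lowest bits
masked off) give the displacement 312, matching "ld r4,312(r1)".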

Signed-off-by: Athira Rajeev 
---
 tools/include/linux/string.h  |  2 +
 tools/lib/string.c| 13 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c | 19 ++
 tools/perf/util/disasm.c  | 46 +++
 tools/perf/util/disasm.h  |  6 +++
 tools/perf/util/include/dwarf-regs.h  |  9 +
 6 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
index db5c99318c79..0acb1fc14e19 100644
--- a/tools/include/linux/string.h
+++ b/tools/include/linux/string.h
@@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
 
 extern char *strim(char *);
 
+extern void remove_spaces(char *s);
+
 extern void *memchr_inv(const void *start, int c, size_t bytes);
 #endif /* _TOOLS_LINUX_STRING_H_ */
diff --git a/tools/lib/string.c b/tools/lib/string.c
index 8b6892f959ab..21d273e69951 100644
--- a/tools/lib/string.c
+++ b/tools/lib/string.c
@@ -153,6 +153,19 @@ char *strim(char *s)
return skip_spaces(s);
 }
 
+/*
+ * remove_spaces - Removes whitespaces from @s
+ */
+void remove_spaces(char *s)
+{
+   char *d = s;
+   do {
+   while (*d == ' ') {
+   ++d;
+   }
+   } while ((*s++ = *d++));
+}
+
 /**
  * strreplace - Replace all occurrences of character in string.
  * @s: The string to operate on.
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c 
b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 0c4f4caf53ac..e60a71fd846e 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -98,3 +98,22 @@ int regs_query_register_offset(const char *name)
return roff->ptregs_offset;
return -EINVAL;
 }
+
+#define PPC_OP(op)  (((op) >> 26) & 0x3F)
+#define PPC_RA(a)  (((a) >> 16) & 0x1f)
+#define PPC_RT(t)  (((t) >> 21) & 0x1f)
+
+int get_opcode_insn(unsigned int raw_insn)
+{
+   return PPC_OP(raw_insn);
+}
+
+int get_source_reg(unsigned int raw_insn)
+{
+   return PPC_RA(raw_insn);
+}
+
+int get_target_reg(unsigned int raw_insn)
+{
+   return PPC_RT(raw_insn);
+}
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 2de66a092cab..85692f73e78f 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -43,7 +43,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t 
size,
   struct ins_operands *ops, int max_ins_name);
 
 static void ins__sort(struct arch *arch);
-static int disasm_line__parse(char *line, const char **namep, char **rawp);
+static int disasm_line_

[PATCH V2 9/9] tools/perf: Add support for global_die to capture name of variable in case of register defined variable

2024-05-06 Thread Athira Rajeev
For a register-defined variable (found using
find_data_type_global_reg), if the type of the variable happens to be a
base type (for example, long unsigned int), perf report shows it as:

12.85%  long unsigned int  long unsigned int +0 (no field)

The above data type actually refers to samples captured while accessing
"r1", which represents the current stack pointer on powerpc:
register void *__stack_pointer asm("r1");

The DWARF debug information contains this as:

<<>>
 <1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
<18dd774>   DW_AT_name: (indirect string, offset: 0x11ba): 
current_stack_pointer
<18dd778>   DW_AT_decl_file   : 51
<18dd779>   DW_AT_decl_line   : 1468
<18dd77b>   DW_AT_decl_column : 24
<18dd77c>   DW_AT_type: <0x18da5cd>
<18dd780>   DW_AT_external: 1
<18dd780>   DW_AT_location: 1 byte block: 51(DW_OP_reg1 (r1))

 where 18da5cd is:

 <1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
<18da5ce>   DW_AT_byte_size   : 8
<18da5cf>   DW_AT_encoding: 7   (unsigned)
<18da5d0>   DW_AT_name: (indirect string, offset: 0x55c7): long 
unsigned int
<<>>

To make it clearer to the user, capture the DW_AT_name of the variable
and save it as part of Dwarf_Global. Dwarf_Global is used so that the
name can be retrieved while presenting the result.

Update the "dso__findnew_data_type" function to set "var_name" if the
variable name is set in Dwarf_Global. Update
"hist_entry__typeoff_snprintf" to print var_name if it is set.
With these changes, along with "long unsigned int", the report also shows
the variable name, current_stack_pointer.
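
A rough sketch of the output change, not the actual
"hist_entry__typeoff_snprintf" code: the function name "typeoff_snprintf"
and its parameters are hypothetical. It only shows the fallback rule
described above, where the recorded variable name replaces "(no field)".

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the type/offset column: print the variable
 * name when one was recorded, otherwise fall back to "(no field)".
 */
static int typeoff_snprintf(char *buf, size_t size, const char *type_name,
			    int offset, const char *var_name)
{
	return snprintf(buf, size, "%s +%d (%s)", type_name, offset,
			var_name ? var_name : "no field");
}
```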

Snippet of result:

12.85%  long unsigned int  long unsigned int +0 (current_stack_pointer)
 4.68%  struct paca_struct  struct paca_struct +2312 (__current)
 4.57%  struct paca_struct  struct paca_struct +2354 (irq_soft_mask)

Signed-off-by: Athira Rajeev 
---
 tools/perf/util/annotate-data.c | 30 --
 tools/perf/util/dwarf-aux.c |  1 +
 tools/perf/util/dwarf-aux.h |  1 +
 tools/perf/util/sort.c  |  7 +--
 4 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index ab2168c4ef41..9f72d4b6a5f4 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -267,23 +267,32 @@ static void delete_members(struct annotated_member 
*member)
 }
 
 static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
- Dwarf_Die *type_die)
+ Dwarf_Die *type_die, 
Dwarf_Global *global_die)
 {
struct annotated_data_type *result = NULL;
struct annotated_data_type key;
struct rb_node *node;
struct strbuf sb;
+   struct strbuf sb_var_name;
char *type_name;
+   char *var_name;
Dwarf_Word size;
 
strbuf_init(&sb, 32);
+   strbuf_init(&sb_var_name, 32);
if (die_get_typename_from_type(type_die, &sb) < 0)
strbuf_add(&sb, "(unknown type)", 14);
+   if (global_die->name) {
+   strbuf_addstr(&sb_var_name, global_die->name);
+   var_name = strbuf_detach(&sb_var_name, NULL);
+   }
type_name = strbuf_detach(&sb, NULL);
dwarf_aggregate_size(type_die, &size);
 
/* Check existing nodes in dso->data_types tree */
key.self.type_name = type_name;
+   if (global_die->name)
+   key.self.var_name = var_name;
key.self.size = size;
node = rb_find(&key, &dso->data_types, data_type_cmp);
if (node) {
@@ -300,6 +309,8 @@ static struct annotated_data_type 
*dso__findnew_data_type(struct dso *dso,
}
 
result->self.type_name = type_name;
+   if (global_die->name)
+   result->self.var_name = var_name;
result->self.size = size;
INIT_LIST_HEAD(&result->self.children);
 
@@ -1177,7 +1188,7 @@ static int find_data_type_block(struct data_loc_info 
*dloc,
  * cu_die and match with reg to identify data type die.
  */
 static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, 
Dwarf_Die *cu_die,
-   Dwarf_Die *type_die)
+   Dwarf_Die *type_die, Dwarf_Global *global_die)
 {
Dwarf_Die vr_die;
int ret = -1;
@@ -1189,8 +1200,11 @@ static int find_data_type_global_reg(struct 
data_loc_info *dloc, int reg, Dwarf_
if (dwarf_offdie(dloc->di->dbg, var_types->die_off, 
&vr_die)) {
if (die_get_real_type(&vr_die, type_die) == 
NULL) {
dloc->type_offset = 0;
+   global_die->name = var_typ

[PATCH V2 8/9] tools/perf: Add support to find global register variables using find_data_type_global_reg

2024-05-06 Thread Athira Rajeev
There are cases where a global register variable is defined and associated
with a specified register. For example, on powerpc, two registers are
defined to represent variables:
1. r13: represents local_paca
register struct paca_struct *local_paca asm("r13");

2. r1: represents stack_pointer
register void *__stack_pointer asm("r1");

These registers are present in the DWARF debug information as DW_OP_reg,
as part of the variables in the cu_die (compile unit). They are not found
by the die search done over the list of nested scopes, since they are
global register variables.

Example for local_paca represented by r13:

<<>>
 <1><18dc6b4>: Abbrev Number: 128 (DW_TAG_variable)
<18dc6b6>   DW_AT_name: (indirect string, offset: 0x3861): 
local_paca
<18dc6ba>   DW_AT_decl_file   : 48
<18dc6bb>   DW_AT_decl_line   : 36
<18dc6bc>   DW_AT_decl_column : 30
<18dc6bd>   DW_AT_type: <0x18dc6c3>
<18dc6c1>   DW_AT_external: 1
<18dc6c1>   DW_AT_location: 1 byte block: 5d(DW_OP_reg13 (r13))

Where  DW_AT_type : <0x18dc6c3> further points to :

 <1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
<18dc6c4>   DW_AT_byte_size   : 8
<18dc6c4>   DW_AT_type: <0x18dc353>

which belongs to:

 <1><18dc353>: Abbrev Number: 67 (DW_TAG_structure_type)
<18dc354>   DW_AT_name: (indirect string, offset: 0x56cd): 
paca_struct
<18dc358>   DW_AT_byte_size   : 2944
<18dc35a>   DW_AT_alignment   : 128
<18dc35b>   DW_AT_decl_file   : 48
<18dc35c>   DW_AT_decl_line   : 61
<18dc35d>   DW_AT_decl_column : 8
<18dc35d>   DW_AT_sibling : <0x18dc6b4>
<<>>

The case is similar for "r1".

<<>>
 <1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
<18dd774>   DW_AT_name: (indirect string, offset: 0x11ba): 
current_stack_pointer
<18dd778>   DW_AT_decl_file   : 51
<18dd779>   DW_AT_decl_line   : 1468
<18dd77b>   DW_AT_decl_column : 24
<18dd77c>   DW_AT_type: <0x18da5cd>
<18dd780>   DW_AT_external: 1
<18dd780>   DW_AT_location: 1 byte block: 51(DW_OP_reg1 (r1))

 where 18da5cd is:

 <1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
<18da5ce>   DW_AT_byte_size   : 8
<18da5cf>   DW_AT_encoding: 7   (unsigned)
<18da5d0>   DW_AT_name: (indirect string, offset: 0x55c7): long 
unsigned int
<<>>

To identify the data type for these two special cases, iterate over the
variables in the CU die (compile unit) and match them with the register.
If the variable is a base type, i.e. die_get_real_type returns NULL
here, set the offset to zero. With these changes, the data type is
identified as "paca_struct" for r13 and as "long unsigned int" for r1.
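
The lookup described above can be sketched as follows. This is a simplified
model, not the libdw-based code in the diff: "struct cu_var" and
"find_global_reg_var" are hypothetical names standing in for the collected
variable list and the matching loop.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a collected CU-level variable. */
struct cu_var {
	int reg;		/* DWARF register number, e.g. 13 for r13 */
	const char *name;	/* DW_AT_name of the variable */
	int is_base_type;	/* non-zero when the type has no members */
	struct cu_var *next;
};

/*
 * Walk the variable list and return the first entry bound to @reg.
 * For a base type the access offset is forced to zero, as described above.
 */
static const struct cu_var *find_global_reg_var(const struct cu_var *vars,
						int reg, int *type_offset)
{
	for (; vars; vars = vars->next) {
		if (vars->reg != reg)
			continue;
		if (vars->is_base_type)
			*type_offset = 0;
		return vars;
	}
	return NULL;
}
```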

Snippet from ./perf report -s type,type_off

12.85%  long unsigned int  long unsigned int +0 (no field)
 4.68%  struct paca_struct  struct paca_struct +2312 (__current)
 4.57%  struct paca_struct  struct paca_struct +2354 (irq_soft_mask)

Signed-off-by: Athira Rajeev 
---
 tools/perf/util/annotate-data.c  | 40 
 tools/perf/util/annotate.c   |  8 ++
 tools/perf/util/annotate.h   |  1 +
 tools/perf/util/include/dwarf-regs.h |  1 +
 4 files changed, 50 insertions(+)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index e22ba35c93b2..ab2168c4ef41 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1169,6 +1169,40 @@ static int find_data_type_block(struct data_loc_info 
*dloc,
return ret;
 }
 
+/*
+ * Handle cases where a global register variable is defined and
+ * associated with a specified register. These regs are
+ * present in dwarf debug as DW_OP_reg as part of variables
+ * in the cu_die (compile unit). Iterate over variables in the
+ * cu_die and match with reg to identify data type die.
+ */
+static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, 
Dwarf_Die *cu_die,
+   Dwarf_Die *type_die)
+{
+   Dwarf_Die vr_die;
+   int ret = -1;
+   struct die_var_type *var_types = NULL;
+
+   die_collect_vars(cu_die, &var_types);
+   while (var_types) {
+   if (var_types->reg == reg) {
+   if (dwarf_offdie(dloc->di->dbg, var_types->die_off, 
&vr_die)) {
+   if (die_get_real_type(&vr_die, type_die) == 
NULL) {
+   dloc->type_offset = 0;
+   dwarf_o

[PATCH V2 7/9] tools/perf: Update instruction tracking with add instruction

2024-05-06 Thread Athira Rajeev
Update instruction tracking with the add instruction. Apart from the "mr"
instruction, the register state is carried on by other insns, i.e.
"add, addi, addis". Since these are not memory instructions and don't
fall in the primary opcode range (32 to 63), add them to the mnemonic
table. For now, only the add* instructions are added; more may be added
here later. The binary code is still used to extract the regs, so
associate these instructions with "load_store_ops" itself; no other
changes are required.
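
The selection rule described above can be sketched as a single predicate.
This is an illustration only: "uses_load_store_ops" and "tracked_names" are
hypothetical names, and the real code goes through the arch instruction
table and "ins__find" rather than a helper like this.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mnemonics tracked by name because their primary opcode is below 32. */
static const char *const tracked_names[] = { "mr", "add", "addi", "addis" };

/*
 * Hypothetical predicate: an instruction is associated with load_store_ops
 * if its primary opcode is in the 32..63 memory range, or if its mnemonic
 * is in the small name table.
 */
static bool uses_load_store_ops(const char *name, int opcode)
{
	size_t i;

	if (opcode >= 32 && opcode <= 63)
		return true;
	for (i = 0; i < sizeof(tracked_names) / sizeof(tracked_names[0]); i++) {
		if (!strcmp(name, tracked_names[i]))
			return true;
	}
	return false;
}
```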

Signed-off-by: Athira Rajeev 
---
 .../perf/arch/powerpc/annotate/instructions.c | 21 +++
 tools/perf/util/disasm.c  |  1 +
 2 files changed, 22 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c 
b/tools/perf/arch/powerpc/annotate/instructions.c
index cce7023951fe..1f35d8a65bb4 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -1,6 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 
+/*
+ * powerpc instruction mnemonic table to associate load/store instructions with
+ * move_ops. mov_ops is used to identify add/mr to do instruction tracking.
+ */
+static struct ins powerpc__instructions[] = {
+   { .name = "mr", .ops = &load_store_ops,  },
+   { .name = "addi",   .ops = &load_store_ops,   },
+   { .name = "addis",  .ops = &load_store_ops,  },
+   { .name = "add",.ops = &load_store_ops,  },
+};
+
 static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, 
const char *name)
 {
int i;
@@ -75,6 +86,9 @@ static void update_insn_state_powerpc(struct type_state 
*state,
if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
return;
 
+   if (!strncmp(dl->ins.name, "add", 3))
+   goto regs_check;
+
if (strncmp(dl->ins.name, "mr", 2))
return;
 
@@ -85,6 +99,7 @@ static void update_insn_state_powerpc(struct type_state 
*state,
dst->reg1 = src_reg;
}
 
+regs_check:
if (!has_reg_type(state, dst->reg1))
return;
 
@@ -115,6 +130,12 @@ static void update_insn_state_powerpc(struct type_state 
*state __maybe_unused, s
 static int powerpc__annotate_init(struct arch *arch, char *cpuid 
__maybe_unused)
 {
if (!arch->initialized) {
+   arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
+   arch->instructions = calloc(arch->nr_instructions, 
sizeof(struct ins));
+   if (!arch->instructions)
+   return -ENOMEM;
+   memcpy(arch->instructions, powerpc__instructions, sizeof(struct 
ins) * arch->nr_instructions);
+   arch->nr_instructions_allocated = arch->nr_instructions;
arch->initialized = true;
arch->associate_instruction_ops = 
powerpc__associate_instruction_ops;
arch->objdump.comment_char  = '#';
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index ac6b8b8da38a..32cf506a9010 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -36,6 +36,7 @@ static struct ins_ops mov_ops;
 static struct ins_ops nop_ops;
 static struct ins_ops lock_ops;
 static struct ins_ops ret_ops;
+static struct ins_ops load_store_ops;
 
 static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
   struct ins_operands *ops, int max_ins_name);
-- 
2.43.0



[PATCH V2 6/9] tools/perf: Update instruction tracking for powerpc

2024-05-06 Thread Athira Rajeev
Add instruction tracking function "update_insn_state_powerpc" for
powerpc. Example sequence in powerpc:

ld  r10,264(r3)
mr  r31,r3
<
ld  r9,312(r31)

Consider that the sample is pointing to "ld r9,312(r31)".
Here the memory reference is hit at "312(r31)", where 312 is the offset
and r31 is the source register. The previous instruction sequence shows
that the register state of r3 is moved to r31. So to identify the data
type for the r31 access, the previous instruction ("mr") needs to be
tracked and the state type entry has to be updated. The current
instruction tracking support in the perf tools infrastructure is specific
to x86. This patch adds it for powerpc and adds the "mr" instruction to
be tracked.
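
A reduced model of this tracking step, assuming a flat array of per-register
states. "struct reg_state", "track_mr" and the integer "type_id" are
hypothetical simplifications of the type state table used in the diff.

```c
#include <stdbool.h>

#define NR_REGS 32

/* Hypothetical, reduced version of the per-register type state. */
struct reg_state {
	int type_id;	/* stand-in for the DWARF type of the value */
	bool ok;	/* true when type_id is known to be valid */
};

/*
 * Model "mr dst,src": the destination inherits the source's type state.
 * If the source type is unknown, the destination becomes unknown too.
 */
static void track_mr(struct reg_state regs[NR_REGS], int dst, int src)
{
	if (dst < 0 || dst >= NR_REGS)
		return;
	if (src < 0 || src >= NR_REGS || !regs[src].ok) {
		regs[dst].ok = false;
		return;
	}
	regs[dst] = regs[src];
}
```

With the sequence above, tracking "mr r31,r3" copies the state of r3 into
r31, so a later "ld r9,312(r31)" can be resolved against the type that was
known for r3.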

Signed-off-by: Athira Rajeev 
---
 .../perf/arch/powerpc/annotate/instructions.c | 63 +++
 tools/perf/util/annotate-data.c   |  9 ++-
 tools/perf/util/disasm.c  |  1 +
 3 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c 
b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..cce7023951fe 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -49,6 +49,69 @@ static struct ins_ops 
*powerpc__associate_instruction_ops(struct arch *arch, con
return ops;
 }
 
+/*
+ * Instruction tracking function to track register state moves.
+ * Example sequence:
+ *ld  r10,264(r3)
+ *mr  r31,r3
+ *<
+ *ld  r9,312(r31)
+ *
+ * Previous instruction sequence shows that register state of r3
+ * is moved to r31. update_insn_state_powerpc tracks these state
+ * changes
+ */
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_powerpc(struct type_state *state,
+   struct data_loc_info *dloc, Dwarf_Die *cu_die __maybe_unused,
+   struct disasm_line *dl)
+{
+   struct annotated_insn_loc loc;
+   struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+   struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+   struct type_state_reg *tsr;
+   u32 insn_offset = dl->al.offset;
+
+   if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+   return;
+
+   if (strncmp(dl->ins.name, "mr", 2))
+   return;
+
+   if (!strncmp(dl->ins.name, "mr", 2)) {
+   int src_reg = src->reg1;
+
+   src->reg1 = dst->reg1;
+   dst->reg1 = src_reg;
+   }
+
+   if (!has_reg_type(state, dst->reg1))
+   return;
+
+   tsr = &state->regs[dst->reg1];
+
+   if (!has_reg_type(state, src->reg1) ||
+   !state->regs[src->reg1].ok) {
+   tsr->ok = false;
+   return;
+   }
+
+   tsr->type = state->regs[src->reg1].type;
+   tsr->kind = state->regs[src->reg1].kind;
+   tsr->ok = true;
+
+   pr_debug("mov [%x] reg%d -> reg%d",
+   insn_offset, src->reg1, dst->reg1);
+   pr_debug_type_name(&tsr->type, tsr->kind);
+}
+#else /* HAVE_DWARF_SUPPORT */
+static void update_insn_state_powerpc(struct type_state *state __maybe_unused, 
struct data_loc_info *dloc __maybe_unused,
+   Dwarf_Die *cu_die __maybe_unused, struct disasm_line *dl 
__maybe_unused)
+{
+   return;
+}
+#endif /* HAVE_DWARF_SUPPORT */
+
 static int powerpc__annotate_init(struct arch *arch, char *cpuid 
__maybe_unused)
 {
if (!arch->initialized) {
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 9d6d4f472c85..e22ba35c93b2 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1079,6 +1079,13 @@ static int find_data_type_insn(struct data_loc_info 
*dloc,
return ret;
 }
 
+static int arch_supports_insn_tracking(struct data_loc_info *dloc)
+{
+   if ((arch__is(dloc->arch, "x86")) || (arch__is(dloc->arch, "powerpc")))
+   return 1;
+   return 0;
+}
+
 /*
  * Construct a list of basic blocks for each scope with variables and try to 
find
  * the data type by updating a type state table through instructions.
@@ -1093,7 +1100,7 @@ static int find_data_type_block(struct data_loc_info 
*dloc,
int ret = -1;
 
/* TODO: other architecture support */
-   if (!arch__is(dloc->arch, "x86"))
+   if (!arch_supports_insn_tracking(dloc))
return -1;
 
prev_dst_ip = dst_ip = dloc->ip;
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index f41a0fadeab4..ac6b8b8da38a 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -151,6 +151,7 @@ static struct arch architectures[] = {
{
.name = "powerpc",
.init = powerpc__annotate_init,
+   .update_insn_state = update_insn_state_powerpc,
},
{
.name = "riscv64",
-- 
2.43.0



[PATCH V2 5/9] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc

2024-05-06 Thread Athira Rajeev
Use the raw instruction code and macros to identify memory instructions
and to extract the register fields and the offset. The implementation
addresses the D-form, X-form and DS-form instructions. Two main functions
are added. The new parse function "load_store__parse" is the instruction
ops parser for memory instructions. Unlike other parsers (like
mov__parse), this parser fills in only the "raw" field for source/target
and the newly added "mem_ref" field. This is because there is no need to
parse the disassembled code here; arch-specific macros take care of
extracting the offset and regs, which is easier and more precise.

On powerpc, all instructions with a primary opcode from 32 to 63
are memory instructions. Update the "ins__find" function to also take
"opcode" as a parameter. Instead of "extract_reg_offset", use the newly
added function "get_arch_regs", which sets these fields: reg1, reg2 and
offset, depending on whether it is the source or target operand.
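
A sketch of the offset extraction for D-form and DS-form instructions. The
function name "ppc_mem_offset" is hypothetical. Note one deliberate
difference from the macros in this patch: the sketch sign-extends the 16-bit
displacement field, which is an assumption added here so that negative
displacements come out as negative numbers.

```c
#include <stdint.h>

#define PPC_OP(insn)	(((insn) >> 26) & 0x3f)

/*
 * Return the byte displacement encoded in a D-form or DS-form instruction.
 * Primary opcodes 58 and 62 are treated as DS-form, as in this patch: the
 * low two bits belong to the extended opcode and are masked off.
 * Assumption of this sketch: the 16-bit field is sign-extended.
 */
static int ppc_mem_offset(uint32_t insn)
{
	uint32_t op = PPC_OP(insn);
	uint32_t field = insn & 0xffff;

	if (op == 58 || op == 62)
		field &= 0xfffc;
	return (field & 0x8000) ? (int)field - 0x10000 : (int)field;
}
```

For the word 0xe8810138 ("ld r4,312(r1)", primary opcode 58) this yields
312.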

Signed-off-by: Athira Rajeev 
---
 tools/perf/arch/powerpc/util/dwarf-regs.c | 33 +
 tools/perf/util/annotate.c| 22 -
 tools/perf/util/disasm.c  | 59 +--
 tools/perf/util/disasm.h  |  4 +-
 tools/perf/util/include/dwarf-regs.h  |  4 +-
 5 files changed, 114 insertions(+), 8 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c 
b/tools/perf/arch/powerpc/util/dwarf-regs.c
index e60a71fd846e..3121c70dc0d3 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -102,6 +102,9 @@ int regs_query_register_offset(const char *name)
 #define PPC_OP(op)  (((op) >> 26) & 0x3F)
 #define PPC_RA(a)  (((a) >> 16) & 0x1f)
 #define PPC_RT(t)  (((t) >> 21) & 0x1f)
+#define PPC_RB(b)  (((b) >> 11) & 0x1f)
+#define PPC_D(D)   ((D) & 0xfffe)
+#define PPC_DS(DS) ((DS) & 0xfffc)
 
 int get_opcode_insn(unsigned int raw_insn)
 {
@@ -117,3 +120,33 @@ int get_target_reg(unsigned int raw_insn)
 {
return PPC_RT(raw_insn);
 }
+
+int get_offset_opcode(int raw_insn __maybe_unused)
+{
+   int opcode = PPC_OP(raw_insn);
+
+   /* DS- form */
+   if ((opcode == 58) || (opcode == 62))
+   return PPC_DS(raw_insn);
+   else
+   return PPC_D(raw_insn);
+}
+
+/*
+ * Fills the required fields for op_loc depending on whether it
+ * is a source or target.
+ * D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
+ * DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
+ * X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
+ */
+void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused, 
struct annotated_op_loc *op_loc __maybe_unused)
+{
+   if (is_source)
+   op_loc->reg1 = get_source_reg(raw_insn);
+   else
+   op_loc->reg1 = get_target_reg(raw_insn);
+   if (op_loc->multi_regs)
+   op_loc->reg2 = PPC_RB(raw_insn);
+   if (op_loc->mem_ref)
+   op_loc->offset = get_offset_opcode(raw_insn);
+}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0f5e10654d09..48739c7ffdc7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2073,6 +2073,12 @@ static int extract_reg_offset(struct arch *arch, const 
char *str,
return 0;
 }
 
+__weak void get_arch_regs(int raw_insn __maybe_unused, int is_source 
__maybe_unused,
+   struct annotated_op_loc *op_loc __maybe_unused)
+{
+   return;
+}
+
 /**
  * annotate_get_insn_location - Get location of instruction
  * @arch: the architecture info
@@ -2117,10 +2123,12 @@ int annotate_get_insn_location(struct arch *arch, 
struct disasm_line *dl,
for_each_insn_op_loc(loc, i, op_loc) {
const char *insn_str = ops->source.raw;
bool multi_regs = ops->source.multi_regs;
+   bool mem_ref = ops->source.mem_ref;
 
if (i == INSN_OP_TARGET) {
insn_str = ops->target.raw;
multi_regs = ops->target.multi_regs;
+   mem_ref = ops->target.mem_ref;
}
 
/* Invalidate the register by default */
@@ -2130,7 +2138,19 @@ int annotate_get_insn_location(struct arch *arch, struct 
disasm_line *dl,
if (insn_str == NULL)
continue;
 
-   if (strchr(insn_str, arch->objdump.memory_ref_char)) {
+   /*
+* For powerpc, call get_arch_regs function which extracts the
+* required fields for op_loc, ie reg1, reg2, offset from the
+* raw instruction.
+*/
+   if (arch__is(arch, "powerpc")) {
+  

[PATCH V2 2/9] tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking

2024-05-06 Thread Athira Rajeev
Add an "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently, updating the instruction state is handled by the
static function "update_insn_state_x86", which is defined in
"annotate-data.c". Make this a callback for the specific arch and move it
to the arch-specific file "arch/x86/annotate/instructions.c". This will
help to add helper functions for other platforms in the file
"arch//annotate/instructions.c" and make changes/updates easier.

Define the callback "update_insn_state" as part of "struct arch", and also
make some of the debug functions non-static so that they can be referenced
from other places.
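
The callback arrangement can be pictured with a reduced model. All names
here ("struct arch_desc", "struct insn_state", "dispatch_update") are
hypothetical; the real "struct arch" and the callback signature in the diff
carry more state.

```c
#include <stddef.h>
#include <string.h>

struct insn_state { int updates; };	/* hypothetical, reduced state */

/* Reduced "struct arch": a name plus an optional tracking callback. */
struct arch_desc {
	const char *name;
	void (*update_insn_state)(struct insn_state *state, const char *insn);
};

static void update_insn_state_x86(struct insn_state *state, const char *insn)
{
	(void)insn;	/* a real callback would inspect the instruction */
	state->updates++;
}

static const struct arch_desc arch_table[] = {
	{ .name = "x86",     .update_insn_state = update_insn_state_x86 },
	{ .name = "riscv64", .update_insn_state = NULL },
};

/* Generic code: call the arch hook if one is registered. Returns 1 if called. */
static int dispatch_update(const char *arch_name, struct insn_state *state,
			   const char *insn)
{
	size_t i;

	for (i = 0; i < sizeof(arch_table) / sizeof(arch_table[0]); i++) {
		if (strcmp(arch_table[i].name, arch_name))
			continue;
		if (!arch_table[i].update_insn_state)
			return 0;
		arch_table[i].update_insn_state(state, insn);
		return 1;
	}
	return 0;
}
```

The point of the design is that generic code no longer names an
architecture: adding tracking for another platform means filling in one more
function pointer.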

Signed-off-by: Athira Rajeev 
---
 tools/perf/arch/x86/annotate/instructions.c | 383 +++
 tools/perf/util/annotate-data.c | 391 +---
 tools/perf/util/annotate-data.h |  23 ++
 tools/perf/util/disasm.c|   2 +
 tools/perf/util/disasm.h|   7 +
 5 files changed, 423 insertions(+), 383 deletions(-)

diff --git a/tools/perf/arch/x86/annotate/instructions.c 
b/tools/perf/arch/x86/annotate/instructions.c
index 5cdf457f5cbe..cd2fa59a8034 100644
--- a/tools/perf/arch/x86/annotate/instructions.c
+++ b/tools/perf/arch/x86/annotate/instructions.c
@@ -206,3 +206,386 @@ static int x86__annotate_init(struct arch *arch, char 
*cpuid)
arch->initialized = true;
return err;
 }
+
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_x86(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die *cu_die,
+ struct disasm_line *dl)
+{
+   struct annotated_insn_loc loc;
+   struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+   struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+   struct type_state_reg *tsr;
+   Dwarf_Die type_die;
+   u32 insn_offset = dl->al.offset;
+   int fbreg = dloc->fbreg;
+   int fboff = 0;
+
+   if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+   return;
+
+   if (ins__is_call(&dl->ins)) {
+   struct symbol *func = dl->ops.target.sym;
+
+   if (func == NULL)
+   return;
+
+   /* __fentry__ will preserve all registers */
+   if (!strcmp(func->name, "__fentry__"))
+   return;
+
+   pr_debug_dtp("call [%x] %s\n", insn_offset, func->name);
+
+   /* Otherwise invalidate caller-saved registers after call */
+   for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
+   if (state->regs[i].caller_saved)
+   state->regs[i].ok = false;
+   }
+
+   /* Update register with the return type (if any) */
+   if (die_find_func_rettype(cu_die, func->name, &type_die)) {
+   tsr = &state->regs[state->ret_reg];
+   tsr->type = type_die;
+   tsr->kind = TSR_KIND_TYPE;
+   tsr->ok = true;
+
+   pr_debug_dtp("call [%x] return -> reg%d",
+insn_offset, state->ret_reg);
+   pr_debug_type_name(_die, tsr->kind);
+   }
+   return;
+   }
+
+   if (!strncmp(dl->ins.name, "add", 3)) {
+   u64 imm_value = -1ULL;
+   int offset;
+   const char *var_name = NULL;
+   struct map_symbol *ms = dloc->ms;
+   u64 ip = ms->sym->start + dl->al.offset;
+
+   if (!has_reg_type(state, dst->reg1))
+   return;
+
+   tsr = >regs[dst->reg1];
+
+   if (src->imm)
+   imm_value = src->offset;
+   else if (has_reg_type(state, src->reg1) &&
+state->regs[src->reg1].kind == TSR_KIND_CONST)
+   imm_value = state->regs[src->reg1].imm_value;
+   else if (src->reg1 == DWARF_REG_PC) {
+   u64 var_addr = annotate_calc_pcrel(dloc->ms, ip,
+  src->offset, dl);
+
+   if (get_global_var_info(dloc, var_addr,
+   _name, ) &&
+   !strcmp(var_name, "this_cpu_off") &&
+   tsr->kind == TSR_KIND_CONST) {
+   tsr->kind = TSR_KIND_PERCPU_BASE;
+   imm_value = tsr->imm_value;
+   }
+   }
+   else
+   return;
+
+   if (tsr->kind != TSR_KIND_PERCPU_

[PATCH V2 3/9] tools/perf: Fix a comment about multi_regs in extract_reg_offset function

2024-05-06 Thread Athira Rajeev
Fix a comment in extract_reg_offset() which explains how the multi_regs
field gets set for an instruction. In the example "mov  %rsi, 8(%rbx,%rcx,4)",
the comment mistakenly referred to "dst_multi_regs = 0". Correct it to use
"src_multi_regs = 0".

Signed-off-by: Athira Rajeev 
---
 tools/perf/util/annotate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f5b6b5e5e757..0f5e10654d09 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2093,7 +2093,7 @@ static int extract_reg_offset(struct arch *arch, const char *str,
  *   mov  0x18, %r8  # src_reg1 = -1, src_mem = 0
  *   # dst_reg1 = r8, dst_mem = 0
  *
- *   mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, dst_multi_regs = 0
+ *   mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, src_multi_regs = 0
  *  # dst_reg1 = rbx, dst_reg2 = rcx, dst_mem = 1
  *  # dst_multi_regs = 1, dst_offset = 8
  */
-- 
2.43.0



[PATCH V2 1/9] tools/perf: Move the data structures related to register type to header file

2024-05-06 Thread Athira Rajeev
Data type profiling uses instruction tracking by checking each
instruction and updating the register type state in some data
structures. This is useful to find the data type in cases where the
register state gets transferred from one register to another, for
example by the "mov" instruction on x86 or the "mr" instruction on
powerpc. Currently these structures are defined in annotate-data.c and
instruction tracking is implemented only for x86. Move these data
structures to the "annotate-data.h" header file so that other arch
implementations can use them in arch specific files as well.
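
The register-to-register transfer described above can be illustrated with a
small sketch. This is not the perf implementation: the sketch_* names, the
string-based type field and the fixed register count are assumptions made
only for illustration. It shows the idea that a move copies the tracked type
from the source register to the destination, and that an unknown source
leaves the destination without a valid type.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_REGS 32	/* hypothetical size, for illustration only */

/* Hypothetical, simplified stand-in for the per-register type state. */
struct sketch_reg_state {
	const char *type_name;	/* e.g. "struct rq *"; NULL when unknown */
	bool ok;		/* true when type_name is valid */
};

struct sketch_state {
	struct sketch_reg_state regs[SKETCH_MAX_REGS];
};

static bool sketch_has_reg(int reg)
{
	return (unsigned int)reg < SKETCH_MAX_REGS;
}

/* Model a register-to-register move: dst takes over whatever src holds. */
static void sketch_track_move(struct sketch_state *st, int src, int dst)
{
	if (!sketch_has_reg(src) || !sketch_has_reg(dst))
		return;

	if (!st->regs[src].ok) {
		/* Unknown source: the old type in dst is no longer valid. */
		st->regs[dst].ok = false;
		return;
	}
	st->regs[dst] = st->regs[src];
}
```

In this sketch an arch specific file would call sketch_track_move() when it
sees "mov" on x86 or "mr" on powerpc, which is why the state structures need
to be visible outside annotate-data.c.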

Signed-off-by: Athira Rajeev 
---
 tools/perf/util/annotate-data.c | 53 +--
 tools/perf/util/annotate-data.h | 55 +
 2 files changed, 56 insertions(+), 52 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 2c98813f95cd..e812dec09c99 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -30,15 +30,6 @@
 
 static void delete_var_types(struct die_var_type *var_types);
 
-enum type_state_kind {
-   TSR_KIND_INVALID = 0,
-   TSR_KIND_TYPE,
-   TSR_KIND_PERCPU_BASE,
-   TSR_KIND_CONST,
-   TSR_KIND_POINTER,
-   TSR_KIND_CANARY,
-};
-
 #define pr_debug_dtp(fmt, ...) \
 do {   \
if (debug_type_profile) \
@@ -139,49 +130,7 @@ static void pr_debug_location(Dwarf_Die *die, u64 pc, int reg)
}
 }
 
-/*
- * Type information in a register, valid when @ok is true.
- * The @caller_saved registers are invalidated after a function call.
- */
-struct type_state_reg {
-   Dwarf_Die type;
-   u32 imm_value;
-   bool ok;
-   bool caller_saved;
-   u8 kind;
-};
-
-/* Type information in a stack location, dynamically allocated */
-struct type_state_stack {
-   struct list_head list;
-   Dwarf_Die type;
-   int offset;
-   int size;
-   bool compound;
-   u8 kind;
-};
-
-/* FIXME: This should be arch-dependent */
-#define TYPE_STATE_MAX_REGS  16
-
-/*
- * State table to maintain type info in each register and stack location.
- * It'll be updated when new variable is allocated or type info is moved
- * to a new location (register or stack).  As it'd be used with the
- * shortest path of basic blocks, it only maintains a single table.
- */
-struct type_state {
-   /* state of general purpose registers */
-   struct type_state_reg regs[TYPE_STATE_MAX_REGS];
-   /* state of stack location */
-   struct list_head stack_vars;
-   /* return value register */
-   int ret_reg;
-   /* stack pointer register */
-   int stack_reg;
-};
-
-static bool has_reg_type(struct type_state *state, int reg)
+bool has_reg_type(struct type_state *state, int reg)
 {
return (unsigned)reg < ARRAY_SIZE(state->regs);
 }
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 0a57d9f5ee78..ef235b1b15e1 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -6,6 +6,9 @@
 #include 
 #include 
 #include 
+#include "dwarf-aux.h"
+#include "annotate.h"
+#include "debuginfo.h"
 
 struct annotated_op_loc;
 struct debuginfo;
@@ -15,6 +18,15 @@ struct hist_entry;
 struct map_symbol;
 struct thread;
 
+enum type_state_kind {
+   TSR_KIND_INVALID = 0,
+   TSR_KIND_TYPE,
+   TSR_KIND_PERCPU_BASE,
+   TSR_KIND_CONST,
+   TSR_KIND_POINTER,
+   TSR_KIND_CANARY,
+};
+
 /**
  * struct annotated_member - Type of member field
  * @node: List entry in the parent list
@@ -142,6 +154,48 @@ struct annotated_data_stat {
 };
 extern struct annotated_data_stat ann_data_stat;
 
+/*
+ * Type information in a register, valid when @ok is true.
+ * The @caller_saved registers are invalidated after a function call.
+ */
+struct type_state_reg {
+   Dwarf_Die type;
+   u32 imm_value;
+   bool ok;
+   bool caller_saved;
+   u8 kind;
+};
+
+/* Type information in a stack location, dynamically allocated */
+struct type_state_stack {
+   struct list_head list;
+   Dwarf_Die type;
+   int offset;
+   int size;
+   bool compound;
+   u8 kind;
+};
+
+/* FIXME: This should be arch-dependent */
+#define TYPE_STATE_MAX_REGS  32
+
+/*
+ * State table to maintain type info in each register and stack location.
+ * It'll be updated when new variable is allocated or type info is moved
+ * to a new location (register or stack).  As it'd be used with the
+ * shortest path of basic blocks, it only maintains a single table.
+ */
+struct type_state {
+   /* state of general purpose registers */
+   struct type_state_reg regs[TYPE_STATE_MAX_REGS];
+   /* state of stack location */
+   struct list_head stack_vars;
+   /* return value register */
+   int r

[PATCH V2 0/9] Add data type profiling support for powerpc

2024-05-06 Thread Athira Rajeev
03%  struct pt_regs  struct pt_regs +264 (user_regs.msr)
 1.00%  struct menu_device  struct menu_device +4 (tick_wakeup)
 0.90%  struct security_hook_list  struct security_hook_list +0 (list.next)
 0.76%  struct irq_desc  struct irq_desc +304 (irq_data.chip)
 0.76%  struct rq  struct rq +2856 (cpu)

Thanks
Athira Rajeev

Changelog:
From v1->v2:
- Addressed suggestion from Christophe Leroy and Segher Boessenkool
  to use the binary code (raw insn) to fetch opcode, register and
  offset fields.
- Added support for instruction tracking in powerpc
- Find the register defined variables (r13 and r1, which point to
  local_paca and current_stack_pointer in powerpc)

Athira Rajeev (9):
  tools/perf: Move the data structures related to register type to
header file
  tools/perf: Add "update_insn_state" callback function to handle arch
specific instruction tracking
  tools/perf: Fix a comment about multi_regs in extract_reg_offset
function
  tools/perf: Add support to capture and parse raw instruction in
objdump
  tools/perf: Update parameters for reg extract functions to use raw
instruction on powerpc
  tools/perf: Update instruction tracking for powerpc
  tools/perf: Update instruction tracking with add instruction
  tools/perf: Add support to find global register variables using
find_data_type_global_reg
  tools/perf: Add support for global_die to capture name of variable in
case of register defined variable

 tools/include/linux/string.h  |   2 +
 tools/lib/string.c|  13 +
 .../perf/arch/powerpc/annotate/instructions.c |  84 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c |  52 ++
 tools/perf/arch/x86/annotate/instructions.c   | 383 +
 tools/perf/util/annotate-data.c   | 519 +++---
 tools/perf/util/annotate-data.h   |  78 +++
 tools/perf/util/annotate.c|  32 +-
 tools/perf/util/annotate.h|   1 +
 tools/perf/util/disasm.c  | 109 +++-
 tools/perf/util/disasm.h  |  17 +-
 tools/perf/util/dwarf-aux.c   |   1 +
 tools/perf/util/dwarf-aux.h   |   1 +
 tools/perf/util/include/dwarf-regs.h  |  12 +
 tools/perf/util/sort.c|   7 +-
 15 files changed, 854 insertions(+), 457 deletions(-)

-- 
2.43.0



Re: [PATCH 2/3] tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc

2024-03-18 Thread Athira Rajeev



> On 09-Mar-2024, at 11:13 PM, Segher Boessenkool  
> wrote:
> 
> All instructions with a primary opcode from 32 to 63 are memory insns,
> and no others.  It's trivial to see whether it is a load or store, too
> (just bit 3 of the insn).  Trying to parse disassembled code is much
> harder, and you easily make some mistakes here.

Hi Segher

Thanks for checking the patch and sharing review comments.

Ok, I am checking on this part.

> 
> On Sat, Mar 09, 2024 at 12:55:12PM +0530, Athira Rajeev wrote:
>> To identify if the instruction has any memory reference,
>> "memory_ref_char" field needs to be set for specific architecture.
>> Example memory instruction:
>> lwz r10,0(r9)
>> 
>> Here "(" is the memory_ref_char. Set this as part of arch->objdump
> 
> What about "lwzx r10,0,r19", semantically exactly the same instruction?
> There is no paren in there.  Not all instructions using parens are
> memory insns, either, not in assembler code at least.
Yes, right Segher.

So, for the basic foundational patches, I targeted instructions of this 
form (D-form).
There are still samples which come up as unknown, and for those, X-form 
instructions also need to be checked.
The aim was to first get these basic foundational patches in to add support 
for powerpc, and to address the remaining "unknowns" in a follow-up.
But yes, X-form instructions will also be covered as part of the changes needed 
for powerpc.

> 
>> To get register number and access offset from the given instruction,
>> arch->objdump.register_char is used. In case of powerpc, the register
>> char is "r" (since reg names are r0 to r31). Hence set register_char
>> as "r".
> 
> cr0..cr7
> r0..r31
> f0..f31
> v0..v31
> vs0..vs63
> and many other spellings.  Plain "0..63" is also fine.
Ok 
> 
> The "0" in my lwzx example is a register field as well (and it stands
> for "no register", not for "r0").  Called "(RA|0)" usually (incidentally,
> see the parens there as well, oh joy).
> 
> Don't you have the binary code here as well, not just a disassembled
> representation of it?  It is way easier to just use that, and you'll get
> much better results that way :-)
> 

Thanks Segher for the suggestion on this. I will check on this as well.

Thanks
Athira Rajeev

> 
> Segher



Re: [PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling

2024-03-18 Thread Athira Rajeev



> On 09-Mar-2024, at 3:18 PM, Christophe Leroy  
> wrote:
> 
> 
> 
> Le 09/03/2024 à 08:25, Athira Rajeev a écrit :
>> Add powerpc instruction mnemonic table to associate load/store
>> instructions with mov_ops. mov_ops is used to identify mem_type
>> to associate instruction with data type and offset. Also initialize
>> and allocate arch specific fields for nr_instructions, instructions and
>> nr_instructions_allocated.
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>>  .../perf/arch/powerpc/annotate/instructions.c | 66 +++
>>  1 file changed, 66 insertions(+)
>> 
>> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c 
>> b/tools/perf/arch/powerpc/annotate/instructions.c
>> index a3f423c27cae..07af4442be38 100644
>> --- a/tools/perf/arch/powerpc/annotate/instructions.c
>> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
>> @@ -1,6 +1,65 @@
>>  // SPDX-License-Identifier: GPL-2.0
>>  #include 
>> 
>> +/*
>> + * powerpc instruction mnemonic table to associate load/store instructions with
>> + * mov_ops. mov_ops is used to identify mem_type to associate instruction with
>> + * data type and offset.
>> + */
>> +static struct ins powerpc__instructions[] = {
>> + { .name = "lbz", .ops = _ops,  },
>> + { .name = "lbzx", .ops = _ops,  },
>> + { .name = "lbzu", .ops = _ops,  },
>> + { .name = "lbzux", .ops = _ops,  },
>> + { .name = "lhz", .ops = _ops,  },
>> + { .name = "lhzx", .ops = _ops,  },
>> + { .name = "lhzu", .ops = _ops,  },
>> + { .name = "lhzux", .ops = _ops,  },
>> + { .name = "lha", .ops = _ops,  },
>> + { .name = "lhax", .ops = _ops,  },
>> + { .name = "lhau", .ops = _ops,  },
>> + { .name = "lhaux", .ops = _ops,  },
>> + { .name = "lwz", .ops = _ops,  },
>> + { .name = "lwzx", .ops = _ops,  },
>> + { .name = "lwzu", .ops = _ops,  },
>> + { .name = "lwzux", .ops = _ops,  },
>> + { .name = "lwa", .ops = _ops,  },
>> + { .name = "lwax", .ops = _ops,  },
>> + { .name = "lwaux", .ops = _ops,  },
>> + { .name = "ld", .ops = _ops,  },
>> + { .name = "ldx", .ops = _ops,  },
>> + { .name = "ldu", .ops = _ops,  },
>> + { .name = "ldux", .ops = _ops,  },
>> + { .name = "stb", .ops = _ops,  },
>> + { .name = "stbx", .ops = _ops,  },
>> + { .name = "stbu", .ops = _ops,  },
>> + { .name = "stbux", .ops = _ops,  },
>> + { .name = "sth", .ops = _ops,  },
>> + { .name = "sthx", .ops = _ops,  },
>> + { .name = "sthu", .ops = _ops,  },
>> + { .name = "sthux", .ops = _ops,  },
>> + { .name = "stw", .ops = _ops,  },
>> + { .name = "stwx", .ops = _ops,  },
>> + { .name = "stwu", .ops = _ops,  },
>> + { .name = "stwux", .ops = _ops,  },
>> + { .name = "std", .ops = _ops,  },
>> + { .name = "stdx", .ops = _ops,  },
>> + { .name = "stdu", .ops = _ops,  },
>> + { .name = "stdux", .ops = _ops,  },
>> + { .name = "lhbrx", .ops = _ops,  },
>> + { .name = "sthbrx", .ops = _ops,  },
>> + { .name = "lwbrx", .ops = _ops,  },
>> + { .name = "stwbrx", .ops = _ops,  },
>> + { .name = "ldbrx", .ops = _ops,  },
>> + { .name = "stdbrx", .ops = _ops,  },
>> + { .name = "lmw", .ops = _ops,  },
>> + { .name = "stmw", .ops = _ops,  },
>> + { .name = "lswi", .ops = _ops,  },
>> + { .name = "lswx", .ops = _ops,  },
>> + { .name = "stswi", .ops = _ops,  },
>> + { .name = "stswx", .ops = _ops,  },
>> +};
> 
> What about lwarx and stwcx ?
Yes, will add those in the next version.

> 
>> +
>>  static struct ins_ops *powerpc__associate_instruction_ops(struct arch 
>> *arch, const char *name)
>>  {
>>   int i;
>> @@ -52,6 +111,13 @@ static struct ins_ops 
>> *powerpc__associate_instruction_ops(struct arch *arch, con
>>  static int powerpc__annotate_init(struct arch *arch, char *cpuid 
>> __maybe_unused)
>>  {
>>   if (!arch->initialized) {
>> + arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
>> + arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins));
>> + if (arch->instructions == NULL)
> 
> Preferred form is
> 
> if (!arch->instructions)
OK, will make this change.

> 
>> + return -ENOMEM;
>> +
>> + memcpy(arch->instructions, (struct ins *)powerpc__instructions, 
>> sizeof(struct ins) * arch->nr_instructions);
> 
> No need to cast powerpc__instructions, it is already a pointer.
Yes, I will correct it


Thanks
Athira Rajeev
> 
> 
>> + arch->nr_instructions_allocated = arch->nr_instructions;
>>   arch->initialized = true;
>>   arch->associate_instruction_ops = powerpc__associate_instruction_ops;
>>   arch->objdump.comment_char  = '#';




Re: [PATCH 3/3] tools/perf/arch/powerc: Add get_arch_regnum for powerpc

2024-03-18 Thread Athira Rajeev



> On 09-Mar-2024, at 3:24 PM, Christophe Leroy  
> wrote:
> 
> 
> 
> Le 09/03/2024 à 08:25, Athira Rajeev a écrit :
>> The function get_dwarf_regnum() returns a DWARF register number
>> from a register name string. This calls arch specific function
>> get_arch_regnum to return register number for corresponding arch.
>> Add mappings for register name to register number in powerpc code:
>> arch/powerpc/util/dwarf-regs.c
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>>  tools/perf/arch/powerpc/util/dwarf-regs.c | 29 +++
>>  1 file changed, 29 insertions(+)
>> 
>> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c 
>> b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> index 0c4f4caf53ac..d955e3e577ea 100644
>> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
>> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> @@ -98,3 +98,32 @@ int regs_query_register_offset(const char *name)
>>   return roff->ptregs_offset;
>>   return -EINVAL;
>>  }
>> +
>> +struct dwarf_regs_idx {
>> + const char *name;
>> + int idx;
>> +};
>> +
>> +static const struct dwarf_regs_idx powerpc_regidx_table[] = {
>> + { "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }, { "r4", 4 },
>> + { "r5", 5 }, { "r6", 6 }, { "r7", 7 }, { "r8", 8 }, { "r9", 9 },
>> + { "r10", 10 }, { "r11", 11 }, { "r12", 12 }, { "r13", 13 }, { "r14", 14 },
>> + { "r15", 15 }, { "r16", 16 }, { "r17", 17 }, { "r18", 18 }, { "r19", 19 },
>> + { "r20", 20 }, { "r21", 21 }, { "r22", 22 }, { "r23", 23 }, { "r24", 24 },
>> + { "r25", 25 }, { "r26", 26 }, { "r27", 27 }, { "r27", 27 }, { "r28", 28 },
>> + { "r29", 29 }, { "r30", 30 }, { "r31", 31 },
>> +};
>> +
>> +int get_arch_regnum(const char *name)
>> +{
>> + unsigned int i;
>> +
>> + if (*name != 'r')
>> + return -EINVAL;
>> +
>> + for (i = 0; i < ARRAY_SIZE(powerpc_regidx_table); i++)
>> + if (!strcmp(powerpc_regidx_table[i].name, name))
>> + return powerpc_regidx_table[i].idx;
> 
> Can you make this simpler?
> 
> Something like:
> 
> int n;
> 
> if (*name != 'r')
> return -EINVAL;
> n = atoi(name + 1);
> return n >= 0 && n < 32 ? n : -ENOENT;

Hi Christophe,

Thanks for reviewing the patch and for the suggestions.

Sure, I will check this approach and address it in V2.

Thanks
Athira
> 
>> +
>> + return -ENOENT;
>> +}
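
One possible way to follow the simpler form suggested above, while rejecting
names such as "r" or "r5x" that atoi() would silently accept, is to use
strtol() and check where parsing stopped. This is only a sketch under that
assumption: sketch_regnum is a hypothetical name that mirrors the
get_arch_regnum() interface, and it is not the code that was posted in V2.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper name; mirrors the get_arch_regnum() interface. */
static int sketch_regnum(const char *name)
{
	char *end;
	long n;

	/* Require 'r' followed by at least one decimal digit. */
	if (name[0] != 'r' || name[1] < '0' || name[1] > '9')
		return -EINVAL;

	n = strtol(name + 1, &end, 10);
	/* Reject trailing characters and numbers beyond r31. */
	if (*end != '\0' || n > 31)
		return -ENOENT;

	return (int)n;
}
```

Note that this sketch still accepts spellings with leading zeros such as
"r07"; a caller that needs exact names would have to add a check for that.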



[PATCH 3/3] tools/perf/arch/powerc: Add get_arch_regnum for powerpc

2024-03-08 Thread Athira Rajeev
The function get_dwarf_regnum() returns a DWARF register number
from a register name string. This calls arch specific function
get_arch_regnum to return register number for corresponding arch.
Add mappings for register name to register number in powerpc code:
arch/powerpc/util/dwarf-regs.c

Signed-off-by: Athira Rajeev 
---
 tools/perf/arch/powerpc/util/dwarf-regs.c | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 0c4f4caf53ac..d955e3e577ea 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -98,3 +98,32 @@ int regs_query_register_offset(const char *name)
return roff->ptregs_offset;
return -EINVAL;
 }
+
+struct dwarf_regs_idx {
+   const char *name;
+   int idx;
+};
+
+static const struct dwarf_regs_idx powerpc_regidx_table[] = {
+   { "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }, { "r4", 4 },
+   { "r5", 5 }, { "r6", 6 }, { "r7", 7 }, { "r8", 8 }, { "r9", 9 },
+   { "r10", 10 }, { "r11", 11 }, { "r12", 12 }, { "r13", 13 }, { "r14", 14 },
+   { "r15", 15 }, { "r16", 16 }, { "r17", 17 }, { "r18", 18 }, { "r19", 19 },
+   { "r20", 20 }, { "r21", 21 }, { "r22", 22 }, { "r23", 23 }, { "r24", 24 },
+   { "r25", 25 }, { "r26", 26 }, { "r27", 27 }, { "r27", 27 }, { "r28", 28 },
+   { "r29", 29 }, { "r30", 30 }, { "r31", 31 },
+};
+
+int get_arch_regnum(const char *name)
+{
+   unsigned int i;
+
+   if (*name != 'r')
+   return -EINVAL;
+
+   for (i = 0; i < ARRAY_SIZE(powerpc_regidx_table); i++)
+   if (!strcmp(powerpc_regidx_table[i].name, name))
+   return powerpc_regidx_table[i].idx;
+
+   return -ENOENT;
+}
-- 
2.43.0



[PATCH 2/3] tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc

2024-03-08 Thread Athira Rajeev
To identify if the instruction has any memory reference,
"memory_ref_char" field needs to be set for specific architecture.
Example memory instruction:
lwz r10,0(r9)

Here "(" is the memory_ref_char. Set this as part of arch->objdump.

To get register number and access offset from the given instruction,
arch->objdump.register_char is used. In case of powerpc, the register
char is "r" (since reg names are r0 to r31). Hence set register_char
as "r".
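
To make the role of the two characters concrete, here is a minimal sketch of
how an operand string such as "r10,0(r9)" could be split into a base
register and an offset once memory_ref_char and register_char are known. The
helper name and its signature are hypothetical and this is not the parsing
code in annotate.c; it only handles the D-form spelling shown above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given operand text such as "r10,0(r9)", find the
 * memory reference and return the base register number and the offset.
 * Returns 0 on success, -1 when no memory operand of that shape is found.
 */
static int sketch_parse_mem_operand(const char *ops, char memory_ref_char,
				    char register_char, int *reg, long *offset)
{
	const char *paren = strchr(ops, memory_ref_char);
	const char *start;
	char *end;

	if (paren == NULL || paren[1] != register_char)
		return -1;

	/* The offset is the token just before the paren, after any comma. */
	start = paren;
	while (start > ops && start[-1] != ',')
		start--;
	if (start == paren)
		return -1;
	*offset = strtol(start, &end, 0);
	if (end != paren)
		return -1;

	/* The register number follows the register character. */
	*reg = (int)strtol(paren + 2, &end, 10);
	if (end == paren + 2 || *end != ')')
		return -1;
	return 0;
}
```

Called with '(' and 'r' on "r10,0(r9)", the sketch yields register 9 and
offset 0. An operand list without a paren, such as "r10,0,r19", is reported
as not matching, which is the limitation of text parsing raised in review.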

Signed-off-by: Athira Rajeev 
---
 tools/perf/util/annotate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ac002d907d81..d69bd6edafcb 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -216,6 +216,11 @@ static struct arch architectures[] = {
{
.name = "powerpc",
.init = powerpc__annotate_init,
+   .objdump =  {
+   .comment_char = '#',
+   .register_char = 'r',
+   .memory_ref_char = '(',
+   },
},
{
.name = "riscv64",
-- 
2.43.0



[PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling

2024-03-08 Thread Athira Rajeev
Add powerpc instruction mnemonic table to associate load/store
instructions with mov_ops. mov_ops is used to identify mem_type
to associate instruction with data type and offset. Also initialize
and allocate arch specific fields for nr_instructions, instructions and
nr_instructions_allocated.

Signed-off-by: Athira Rajeev 
---
 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 1 file changed, 66 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..07af4442be38 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -1,6 +1,65 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 
+/*
+ * powerpc instruction mnemonic table to associate load/store instructions with
+ * mov_ops. mov_ops is used to identify mem_type to associate instruction with
+ * data type and offset.
+ */
+static struct ins powerpc__instructions[] = {
+   { .name = "lbz",.ops = _ops,  },
+   { .name = "lbzx",   .ops = _ops,  },
+   { .name = "lbzu",   .ops = _ops,  },
+   { .name = "lbzux",  .ops = _ops,  },
+   { .name = "lhz",.ops = _ops,  },
+   { .name = "lhzx",   .ops = _ops,  },
+   { .name = "lhzu",   .ops = _ops,  },
+   { .name = "lhzux",  .ops = _ops,  },
+   { .name = "lha",.ops = _ops,  },
+   { .name = "lhax",   .ops = _ops,  },
+   { .name = "lhau",   .ops = _ops,  },
+   { .name = "lhaux",  .ops = _ops,  },
+   { .name = "lwz",.ops = _ops,  },
+   { .name = "lwzx",   .ops = _ops,  },
+   { .name = "lwzu",   .ops = _ops,  },
+   { .name = "lwzux",  .ops = _ops,  },
+   { .name = "lwa",.ops = _ops,  },
+   { .name = "lwax",   .ops = _ops,  },
+   { .name = "lwaux",  .ops = _ops,  },
+   { .name = "ld", .ops = _ops,  },
+   { .name = "ldx",.ops = _ops,  },
+   { .name = "ldu",.ops = _ops,  },
+   { .name = "ldux",   .ops = _ops,  },
+   { .name = "stb",.ops = _ops,  },
+   { .name = "stbx",   .ops = _ops,  },
+   { .name = "stbu",   .ops = _ops,  },
+   { .name = "stbux",  .ops = _ops,  },
+   { .name = "sth",.ops = _ops,  },
+   { .name = "sthx",   .ops = _ops,  },
+   { .name = "sthu",   .ops = _ops,  },
+   { .name = "sthux",  .ops = _ops,  },
+   { .name = "stw",.ops = _ops,  },
+   { .name = "stwx",   .ops = _ops,  },
+   { .name = "stwu",   .ops = _ops,  },
+   { .name = "stwux",  .ops = _ops,  },
+   { .name = "std",.ops = _ops,  },
+   { .name = "stdx",   .ops = _ops,  },
+   { .name = "stdu",   .ops = _ops,  },
+   { .name = "stdux",  .ops = _ops,  },
+   { .name = "lhbrx",  .ops = _ops,  },
+   { .name = "sthbrx", .ops = _ops,  },
+   { .name = "lwbrx",  .ops = _ops,  },
+   { .name = "stwbrx", .ops = _ops,  },
+   { .name = "ldbrx",  .ops = _ops,  },
+   { .name = "stdbrx", .ops = _ops,  },
+   { .name = "lmw",.ops = _ops,  },
+   { .name = "stmw",   .ops = _ops,  },
+   { .name = "lswi",   .ops = _ops,  },
+   { .name = "lswx",   .ops = _ops,  },
+   { .name = "stswi",  .ops = _ops,  },
+   { .name = "stswx",  .ops = _ops,  },
+};
+
 static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, const char *name)
 {
int i;
@@ -52,6 +111,13 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
 static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
 {
if (!arch->initialized) {
+   arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
+   arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins));
+   if (arch->instructions == NULL)
+   return -ENOMEM;
+
+   memcpy(arch->instructions, (struct ins *)powerpc__instructions, sizeof(struct ins) * arch->nr_instructions);
+   arch->nr_instructions_allocated = arch->nr_instructions;
arch->initialized = true;
arch->associate_instruction_ops = powerpc__associate_instruction_ops;
arch->objdump.comment_char  = '#';
-- 
2.43.0



[PATCH 0/3] Add data type profiling support for powerpc

2024-03-08 Thread Athira Rajeev
The patchset from Namhyung added support for data type profiling
in the perf tool. This makes it possible to associate PMU samples with
the data types they refer to, using DWARF debug information. With
upstream perf, it is currently possible to run perf report or perf
annotate to view the data type information on x86.

This patchset includes the changes needed to enable data type profiling
support for powerpc. The main changes are:
1. A powerpc instruction mnemonic table to associate load/store
instructions with mov_ops, which is used to identify whether an
instruction is a memory access.
2. To get the register number and access offset from the given
instruction, the code uses fields from "struct arch" -> objdump.
Add an entry for powerpc here.
3. A get_arch_regnum to return the register number from the
register name string.

These three patches are the basic foundational patches.
With these changes, data types are identified for kernel
and user-space samples. There are still samples which come up as
"unknown" and need to be checked. We plan to address those
in a follow-up. With the current patchset:

 ./perf record -a -e mem-loads sleep 1
 ./perf report -s type,typeoff --hierarchy --group --stdio
Snippet of logs:

 Total Lost Samples: 0

 Samples: 277  of events 'mem-loads, dummy:u'
 Event count (approx.): 149813

Overhead  Data Type / Data Type Offset
 ...  

65.93%   0.00% (unknown)
65.93%   0.00% (unknown) +0 (no field)
 8.19%   0.00% struct vm_area_struct
8.19%   0.00% struct vm_area_struct +136 (vm_file)
 4.53%   0.00% struct rq
3.14%   0.00% struct rq +0 (__lock.raw_lock.val)
0.83%   0.00% struct rq +3216 (avg_irq.runnable_sum)
0.24%   0.00% struct rq +4 (nr_running)
0.14%   0.00% struct rq +12 (nr_preferred_running)
0.12%   0.00% struct rq +2760 (sd)
0.06%   0.00% struct rq +3368 (prev_steal_time_rq)
0.01%   0.00% struct rq +2592 (curr)
 3.53%   0.00% struct rb_node
3.53%   0.00% struct rb_node +0 (__rb_parent_color)
 3.43%   0.00% struct slab
3.43%   0.00% struct slab +32 (freelist)
 3.30%   0.00% unsigned int
3.30%   0.00% unsigned int +0 (no field)
 3.22%   0.00% struct vm_fault
3.22%   0.00% struct vm_fault +48 (pmd)
 2.55%   0.00% unsigned char
2.55%   0.00% unsigned char +0 (no field)
 1.06%   0.00% struct task_struct
1.06%   0.00% struct task_struct +4 (thread_info.cpu)
 0.92%   0.00% void*
0.92%   0.00% void* +0 (no field)
 0.74%   0.00% __int128 unsigned
0.74%   0.00% __int128 unsigned +8 (no field)
 0.59%   0.00% struct perf_event
0.54%   0.00% struct perf_event +552 (ctx)
0.04%   0.00% struct perf_event +152 (pmu)
 0.20%   0.00% struct sched_entity
0.20%   0.00% struct sched_entity +0 (load.weight)
 0.18%   0.00% struct cfs_rq
0.18%   0.00% struct cfs_rq +96 (curr)

Thanks
Athira

Athira Rajeev (3):
  tools/perf/arch/powerpc: Add load/store in powerpc annotate
instructions for data type profling
  tools/erf/util/annotate: Set register_char and memory_ref_char for
powerpc
  tools/perf/arch/powerc: Add get_arch_regnum for powerpc

 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c | 29 
 tools/perf/util/annotate.c|  5 ++
 3 files changed, 100 insertions(+)

-- 
2.43.0



Re: [PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling

2024-03-08 Thread Athira Rajeev
Hi All,

Please ignore this version. I made a mistake in the cover letter. I am re-posting the 
correct version now.
Sorry for the confusion.

Thanks
Athira

> On 09-Mar-2024, at 11:21 AM, Athira Rajeev  
> wrote:
> 
> Add powerpc instruction mnemonic table to associate load/store
> instructions with mov_ops. mov_ops is used to identify mem_type
> to associate instruction with data type and offset. Also initialize
> and allocate arch specific fields for nr_instructions, instructions and
> nr_instructions_allocated.
> 
> Signed-off-by: Athira Rajeev 
> ---
> .../perf/arch/powerpc/annotate/instructions.c | 66 +++
> 1 file changed, 66 insertions(+)
> 
> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c 
> b/tools/perf/arch/powerpc/annotate/instructions.c
> index a3f423c27cae..07af4442be38 100644
> --- a/tools/perf/arch/powerpc/annotate/instructions.c
> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
> @@ -1,6 +1,65 @@
> // SPDX-License-Identifier: GPL-2.0
> #include 
> 
> +/*
> + * powerpc instruction mnemonic table to associate load/store instructions with
> + * mov_ops. mov_ops is used to identify mem_type to associate instruction with
> + * data type and offset.
> + */
> +static struct ins powerpc__instructions[] = {
> + { .name = "lbz", .ops = _ops,  },
> + { .name = "lbzx", .ops = _ops,  },
> + { .name = "lbzu", .ops = _ops,  },
> + { .name = "lbzux", .ops = _ops,  },
> + { .name = "lhz", .ops = _ops,  },
> + { .name = "lhzx", .ops = _ops,  },
> + { .name = "lhzu", .ops = _ops,  },
> + { .name = "lhzux", .ops = _ops,  },
> + { .name = "lha", .ops = _ops,  },
> + { .name = "lhax", .ops = _ops,  },
> + { .name = "lhau", .ops = _ops,  },
> + { .name = "lhaux", .ops = _ops,  },
> + { .name = "lwz", .ops = _ops,  },
> + { .name = "lwzx", .ops = _ops,  },
> + { .name = "lwzu", .ops = _ops,  },
> + { .name = "lwzux", .ops = _ops,  },
> + { .name = "lwa", .ops = _ops,  },
> + { .name = "lwax", .ops = _ops,  },
> + { .name = "lwaux", .ops = _ops,  },
> + { .name = "ld", .ops = _ops,  },
> + { .name = "ldx", .ops = _ops,  },
> + { .name = "ldu", .ops = _ops,  },
> + { .name = "ldux", .ops = _ops,  },
> + { .name = "stb", .ops = _ops,  },
> + { .name = "stbx", .ops = _ops,  },
> + { .name = "stbu", .ops = _ops,  },
> + { .name = "stbux", .ops = _ops,  },
> + { .name = "sth", .ops = _ops,  },
> + { .name = "sthx", .ops = _ops,  },
> + { .name = "sthu", .ops = _ops,  },
> + { .name = "sthux", .ops = _ops,  },
> + { .name = "stw", .ops = _ops,  },
> + { .name = "stwx", .ops = _ops,  },
> + { .name = "stwu", .ops = _ops,  },
> + { .name = "stwux", .ops = _ops,  },
> + { .name = "std", .ops = _ops,  },
> + { .name = "stdx", .ops = _ops,  },
> + { .name = "stdu", .ops = _ops,  },
> + { .name = "stdux", .ops = _ops,  },
> + { .name = "lhbrx", .ops = _ops,  },
> + { .name = "sthbrx", .ops = _ops,  },
> + { .name = "lwbrx", .ops = _ops,  },
> + { .name = "stwbrx", .ops = _ops,  },
> + { .name = "ldbrx", .ops = _ops,  },
> + { .name = "stdbrx", .ops = _ops,  },
> + { .name = "lmw", .ops = _ops,  },
> + { .name = "stmw", .ops = _ops,  },
> + { .name = "lswi", .ops = _ops,  },
> + { .name = "lswx", .ops = _ops,  },
> + { .name = "stswi", .ops = _ops,  },
> + { .name = "stswx", .ops = _ops,  },
> +};
> +
> static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, 
> const char *name)
> {
> int i;
> @@ -52,6 +111,13 @@ static struct ins_ops 
> *powerpc__associate_instruction_ops(struct arch *arch, con
> static int powerpc__annotate_init(struct arch *arch, char *cpuid 
> __maybe_unused)
> {
> if (!arch->initialized) {
> + arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
> + arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins));
> + if (arch->instructions == NULL)
> + return -ENOMEM;
> +
> + memcpy(arch->instructions, (struct ins *)powerpc__instructions, 
> sizeof(struct ins) * arch->nr_instructions);
> + arch->nr_instructions_allocated = arch->nr_instructions;
> arch->initialized = true;
> arch->associate_instruction_ops = powerpc__associate_instruction_ops;
> arch->objdump.comment_char  = '#';
> -- 
> 2.43.0
> 



[PATCH 3/3] tools/perf/arch/powerc: Add get_arch_regnum for powerpc

2024-03-08 Thread Athira Rajeev
get_dwarf_regnum() returns the DWARF register number for a given
register name string. It calls the arch-specific function
get_arch_regnum() to look up the number for the corresponding arch.
Add the register-name-to-number mapping for powerpc in:
arch/powerpc/util/dwarf-regs.c

Signed-off-by: Athira Rajeev 
---
 tools/perf/arch/powerpc/util/dwarf-regs.c | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c 
b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 0c4f4caf53ac..d955e3e577ea 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -98,3 +98,32 @@ int regs_query_register_offset(const char *name)
return roff->ptregs_offset;
return -EINVAL;
 }
+
+struct dwarf_regs_idx {
+   const char *name;
+   int idx;
+};
+
+static const struct dwarf_regs_idx powerpc_regidx_table[] = {
+   { "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }, { "r4", 4 },
+   { "r5", 5 }, { "r6", 6 }, { "r7", 7 }, { "r8", 8 }, { "r9", 9 },
+   { "r10", 10 }, { "r11", 11 }, { "r12", 12 }, { "r13", 13 }, { "r14", 14 },
+   { "r15", 15 }, { "r16", 16 }, { "r17", 17 }, { "r18", 18 }, { "r19", 19 },
+   { "r20", 20 }, { "r21", 21 }, { "r22", 22 }, { "r23", 23 }, { "r24", 24 },
+   { "r25", 25 }, { "r26", 26 }, { "r27", 27 }, { "r28", 28 },
+   { "r29", 29 }, { "r30", 30 }, { "r31", 31 },
+};
+
+int get_arch_regnum(const char *name)
+{
+   unsigned int i;
+
+   if (*name != 'r')
+       return -EINVAL;
+
+   for (i = 0; i < ARRAY_SIZE(powerpc_regidx_table); i++)
+       if (!strcmp(powerpc_regidx_table[i].name, name))
+           return powerpc_regidx_table[i].idx;
+
+   return -ENOENT;
+}
-- 
2.43.0
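
For illustration only (not part of the patch): since the powerpc GPR names
are simply "r" followed by the register number, the same mapping could also
be computed instead of looked up in a table. The sketch below is a
hypothetical alternative; the function name gpr_name_to_regnum() is made up
for this example and does not exist in perf. It assumes, as the table in the
patch does, that the register number for rN is N.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical alternative to a lookup table: parse the decimal suffix of
 * a GPR name ("r0" .. "r31") and return it as the register number.
 * Returns -EINVAL for anything that is not a well-formed GPR name.
 */
static int gpr_name_to_regnum(const char *name)
{
	char *end;
	long num;

	assert(name != NULL);

	/* Must start with 'r' followed by at least one digit. */
	if (name[0] != 'r' || name[1] < '0' || name[1] > '9')
		return -EINVAL;

	/* Reject names with a leading zero such as "r07". */
	if (name[1] == '0' && name[2] != '\0')
		return -EINVAL;

	/* The whole suffix must be a number in the range 0..31. */
	num = strtol(name + 1, &end, 10);
	if (*end != '\0' || num > 31)
		return -EINVAL;

	return (int)num;
}
```

A table keeps the door open for non-GPR names later, while the parsing
variant cannot contain a duplicated or mistyped entry.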



[PATCH 2/3] tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc

2024-03-08 Thread Athira Rajeev
To identify whether an instruction has a memory reference, the
"memory_ref_char" field needs to be set for the specific architecture.
Example memory instruction:
lwz r10,0(r9)

Here "(" is the memory_ref_char. Set this as part of arch->objdump.

To get the register number and access offset from a given instruction,
arch->objdump.register_char is used. On powerpc, the register char is
"r" (since register names are r0 to r31). Hence set register_char
to "r".

Signed-off-by: Athira Rajeev 
---
 tools/perf/util/annotate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ac002d907d81..d69bd6edafcb 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -216,6 +216,11 @@ static struct arch architectures[] = {
{
.name = "powerpc",
.init = powerpc__annotate_init,
+   .objdump =  {
+   .comment_char = '#',
+   .register_char = 'r',
+   .memory_ref_char = '(',
+   },
},
{
.name = "riscv64",
-- 
2.43.0
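
For illustration only (not part of the patch): a minimal sketch of how the
two characters can be used to pick apart the memory operand of the example
instruction, "0(r9)". The struct and function names below are hypothetical
and are not the perf implementation; the sketch only handles the
"offset(rN)" form with a decimal offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative type; this is not a perf structure. */
struct mem_operand {
	long offset;	/* displacement in front of the memory_ref_char */
	int  reg;	/* base register number after the register_char */
};

/*
 * Parse a memory operand such as "0(r9)".
 * For powerpc, mem_ref_char would be '(' and reg_char would be 'r'.
 * Returns true and fills *out on success; *out is untouched on failure.
 */
static bool parse_mem_operand(const char *op, char mem_ref_char, char reg_char,
			      struct mem_operand *out)
{
	const char *paren;
	char *end;
	long offset, reg;

	assert(op != NULL && out != NULL);

	/* No memory_ref_char: the operand is not a memory reference. */
	paren = strchr(op, mem_ref_char);
	if (paren == NULL)
		return false;

	/* Everything in front of the '(' is the signed displacement. */
	offset = strtol(op, &end, 10);
	if (end != paren)
		return false;

	/* The register name must follow immediately, e.g. "r9)". */
	if (paren[1] != reg_char)
		return false;

	reg = strtol(paren + 2, &end, 10);
	if (end == paren + 2 || *end != ')' || reg < 0)
		return false;

	out->offset = offset;
	out->reg = (int)reg;
	return true;
}
```

For "lwz r10,0(r9)" the second operand "0(r9)" would yield offset 0 and
register 9, while a plain register operand such as "r10" is rejected
because it contains no "(".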



[PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling

2024-03-08 Thread Athira Rajeev
Add a powerpc instruction mnemonic table to associate load/store
instructions with mov_ops. mov_ops is used to identify mem_type,
which associates an instruction with a data type and offset. Also
allocate and initialize the arch specific fields nr_instructions,
instructions and nr_instructions_allocated.

Signed-off-by: Athira Rajeev 
---
 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 1 file changed, 66 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c 
b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..07af4442be38 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -1,6 +1,65 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 
+/*
+ * powerpc instruction mnemonic table to associate load/store instructions with
+ * mov_ops. mov_ops is used to identify mem_type to associate instruction with
+ * data type and offset.
+ */
+static struct ins powerpc__instructions[] = {
+   { .name = "lbz",.ops = _ops,  },
+   { .name = "lbzx",   .ops = _ops,  },
+   { .name = "lbzu",   .ops = _ops,  },
+   { .name = "lbzux",  .ops = _ops,  },
+   { .name = "lhz",.ops = _ops,  },
+   { .name = "lhzx",   .ops = _ops,  },
+   { .name = "lhzu",   .ops = _ops,  },
+   { .name = "lhzux",  .ops = _ops,  },
+   { .name = "lha",.ops = _ops,  },
+   { .name = "lhax",   .ops = _ops,  },
+   { .name = "lhau",   .ops = _ops,  },
+   { .name = "lhaux",  .ops = _ops,  },
+   { .name = "lwz",.ops = _ops,  },
+   { .name = "lwzx",   .ops = _ops,  },
+   { .name = "lwzu",   .ops = _ops,  },
+   { .name = "lwzux",  .ops = _ops,  },
+   { .name = "lwa",.ops = _ops,  },
+   { .name = "lwax",   .ops = _ops,  },
+   { .name = "lwaux",  .ops = _ops,  },
+   { .name = "ld", .ops = _ops,  },
+   { .name = "ldx",.ops = _ops,  },
+   { .name = "ldu",.ops = _ops,  },
+   { .name = "ldux",   .ops = _ops,  },
+   { .name = "stb",.ops = _ops,  },
+   { .name = "stbx",   .ops = _ops,  },
+   { .name = "stbu",   .ops = _ops,  },
+   { .name = "stbux",  .ops = _ops,  },
+   { .name = "sth",.ops = _ops,  },
+   { .name = "sthx",   .ops = _ops,  },
+   { .name = "sthu",   .ops = _ops,  },
+   { .name = "sthux",  .ops = _ops,  },
+   { .name = "stw",.ops = _ops,  },
+   { .name = "stwx",   .ops = _ops,  },
+   { .name = "stwu",   .ops = _ops,  },
+   { .name = "stwux",  .ops = _ops,  },
+   { .name = "std",.ops = _ops,  },
+   { .name = "stdx",   .ops = _ops,  },
+   { .name = "stdu",   .ops = _ops,  },
+   { .name = "stdux",  .ops = _ops,  },
+   { .name = "lhbrx",  .ops = _ops,  },
+   { .name = "sthbrx", .ops = _ops,  },
+   { .name = "lwbrx",  .ops = _ops,  },
+   { .name = "stwbrx", .ops = _ops,  },
+   { .name = "ldbrx",  .ops = _ops,  },
+   { .name = "stdbrx", .ops = _ops,  },
+   { .name = "lmw",.ops = _ops,  },
+   { .name = "stmw",   .ops = _ops,  },
+   { .name = "lswi",   .ops = _ops,  },
+   { .name = "lswx",   .ops = _ops,  },
+   { .name = "stswi",  .ops = _ops,  },
+   { .name = "stswx",  .ops = _ops,  },
+};
+
 static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, 
const char *name)
 {
int i;
@@ -52,6 +111,13 @@ static struct ins_ops 
*powerpc__associate_instruction_ops(struct arch *arch, con
 static int powerpc__annotate_init(struct arch *arch, char *cpuid 
__maybe_unused)
 {
if (!arch->initialized) {
+   arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
+   arch->instructions = calloc(arch->nr_instructions, 
sizeof(struct ins));
+   if (arch->instructions == NULL)
+   return -ENOMEM;
+
+   memcpy(arch->instructions, (struct ins *)powerpc__instructions, 
sizeof(struct ins) * arch->nr_instructions);
+   arch->nr_instructions_allocated = arch->nr_instructions;
arch->initialized = true;
arch->associate_instruction_ops = 
powerpc__associate_instruction_ops;
arch->objdump.comment_char  = '#';
-- 
2.43.0
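
For illustration only (not part of the patch): a minimal sketch of how a
mnemonic table like the one above can be used to decide whether a
disassembled instruction is a load/store. The struct, table and function
names are hypothetical stand-ins, only a handful of the mnemonics are
listed, and the sort-once-then-bsearch lookup is just one possible way to do
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct ins"; only the name is kept. */
struct insn_entry {
	const char *name;
};

/* A small subset of the load/store mnemonics from the patch. */
static struct insn_entry ldst_table[] = {
	{ "lbz" }, { "lwz" }, { "ld" }, { "ldx" },
	{ "stb" }, { "stw" }, { "std" }, { "stdu" },
};

#define LDST_COUNT (sizeof(ldst_table) / sizeof(ldst_table[0]))

/* Order entries by mnemonic so that bsearch() can be used. */
static int insn_cmp(const void *a, const void *b)
{
	const struct insn_entry *ia = a, *ib = b;

	return strcmp(ia->name, ib->name);
}

/* Return true if the mnemonic is one of the listed load/store names. */
static bool is_load_store(const char *mnemonic)
{
	static bool sorted;
	struct insn_entry key = { mnemonic };

	assert(mnemonic != NULL);

	/* Sort the table once, on first use. */
	if (!sorted) {
		qsort(ldst_table, LDST_COUNT, sizeof(ldst_table[0]), insn_cmp);
		sorted = true;
	}

	return bsearch(&key, ldst_table, LDST_COUNT,
		       sizeof(ldst_table[0]), insn_cmp) != NULL;
}
```

The lookup is an exact match on the mnemonic, so "add" or a truncated name
such as "l" is not treated as a memory access.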



[PATCH 0/3] Add data type profiling support for powerpc

2024-03-08 Thread Athira Rajeev
From: Akanksha J N 

The patchset from Namhyung added support for data type profiling
in the perf tool. It associates PMU samples with the data types they
refer to, using DWARF debug information. With upstream perf, it is
currently possible to run perf report or perf annotate to view the
data type information on x86.

This patchset includes the changes needed to enable data type
profiling support for powerpc. The main changes are:
1. A powerpc instruction mnemonic table to associate load/store
instructions with mov_ops, which is used to identify whether an
instruction is a memory access.
2. To get the register number and access offset from a given
instruction, the code uses fields from "struct arch" -> objdump.
Add an entry for powerpc here.
3. A get_arch_regnum() to return the register number from the
register name string.

These three patches are the basic foundational patches.
With these changes, data types are identified for kernel
and user-space samples. Some samples still show up as
"unknown" and need to be checked. We plan to address those
in a follow-up. With the current patchset:

# ./perf record -a -e mem-loads sleep 1
# ./perf report -s type,typeoff --hierarchy --group --stdio
Snippet of logs:
#
# Total Lost Samples: 0
#
# Samples: 277  of events 'mem-loads, dummy:u'
# Event count (approx.): 149813
#
#Overhead  Data Type / Data Type Offset
# ...  
#
65.93%   0.00% (unknown)
   65.93%   0.00% (unknown) +0 (no field)
 8.19%   0.00% struct vm_area_struct
8.19%   0.00% struct vm_area_struct +136 (vm_file)
 4.53%   0.00% struct rq
3.14%   0.00% struct rq +0 (__lock.raw_lock.val)
0.83%   0.00% struct rq +3216 (avg_irq.runnable_sum)
0.24%   0.00% struct rq +4 (nr_running)
0.14%   0.00% struct rq +12 (nr_preferred_running)
0.12%   0.00% struct rq +2760 (sd)
0.06%   0.00% struct rq +3368 (prev_steal_time_rq)
0.01%   0.00% struct rq +2592 (curr)
 3.53%   0.00% struct rb_node
3.53%   0.00% struct rb_node +0 (__rb_parent_color)
 3.43%   0.00% struct slab
3.43%   0.00% struct slab +32 (freelist)
 3.30%   0.00% unsigned int
3.30%   0.00% unsigned int +0 (no field)
 3.22%   0.00% struct vm_fault
3.22%   0.00% struct vm_fault +48 (pmd)
 2.55%   0.00% unsigned char
2.55%   0.00% unsigned char +0 (no field)
 1.06%   0.00% struct task_struct
1.06%   0.00% struct task_struct +4 (thread_info.cpu)
 0.92%   0.00% void*
0.92%   0.00% void* +0 (no field)
 0.74%   0.00% __int128 unsigned
0.74%   0.00% __int128 unsigned +8 (no field)
 0.59%   0.00% struct perf_event
0.54%   0.00% struct perf_event +552 (ctx)
0.04%   0.00% struct perf_event +152 (pmu)
 0.20%   0.00% struct sched_entity
0.20%   0.00% struct sched_entity +0 (load.weight)
 0.18%   0.00% struct cfs_rq
0.18%   0.00% struct cfs_rq +96 (curr)

Thanks
Athira

Athira Rajeev (3):
  tools/perf/arch/powerpc: Add load/store in powerpc annotate
instructions for data type profling
  tools/erf/util/annotate: Set register_char and memory_ref_char for
powerpc
  tools/perf/arch/powerc: Add get_arch_regnum for powerpc

 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c | 29 
 tools/perf/util/annotate.c|  5 ++
 3 files changed, 100 insertions(+)

-- 
2.43.0



Re: [PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-12-05 Thread Athira Rajeev



> On 06-Dec-2023, at 3:20 AM, Arnaldo Carvalho de Melo  wrote:
> 
> Em Mon, Nov 27, 2023 at 11:12:57AM +, James Clark escreveu:
>> On 23/11/2023 16:02, Athira Rajeev wrote:
>>> --- a/tools/perf/Makefile.perf
>>> @@ -1134,6 +1152,7 @@ bpf-skel-clean:
>>> $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>>> 
>>> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean 
>>> $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean 
>>> bpf-skel-clean tests-coresight-targets-clean
>>> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
>>> $(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive 
>>> $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>> $(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete 
>>> -o -name '\.*.d' -delete
>>> $(Q)$(RM) $(OUTPUT).config-detected
> 
> While merging perf-tools-next with torvalds/master I noticed that maybe
> we better have the above added line as:
> 
> +   $(call QUIET_CLEAN, tests) $(Q)$(MAKE) -f 
> $(srctree)/tools/perf/tests/Makefile.tests clean
> 
> No?
> 
> Anyway I'm merging as-is, but it just hit my eye while merging,
> 
> - Arnaldo
Hi Arnaldo,

As Ian pointed out, we removed Makefile.tests as part of:
https://lore.kernel.org/lkml/20231129213428.2227448-1-irog...@google.com/

Thanks
Athira

Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-12-04 Thread Athira Rajeev



> On 05-Dec-2023, at 2:43 AM, Ian Rogers  wrote:
> 
> On Mon, Dec 4, 2023 at 12:22 PM Arnaldo Carvalho de Melo
>  wrote:
>> 
>> Em Mon, Dec 04, 2023 at 05:20:46PM -0300, Arnaldo Carvalho de Melo escreveu:
>>> Em Mon, Dec 04, 2023 at 12:12:54PM -0800, Ian Rogers escreveu:
>>>> On Thu, Nov 23, 2023 at 8:01 AM Athira Rajeev
>>>>  wrote:
>>>>> 
>>>>> Running "perf list" on powerpc fails with segfault
>>>>> as below:
>>>>> 
>>>>>   ./perf list
>>>>>   Segmentation fault (core dumped)
>>>>> 
>>>>> This happens because of duplicate events in the json list.
>>>>> The powerpc Json event list contains some event with same
>>>>> event name, but different event code. They are:
>>>>> - PM_INST_FROM_L3MISS (Present in datasource and frontend)
>>>>> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
>>>>> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
>>>>> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
>>>>> 
>>>>> pmu_events_table__num_events uses the value from
>>>>> table_pmu->num_entries which includes duplicate events as
>>>>> well. This causes issue during "perf list" and results in
>>>>> segmentation fault.
>>>>> 
>>>>> Since both event codes are valid, append _DSRC to the Data
>>>>> Source events (datasource.json), so that they would have a
>>>>> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
>>>>> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
>>>>> works as expected.
>>>>> 
>>>>> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
>>>>> Signed-off-by: Athira Rajeev 
>>>> 
>>>> Given duplicate events creates broken pmu-events.c we should capture
>>>> that as an exception in jevents.py. That way a JEVENTS_ARCH=all build
>>>> will fail if any vendor/architecture would break in this way. We
>>>> should also add JEVENTS_ARCH=all to tools/perf/tests/make. Athira, do
>>>> you want to look at doing this?
>>> 
>>> Should I go ahead and remove this patch till this is sorted out?
>> 
>> I'll keep it, its already in tmp.perf-tools-next, we can go from there
>> and improve this with follow up patches,

Thanks Arnaldo for pulling the fix patch.

> 
> Agreed. I could look to do the follow up but likely won't have a
> chance for a while. If others could help out it would be great. I'd
> like to have the jevents and json be robust enough that we don't trip
> over problems like this and the somewhat similar AmpereOne issue.

Yes Ian.

I will look at adding this in follow-up patches and including it as part of
tools/perf/tests/make.

Thanks
Athira
> 
> Thanks,
> Ian
> 
>> - Arnaldo




Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-12-04 Thread Athira Rajeev



> On 05-Dec-2023, at 1:42 AM, Ian Rogers  wrote:
> 
> On Thu, Nov 23, 2023 at 8:01 AM Athira Rajeev
>  wrote:
>> 
>> Running "perf list" on powerpc fails with segfault
>> as below:
>> 
>>   ./perf list
>>   Segmentation fault (core dumped)
>> 
>> This happens because of duplicate events in the json list.
>> The powerpc Json event list contains some event with same
>> event name, but different event code. They are:
>> - PM_INST_FROM_L3MISS (Present in datasource and frontend)
>> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
>> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
>> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
>> 
>> pmu_events_table__num_events uses the value from
>> table_pmu->num_entries which includes duplicate events as
>> well. This causes issue during "perf list" and results in
>> segmentation fault.
>> 
>> Since both event codes are valid, append _DSRC to the Data
>> Source events (datasource.json), so that they would have a
>> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
>> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
>> works as expected.
>> 
>> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
>> Signed-off-by: Athira Rajeev 
> 
> Given duplicate events creates broken pmu-events.c we should capture
> that as an exception in jevents.py. That way a JEVENTS_ARCH=all build
> will fail if any vendor/architecture would break in this way. We
> should also add JEVENTS_ARCH=all to tools/perf/tests/make. Athira, do
> you want to look at doing this?
> 
> Thanks,
> Ian

Hi Ian,

That's a great suggestion. It will definitely help catch such issues early.
I am interested and will work on adding this as part of tools/perf/tests/make.

Thanks
Athira
> 
>> ---
>> .../arch/powerpc/power10/datasource.json   | 18 ++
>> 1 file changed, 14 insertions(+), 4 deletions(-)
>> 
>> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json 
>> b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
>> index 6b0356f2d301..0eeaaf1a95b8 100644
>> --- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
>> +++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
>> @@ -99,6 +99,11 @@
>> "EventName": "PM_INST_FROM_L2MISS",
>> "BriefDescription": "The processor's instruction cache was reloaded from 
>> a source beyond the local core's L2 due to a demand miss."
>>   },
>> +  {
>> +"EventCode": "0x0003C000C040",
>> +"EventName": "PM_DATA_FROM_L2MISS_DSRC",
>> +"BriefDescription": "The processor's L1 data cache was reloaded from a 
>> source beyond the local core's L2 due to a demand miss."
>> +  },
>>   {
>> "EventCode": "0x00038010C040",
>> "EventName": "PM_INST_FROM_L2MISS_ALL",
>> @@ -161,9 +166,14 @@
>>   },
>>   {
>> "EventCode": "0x00078000C040",
>> -"EventName": "PM_INST_FROM_L3MISS",
>> +"EventName": "PM_INST_FROM_L3MISS_DSRC",
>> "BriefDescription": "The processor's instruction cache was reloaded from 
>> beyond the local core's L3 due to a demand miss."
>>   },
>> +  {
>> +"EventCode": "0x0007C000C040",
>> +"EventName": "PM_DATA_FROM_L3MISS_DSRC",
>> +"BriefDescription": "The processor's L1 data cache was reloaded from 
>> beyond the local core's L3 due to a demand miss."
>> +  },
>>   {
>> "EventCode": "0x00078010C040",
>> "EventName": "PM_INST_FROM_L3MISS_ALL",
>> @@ -981,7 +991,7 @@
>>   },
>>   {
>> "EventCode": "0x0003C000C142",
>> -"EventName": "PM_MRK_DATA_FROM_L2MISS",
>> +"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC",
>> "BriefDescription": "The processor's L1 data cache was reloaded from a 
>> source beyond the local core's L2 due to a demand miss for a marked 
>> instruction."
>>   },
>>   {
>> @@ -1046,12 +1056,12 @@
>>   },
>>   {
>> "EventCode": "0x00078000C142",
>> -"EventName": "PM_MRK_INST_FROM_L3MISS",
>> +"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC",
>> "BriefDescription": "The processor's instruction cache was reloaded from 
>> beyond the local core's L3 due to a demand miss for a marked instruction."
>>   },
>>   {
>> "EventCode": "0x0007C000C142",
>> -"EventName": "PM_MRK_DATA_FROM_L3MISS",
>> +"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC",
>> "BriefDescription": "The processor's L1 data cache was reloaded from 
>> beyond the local core's L3 due to a demand miss for a marked instruction."
>>   },
>>   {
>> --
>> 2.39.3
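
For illustration only: the duplicate-name check discussed in this thread was
suggested for jevents.py, which is Python. The sketch below shows the same
idea in C, under the assumption that all event names of one PMU table are
available as an array of strings. The function name find_duplicate_event()
is hypothetical and does not exist in perf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements, each of which is a pointer to a name. */
static int name_cmp(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/*
 * Sort the names in place and return the first name that occurs more
 * than once, or NULL if every name is unique.
 */
static const char *find_duplicate_event(const char **names, size_t n)
{
	size_t i;

	assert(names != NULL || n == 0);

	if (n < 2)
		return NULL;

	/* After sorting, equal names are adjacent. */
	qsort(names, n, sizeof(names[0]), name_cmp);
	for (i = 1; i < n; i++)
		if (strcmp(names[i - 1], names[i]) == 0)
			return names[i];

	return NULL;
}
```

A build-time check along these lines would reject a table that contains
PM_INST_FROM_L3MISS twice, and would accept it once one of the two entries
is renamed to PM_INST_FROM_L3MISS_DSRC.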




Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-12-03 Thread Athira Rajeev



> On 29-Nov-2023, at 10:51 AM, Athira Rajeev  
> wrote:
> 
> 
> 
>> On 27-Nov-2023, at 5:32 PM, Disha Goel  wrote:
>> 
>> On 23/11/23 9:31 pm, Athira Rajeev wrote:
>> 
>>> Running "perf list" on powerpc fails with segfault
>>> as below:
>>> 
>>>   ./perf list
>>>   Segmentation fault (core dumped)
>>> 
>>> This happens because of duplicate events in the json list.
>>> The powerpc Json event list contains some event with same
>>> event name, but different event code. They are:
>>> - PM_INST_FROM_L3MISS (Present in datasource and frontend)
>>> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
>>> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
>>> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
>>> 
>>> pmu_events_table__num_events uses the value from
>>> table_pmu->num_entries which includes duplicate events as
>>> well. This causes issue during "perf list" and results in
>>> segmentation fault.
>>> 
>>> Since both event codes are valid, append _DSRC to the Data
>>> Source events (datasource.json), so that they would have a
>>> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
>>> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
>>> works as expected.
>>> 
>>> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
>>> Signed-off-by: Athira Rajeev 
>> 
>> I have tested the patch on Power10 machine. Perf list works correctly 
>> without any segfault now.
>> 
>> # ./perf list
>> 
>> List of pre-defined events (to be used in -e or -M):
>> 
>>  branch-instructions OR branches[Hardware event]
>>  branch-misses  [Hardware event]
>> 
>> Tested-by: Disha Goel 
>> 
> 
> Thanks Disha for testing
> 
> Athira
Hi Arnaldo,

Can we get this pulled in if the patch looks good ?

Thanks
Athira





Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-11-28 Thread Athira Rajeev



> On 27-Nov-2023, at 5:32 PM, Disha Goel  wrote:
> 
> On 23/11/23 9:31 pm, Athira Rajeev wrote:
> 
>> Running "perf list" on powerpc fails with segfault
>> as below:
>> 
>>./perf list
>>Segmentation fault (core dumped)
>> 
>> This happens because of duplicate events in the json list.
>> The powerpc Json event list contains some event with same
>> event name, but different event code. They are:
>> - PM_INST_FROM_L3MISS (Present in datasource and frontend)
>> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
>> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
>> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
>> 
>> pmu_events_table__num_events uses the value from
>> table_pmu->num_entries which includes duplicate events as
>> well. This causes issue during "perf list" and results in
>> segmentation fault.
>> 
>> Since both event codes are valid, append _DSRC to the Data
>> Source events (datasource.json), so that they would have a
>> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
>> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
>> works as expected.
>> 
>> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
>> Signed-off-by: Athira Rajeev 
> 
> I have tested the patch on Power10 machine. Perf list works correctly without 
> any segfault now.
> 
> # ./perf list
> 
> List of pre-defined events (to be used in -e or -M):
> 
>   branch-instructions OR branches[Hardware event]
>   branch-misses  [Hardware event]
> 
> Tested-by: Disha Goel 
> 

Thanks Disha for testing

Athira




Re: [PATCH] perf test record+probe_libc_inet_pton: Fix call chain match on powerpc

2023-11-28 Thread Athira Rajeev



> On 26-Nov-2023, at 12:39 PM, Likhitha Korrapati  
> wrote:
> 
> The perf test "probe libc's inet_pton & backtrace it with ping" fails on
> powerpc as below:
> 
> root@xxx perf]# perf test -v "probe libc's inet_pton & backtrace it with
> ping"
> 85: probe libc's inet_pton & backtrace it with ping :
> --- start ---
> test child forked, pid 96028
> ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60)
> 7fffa1779a60 __GI___inet_pton+0x0
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> 7fffa172a73c getaddrinfo+0x121c
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> FAIL: expected backtrace entry
> "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
> got "7fffa172a73c getaddrinfo+0x121c
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
> test child finished with -1
> ---- end ----
> probe libc's inet_pton & backtrace it with ping: FAILED!

Reviewed-by: Athira Rajeev 

Thanks
Athira
> 
> This test installs a probe on libc's inet_pton function, which will use
> uprobes and then uses perf trace on a ping to localhost. It gets 3
> levels deep backtrace and checks whether it is what we expected or not.
> 
> The test started failing with RHEL 9.4, whereas it works with the previous
> distro version (RHEL 9.2). The test expects the gaih_inet function to be
> part of the backtrace. But in the glibc version (2.34-86) shipped with the
> distro where it fails, this function is missing and hence the test fails.
> 
> From nm and ping command output we can confirm that gaih_inet function
> is not present in the expected backtrace for glibc version glibc-2.34-86
> 
> [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
> 001273e0 t gaih_inet_serv
> 001cd8d8 r gaih_inet_typeproto
> 
> [root@xxx perf]# perf script -i /tmp/perf.data.6E8
> ping  104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60)
>7fff83779a60 __GI___inet_pton+0x0
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
>7fff8372a73c getaddrinfo+0x121c
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
>   11dc73534 [unknown] (/usr/bin/ping)
>7fff8362a8c4 __libc_start_call_main+0x84
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> 
> FAIL: expected backtrace entry
> "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
> got "7fff9d52a73c getaddrinfo+0x121c
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
> 
> With version glibc-2.34-60 gaih_inet function is present as part of the
> expected backtrace. So we cannot just remove the gaih_inet function from
> the backtrace.
> 
> [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
> 00130490 t gaih_inet.constprop.0
> 0012e830 t gaih_inet_serv
> 001d45e4 r gaih_inet_typeproto
> 
> [root@xxx perf]# ./perf script -i /tmp/perf.data.b6S
> ping   67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd80820)
>7fffbdd80820 __GI___inet_pton+0x0
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
>7fffbdd31160 gaih_inet.constprop.0+0xcd0
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
>7fffbdd31c7c getaddrinfo+0x14c
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
>   1140d3558 [unknown] (/usr/bin/ping)
> 
> This patch solves this issue by doing a conditional skip. If there is a
> gaih_inet function present in the libc then it will be added to the
> expected backtrace else the function will be skipped from being added
> to the expected backtrace.
> 
> Output with the patch
> 
> [root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it
> with ping"
> 83: probe libc's inet_pton & backtrace it with ping :
> --- start ---
> test child forked, pid 102662
> ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60)
> 7fff93379a60 __GI___inet_pton+0x0
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> 7fff9332a73c getaddrinfo+0x121c
> (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
> 11ef03534 [unknown] (/usr/bin/ping)
> test child finished with 0
>  end 
> probe libc's inet_pton & backtrace it with ping: Ok
> 
> Signed-off-by: Likhitha Korrapati 
> Reported-by: Disha Goel 
> ---
> tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 5 -
> 1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh 
> b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
> index eebeea6bdc76..72c65570db37 100755
> --- a/tools/perf/tests/shell/record+probe_libc_inet_pton

Re: [PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-11-28 Thread Athira Rajeev



> On 27-Nov-2023, at 8:21 PM, Arnaldo Carvalho de Melo  wrote:
> 
> On Mon, Nov 27, 2023 at 11:12:57AM +, James Clark wrote:
>> On 23/11/2023 16:02, Athira Rajeev wrote:
>>> Add rule in new Makefile "tests/Makefile.tests" for running
>>> shellcheck on shell test scripts. This automates below shellcheck
>>> into the build.
> 
>> Seems to work really well. I also tested it on Ubuntu, and checked
>> NO_SHELLCHECK, cleaning and with and without shellcheck installed etc.
> 
>> Reviewed-by: James Clark 
> 
> Tested on Fedora 38, works as advertised, applied.
> 
> - Arnaldo

Hi James, Arnaldo

Thanks for testing the patch and comments.

Athira Rajeev

[PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-11-23 Thread Athira Rajeev
Add rule in new Makefile "tests/Makefile.tests" for running
shellcheck on shell test scripts. This automates below shellcheck
into the build.

$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done

Condition for shellcheck is added in Makefile.perf to avoid build
breakage in the absence of shellcheck binary. Update Makefile.perf
to contain new rule for "SHELLCHECK_TEST" which is for making
shellcheck test as a dependency on perf binary.

Added "tests/Makefile.tests" to run shellcheck on the shell scripts in
tests/shell. The make rule "SHLLCHECK_RUN" ensures that, on
subsequent make invocations, shellcheck is run only on files that
were modified. This way, if a newly added shell script or a fix to
an existing script breaks the coding/formatting style, it is caught
during the perf build.

Example build failure by modifying probe_vfs_getname.sh in tests/shell:

In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
  ^---^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

Here, like other files which get created during
compilation (e.g. .builtin-bench.o.cmd or .perf.o.cmd),
.shellcheck_log is also created as a hidden file.
Example: tests/shell/.probe_vfs_getname.sh.shellcheck_log
Each log file depends on its script, so shellcheck is
re-run only for scripts that have been modified.
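
The re-run condition can be illustrated outside of make. The following
is a minimal shell sketch, not the rule from the patch: needs_shellcheck
is a hypothetical helper and the paths passed to it are placeholders. It
reports that a script needs checking when its log file is missing or
older than the script.

```shell
# Hypothetical helper: succeed (return 0) when the script must be
# (re)checked, i.e. the log is missing or older than the script.
needs_shellcheck() {
    script=$1
    log=$2
    # No log yet: the script was never checked.
    [ ! -e "$log" ] && return 0
    # find -newer prints the script only if it was modified after the log.
    [ -n "$(find "$script" -newer "$log")" ]
}
```

A caller could use it as "needs_shellcheck tests/shell/record.sh
tests/shell/.record.sh.shellcheck_log && echo recheck"; make performs
the equivalent timestamp comparison itself when the log file is the
rule target and the script is its prerequisite.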

After this, for testing, changed "tests/shell/trace+probe_vfs_getname.sh" to
break shellcheck format. In the next make run, it is
also captured:

In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
  ^---^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[3]: *** Waiting for unfinished jobs

In tests/shell/trace+probe_vfs_getname.sh line 14:
. $(dirname $0)/lib/probe.sh
  ^---^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.trace+probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

The failure log is printed in the make output itself.

This is reported at build time. To be able to go ahead with
the build or disable shellcheck even though it is known that
some test is broken, add a "NO_SHELLCHECK" option. Example:

  make NO_SHELLCHECK=1

  INSTALL libsubcmd_headers
  INSTALL libsymbol_headers
  INSTALL libapi_headers
  INSTALL libperf_headers
  INSTALL libbpf_headers
  LINK    perf

Note:
This is tested on RHEL and also SLES. The check below is used
to look for the presence of the shellcheck binary:
"$(shell which shellcheck 2> /dev/null)"
The "shell command -v" approach is not used here. On some
distros (RHEL), command is available as an executable file
(/usr/bin/command), but on others (SLES) it is only a shell
builtin and not available as an executable file.
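
For comparison, this is how the same presence check looks when it runs
inside a shell, where "command" is available as a builtin (at least in
bash and dash). It is only a sketch: have_tool is a hypothetical helper
name, and it does not reproduce how make executes $(shell ...).

```shell
# Hypothetical helper: succeed if a tool can be found on PATH.
# Inside a shell, "command -v" is handled by the shell itself, so this
# does not depend on /usr/bin/command existing as a separate file.
have_tool() {
    command -v "$1" > /dev/null 2>&1
}

if have_tool shellcheck; then
    echo "shellcheck found"
else
    echo "shellcheck not found, skipping"
fi
```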

Signed-off-by: Athira Rajeev 
---
changelog:
v3 -> v4:
Addressed review comments from James Clark.
- Made the shellcheck errors to be reported in make output
  itself during make like any other build error.
- Removed creating .dep files. Instead use the log file
to determine whether shellcheck has to be re-run when
there is a change in source file.
- Change log file to have suffix as shellcheck_log so
as to differentiate it from test execution log.
- Also like other files which gets created during
compilation, example, .builtin-bench.o.cmd or .perf.o.cmd,
create .shellcheck_log as hidden file.
Example: tests/shell/.buildid.sh.shellcheck_log
- Initial version used "command -v shellcheck" to check
presence of shellcheck. But while testing SLES, hit an
issue with using "command". In RHEL, /usr/bin/command
is available as pa

[PATCH] perf vendor events: Update datasource event name to fix duplicate events

2023-11-23 Thread Athira Rajeev
Running "perf list" on powerpc fails with segfault
as below:

   ./perf list
   Segmentation fault (core dumped)

This happens because of duplicate events in the JSON list.
The powerpc JSON event list contains some events with the same
event name, but different event codes. They are:
- PM_INST_FROM_L3MISS (Present in datasource and frontend)
- PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
- PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
- PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)

pmu_events_table__num_events uses the value from
table_pmu->num_entries, which includes the duplicate events as
well. This causes an issue during "perf list" and results in a
segmentation fault.

Since both event codes are valid, append _DSRC to the Data
Source events (datasource.json), so that they would have a
unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
works as expected.
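
A quick way to spot such name clashes is to list every EventName that
occurs more than once across the JSON files. The sketch below is an
illustration only: dup_event_names is a hypothetical helper, and it
assumes one "EventName" entry per line, as in the hunks of this patch.

```shell
# Hypothetical helper: print each EventName that appears more than once
# in the *.json files of the given directory.
dup_event_names() {
    cat "$1"/*.json |
        sed -n 's/.*"EventName": "\([^"]*\)".*/\1/p' |
        sort | uniq -d
}
```

For example, "dup_event_names tools/perf/pmu-events/arch/powerpc/power10"
would print one line per duplicated name. A name repeated inside a
single file is reported as well.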

Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
Signed-off-by: Athira Rajeev 
---
 .../arch/powerpc/power10/datasource.json   | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json 
b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
index 6b0356f2d301..0eeaaf1a95b8 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
@@ -99,6 +99,11 @@
 "EventName": "PM_INST_FROM_L2MISS",
 "BriefDescription": "The processor's instruction cache was reloaded from a 
source beyond the local core's L2 due to a demand miss."
   },
+  {
+"EventCode": "0x0003C000C040",
+"EventName": "PM_DATA_FROM_L2MISS_DSRC",
+"BriefDescription": "The processor's L1 data cache was reloaded from a 
source beyond the local core's L2 due to a demand miss."
+  },
   {
 "EventCode": "0x00038010C040",
 "EventName": "PM_INST_FROM_L2MISS_ALL",
@@ -161,9 +166,14 @@
   },
   {
 "EventCode": "0x00078000C040",
-"EventName": "PM_INST_FROM_L3MISS",
+"EventName": "PM_INST_FROM_L3MISS_DSRC",
 "BriefDescription": "The processor's instruction cache was reloaded from 
beyond the local core's L3 due to a demand miss."
   },
+  {
+"EventCode": "0x0007C000C040",
+"EventName": "PM_DATA_FROM_L3MISS_DSRC",
+"BriefDescription": "The processor's L1 data cache was reloaded from 
beyond the local core's L3 due to a demand miss."
+  },
   {
 "EventCode": "0x00078010C040",
 "EventName": "PM_INST_FROM_L3MISS_ALL",
@@ -981,7 +991,7 @@
   },
   {
 "EventCode": "0x0003C000C142",
-"EventName": "PM_MRK_DATA_FROM_L2MISS",
+"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC",
 "BriefDescription": "The processor's L1 data cache was reloaded from a 
source beyond the local core's L2 due to a demand miss for a marked 
instruction."
   },
   {
@@ -1046,12 +1056,12 @@
   },
   {
 "EventCode": "0x00078000C142",
-"EventName": "PM_MRK_INST_FROM_L3MISS",
+"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC",
 "BriefDescription": "The processor's instruction cache was reloaded from 
beyond the local core's L3 due to a demand miss for a marked instruction."
   },
   {
 "EventCode": "0x0007C000C142",
-"EventName": "PM_MRK_DATA_FROM_L3MISS",
+"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC",
 "BriefDescription": "The processor's L1 data cache was reloaded from 
beyond the local core's L3 due to a demand miss for a marked instruction."
   },
   {
-- 
2.39.3



Re: [PATCH V3] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-11-06 Thread Athira Rajeev



> On 23-Oct-2023, at 4:14 PM, James Clark  wrote:
> 
> 
> 
> On 13/10/2023 08:36, Athira Rajeev wrote:
>> Add rule in new Makefile "tests/Makefile.tests" for running
>> shellcheck on shell test scripts. This automates below shellcheck
>> into the build.
>> 
>> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
>> warning $F; done
>> 
>> Condition for shellcheck is added in Makefile.perf to avoid build
>> breakage in the absence of shellcheck binary. Update Makefile.perf
>> to contain new rule for "SHELLCHECK_TEST" which is for making
>> shellcheck test as a dependency on perf binary. Added
>> "tests/Makefile.tests" to run shellcheck on shellscripts in
>> tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every
>> time during make, shellcheck will be run only on modified files
>> during subsequent invocations. By this, if any newly added shell
>> scripts or fixes in existing scripts breaks coding/formatting
>> style, it will get captured during the perf build.
>> 
>> Example build failure with present scripts in tests/shell:
>> 
>> INSTALL libsubcmd_headers
>> INSTALL libperf_headers
>> INSTALL libapi_headers
>> INSTALL libsymbol_headers
>> INSTALL libbpf_headers
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/record_sideband.dep] Error 1
>> make[3]: *** Waiting for unfinished jobs
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/test_arm_coresight.dep] Error 1
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/lock_contention.dep] Error 1
>> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
>> make[1]: *** [Makefile.perf:238: sub-make] Error 2
>> make: *** [Makefile:70: all] Error 2
>> 
>> After this, for testing, changed "tests/shell/record.sh" to
>> break shellcheck format. In the next make run, it is
>> also captured:
>> 
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/record_sideband.dep] Error 1
>> make[3]: *** Waiting for unfinished jobs
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/record.dep] Error 1
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/test_arm_coresight.dep] Error 1
>> make[3]: *** 
>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>> output/tests/shell/lock_contention.dep] Error 1
>> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
>> make[1]: *** [Makefile.perf:238: sub-make] Error 2
>> make: *** [Makefile:70: all] Error 2
>> 
>> The exact failure log can be found in:
>> output/tests/shell/record.dep.log file
>> 
> 
> Hi Athira,
> 
> Having the reason for a hard failure put into a log file rather than the
> console output is very non standard. I'm not sure what the reason for
> this is.
> 
> The log filename isn't even listed in the output so how would anyone
> know what went wrong?
> 
> Can we just have it so that the failure is printed in the make output
> like any other build error.

Sure James, thanks for looking into this and sharing the review comments.
I will address this in V4.

> 
> [...]
> 
>> +output/%.dep: %.sh | $(DIRS)
>> + $(call rule_mkdir)
>> + $(eval input_file := $(subst output/,./,$(patsubst %.dep, %.sh, $@)))
>> + $(Q)$(call frecho-cmd,test)@shellcheck -S warning ${input_file} 1> $@.log 
>> && ([[ ! -s $@.log ]])
> 
> [[ ]] is a bash extension, but the build system seems to use /bin/sh so
> you get this error depending on your distro:
> 
>  tools/perf/tests/Makefile.tests:17: output/tests/shell
>  /record+probe_libc_inet_pton.dep] Error 127
>  /bin/sh: 1: [[: not found
> 
> Changing it to [ ] fixes it

Ok, will make the change in next version
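
For reference, a portable form of that emptiness check is sketched
below. log_is_clean is a hypothetical helper name, not something from
the patch; it uses only the POSIX [ ] test, so it behaves the same
under /bin/sh and bash.

```shell
# Hypothetical helper: succeed when the log file is missing or empty,
# i.e. shellcheck wrote no warnings into it.
log_is_clean() {
    [ ! -s "$1" ]
}
```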

> 
>> + $(Q)$(call frecho-cmd,test)@touch $@
> 
> Touching the source file in the build system doesn't feel right, surely
> this could be open to all kinds of parallel build race conditions or
> version control issues.
> 
> Isn't the output of the rule the .log file, so wouldn't just a normal make
> rule based on those two files work? Then if the .log file is older than the
> source file, the shellcheck is re-run, otherwise not? It feels like the
> .dep file would then also be unnecessary.

Ok, I will fix this.
> 
> The .dep lines in the make output are a bit confusing because they're
> not in the source tree so it's not clear to an outsider what that make
> output is for.
> 
> Other than that, it does seem to work ok for me.
Thanks for the review. I will post V4 with all the changes

Athira
> 
>> + $(Q)$(call frecho-cmd,test)@rm -rf $@.log
>> +$(DIRS):
>> + @mkdir -p $@
>> +
>> +clean:
>> + @rm -rf output




Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh

2023-11-06 Thread Athira Rajeev



> On 07-Nov-2023, at 3:14 AM, Arnaldo Carvalho de Melo  wrote:
> 
> On Thu, Oct 05, 2023 at 02:24:15PM +0530, Athira Rajeev wrote:
>>> On 05-Oct-2023, at 1:50 PM, James Clark  wrote:
>>> On 29/09/2023 05:11, Athira Rajeev wrote:
>>>> Running shellcheck on tests/shell/test_arm_coresight.sh
>>>> throws below warnings:
>>>> 
>>>> In tests/shell/test_arm_coresight.sh line 15:
>>>> cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
>>>> -print -quit)
>>>> ^--^ SC2061: Quote the parameter to -name so the shell 
>>>> won't interpret it.
>>>> 
>>>> In tests/shell/test_arm_coresight.sh line 20:
>>>> if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
>>>> ^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q 
>>>> ] is not well defined
>>>> 
>>>> This warning is observed after commit:
>>>> "commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")"
>>>> 
>>>> Fixed this issue by quoting 'cpu*' for SC2061 and
>>>> using "&&" in line number 20 for the SC2166 warning.
>>>> 
>>>> Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")
>>>> Signed-off-by: Athira Rajeev 
>>>> ---
>>>> tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/tools/perf/tests/shell/test_arm_coresight.sh 
>>>> b/tools/perf/tests/shell/test_arm_coresight.sh
>>>> index fe78c4626e45..f2115dfa24a5 100755
>>>> --- a/tools/perf/tests/shell/test_arm_coresight.sh
>>>> +++ b/tools/perf/tests/shell/test_arm_coresight.sh
>>>> @@ -12,12 +12,12 @@
>>>> glb_err=0
>>>> 
>>>> cs_etm_dev_name() {
>>>> - cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
>>>> -print -quit)
>>>> + cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' 
>>>> -print -quit)
>>>> trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch)
>>>> archhver=$((($trcdevarch >> 12) & 0xf))
>>>> archpart=$(($trcdevarch & 0xfff))
>>>> 
>>>> - if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
>>>> + if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; 
>>>> then
>>>> echo "ete"
>>>> else
>>>> echo "etm"
>>> 
>>> 
>>> Reviewed-by: James Clark 
> 
> Some are not applying, even after b4 picking up v2
> 
> Total patches: 3
> ---
> Cover: 
> ./v2_20231013_atrajeev_fix_for_shellcheck_issues_with_latest_scripts_in_tests_shell.cover
> Link: 
> https://lore.kernel.org/r/20231013073021.99794-1-atraj...@linux.vnet.ibm.com
> Base: not specified
>   git am 
> ./v2_20231013_atrajeev_fix_for_shellcheck_issues_with_latest_scripts_in_tests_shell.mbx
> ⬢[acme@toolbox perf-tools-next]$git am 
> ./v2_20231013_atrajeev_fix_for_shellcheck_issues_with_latest_scripts_in_tests_shell.mbx
> Applying: tools/perf/tests Ignore the shellcheck SC2046 warning in 
> lock_contention
> error: patch failed: tools/perf/tests/shell/lock_contention.sh:33
> error: tools/perf/tests/shell/lock_contention.sh: patch does not apply
> Patch failed at 0001 tools/perf/tests Ignore the shellcheck SC2046 warning in 
> lock_contention
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> ⬢[acme@toolbox perf-tools-next]$ git am --abort
> ⬢[acme@toolbox perf-tools-next]$

Hi Arnaldo

The patch has been picked up:
https://lore.kernel.org/all/169757198796.167943.10552920255799914362.b4...@kernel.org/
Thanks for looking into this.

Athira




Re: [PATCH V3] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-10-22 Thread Athira Rajeev



> On 13-Oct-2023, at 1:06 PM, Athira Rajeev  wrote:
> 
> Add rule in new Makefile "tests/Makefile.tests" for running
> shellcheck on shell test scripts. This automates below shellcheck
> into the build.
> 
> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
> warning $F; done
> 
> Condition for shellcheck is added in Makefile.perf to avoid build
> breakage in the absence of shellcheck binary. Update Makefile.perf
> to contain new rule for "SHELLCHECK_TEST" which is for making
> shellcheck test as a dependency on perf binary. Added
> "tests/Makefile.tests" to run shellcheck on shellscripts in
> tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every
> time during make, shellcheck will be run only on modified files
> during subsequent invocations. By this, if any newly added shell
> scripts or fixes in existing scripts breaks coding/formatting
> style, it will get captured during the perf build.
> 
> Example build failure with present scripts in tests/shell:
> 
> INSTALL libsubcmd_headers
> INSTALL libperf_headers
> INSTALL libapi_headers
> INSTALL libsymbol_headers
> INSTALL libbpf_headers
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/record_sideband.dep] Error 1
> make[3]: *** Waiting for unfinished jobs
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/test_arm_coresight.dep] Error 1
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/lock_contention.dep] Error 1
> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
> make[1]: *** [Makefile.perf:238: sub-make] Error 2
> make: *** [Makefile:70: all] Error 2
> 
> After this, for testing, changed "tests/shell/record.sh" to
> break shellcheck format. In the next make run, it is
> also captured:
> 
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/record_sideband.dep] Error 1
> make[3]: *** Waiting for unfinished jobs
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/record.dep] Error 1
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/test_arm_coresight.dep] Error 1
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/lock_contention.dep] Error 1
> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
> make[1]: *** [Makefile.perf:238: sub-make] Error 2
> make: *** [Makefile:70: all] Error 2
> 
> The exact failure log can be found in:
> output/tests/shell/record.dep.log file
> 
> This is reported at build time. To be able to go ahead with
> the build or disable shellcheck even though it is known that
> some test is broken, add a "NO_SHELLCHECK" option. Example:
> 
>  make NO_LIBTRACEEVENT=1 NO_SHELLCHECK=1
> 
>  INSTALL libsubcmd_headers
>  INSTALL libsymbol_headers
>  INSTALL libperf_headers
>  INSTALL libapi_headers
>  INSTALL libbpf_headers
>  PERF_VERSION = 6.6.rc1.g7108a40e02ae
>  GEN perf-iostat
>  GEN perf-archive
>  CC  util/header.o
>  LD  util/perf-in.o
>  LD  perf-in.o
>  LINKperf
> 
> Signed-off-by: Athira Rajeev 
> ---
> changelog:
> v2 -> v3:
> Add option "NO_SHELLCHECK". This will allow to go ahead
> with the build or disable shellcheck even though it is
> known that some test is broken
> 
> v1 -> v2:
> Version1 had shellcheck in feature check which is not
> required since shellcheck is already a binary. Presence
> of binary can be checked using:
> $(shell command -v shellcheck)
> Addressed these changes as suggested by Namhyung in V2
> Feature test logic is removed in V2. Also added example
> for build breakage when shellcheck fails in commit message

Hi All,

Looking for review comments on this patch

Thanks
Athira
> 
> tools/perf/Makefile.perf| 20 +++-
> tools/perf/tests/Makefile.tests | 24 
> 2 files changed, 43 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/tests/Makefile.tests
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 456872ac410d..bb49eb8b0d43 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -134,6 +134,8 @@ include ../scripts/utilities.mak
> # x86 instruction decoder - new instructions test
> #
> # Define GEN_VMLINUX_H to generate vmlinux.h from the BTF.
> +#
> +# Define

[PATCH V3] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-10-13 Thread Athira Rajeev
Add rule in new Makefile "tests/Makefile.tests" for running
shellcheck on shell test scripts. This automates below shellcheck
into the build.

$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done

Condition for shellcheck is added in Makefile.perf to avoid build
breakage in the absence of shellcheck binary. Update Makefile.perf
to contain new rule for "SHELLCHECK_TEST" which is for making
shellcheck test as a dependency on perf binary. Added
"tests/Makefile.tests" to run shellcheck on the shell scripts in
tests/shell. The make rule "SHLLCHECK_RUN" ensures that, on
subsequent make invocations, shellcheck is run only on files that
were modified. This way, if a newly added shell script or a fix to
an existing script breaks the coding/formatting style, it is caught
during the perf build.

Example build failure with present scripts in tests/shell:

INSTALL libsubcmd_headers
INSTALL libperf_headers
INSTALL libapi_headers
INSTALL libsymbol_headers
INSTALL libbpf_headers
make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/test_arm_coresight.dep] Error 1
make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/lock_contention.dep] Error 1
make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
make[1]: *** [Makefile.perf:238: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

After this, for testing, changed "tests/shell/record.sh" to
break shellcheck format. In the next make run, it is
also captured:

make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record.dep] Error 1
make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/test_arm_coresight.dep] Error 1
make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/lock_contention.dep] Error 1
make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
make[1]: *** [Makefile.perf:238: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

The exact failure log can be found in:
output/tests/shell/record.dep.log file

This is reported at build time. To be able to go ahead with
the build or disable shellcheck even though it is known that
some test is broken, add a "NO_SHELLCHECK" option. Example:

  make NO_LIBTRACEEVENT=1 NO_SHELLCHECK=1

  INSTALL libsubcmd_headers
  INSTALL libsymbol_headers
  INSTALL libperf_headers
  INSTALL libapi_headers
  INSTALL libbpf_headers
  PERF_VERSION = 6.6.rc1.g7108a40e02ae
  GEN perf-iostat
  GEN perf-archive
  CC  util/header.o
  LD  util/perf-in.o
  LD  perf-in.o
  LINK    perf

Signed-off-by: Athira Rajeev 
---
changelog:
 v2 -> v3:
 Add option "NO_SHELLCHECK". This will allow to go ahead
 with the build or disable shellcheck even though it is
 known that some test is broken

 v1 -> v2:
 Version1 had shellcheck in feature check which is not
 required since shellcheck is already a binary. Presence
 of binary can be checked using:
 $(shell command -v shellcheck)
 Addressed these changes as suggested by Namhyung in V2
 Feature test logic is removed in V2. Also added example
 for build breakage when shellcheck fails in commit message

 tools/perf/Makefile.perf| 20 +++-
 tools/perf/tests/Makefile.tests | 24 
 2 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/Makefile.tests

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 456872ac410d..bb49eb8b0d43 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -134,6 +134,8 @@ include ../scripts/utilities.mak
 #  x86 instruction decoder - new instructions test
 #
 # Define GEN_VMLINUX_H to generate vmlinux.h from the BTF.
+#
+# Define NO_SHELLCHECK if you do not want to run shellcheck during build
 
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL
@@ -671,7 +673,22 @@ $(PERF_IN): prepare FORCE
 $(PMU_EVENTS_IN): FORCE prepare
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events 
obj=pmu-events
 
-$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
+# Runs shellcheck on perf test shell scripts
+
+SHELLCHECK := $(shell command -v shellcheck)
+ifeq ($(NO_SHELLCHECK

[PATCH V2 0/3] Fix for shellcheck issues with latest scripts in tests/shell

2023-10-13 Thread Athira Rajeev
shellcheck was run on perf tool shell scripts as a pre-requisite
to include a build option for shellcheck discussed here:
https://www.spinics.net/lists/linux-perf-users/msg25553.html

And fixes were added for the coding/formatting issues in
two patchsets:
https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/

Three additional issues were observed and fixes are part of:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

With recent commits in perf, three other issues are observed.
shellcheck version: 0.6.0
The issues come from recent commits (mentioned in each patch)
for the lock_contention, record_sideband and stat_all_metricgroups tests.
Patch series fixes these testcases and patches are on top of:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

The version 1 patchset had fix patch for test_arm_coresight.sh.
Dropping that in V2 based on discussion here:
https://lore.kernel.org/linux-perf-users/f265857d-0d37-4878-908c-d20732f21...@linux.vnet.ibm.com/T/#u

Athira Rajeev (3):
  tools/perf/tests Ignore the shellcheck SC2046 warning in
lock_contention
  tools/perf/tests: Fix shellcheck warning in record_sideband.sh test
  tools/perf/tests/shell: Fix shellcheck warning SC2112 with
stat_all_metricgroups

 tools/perf/tests/shell/lock_contention.sh   | 1 +
 tools/perf/tests/shell/record_sideband.sh   | 2 +-
 tools/perf/tests/shell/stat_all_metricgroups.sh | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

-- 
2.31.1



[PATCH V2 2/3] tools/perf/tests: Fix shellcheck warning in record_sideband.sh test

2023-10-13 Thread Athira Rajeev
Running shellcheck on record_sideband.sh throws below
warning:

In tests/shell/record_sideband.sh line 25:
  if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 
>/dev/null
^--^ SC2069: To redirect stdout+stderr, 2>&1 must be last (or use 
'{ cmd > file; } 2>&1' to clarify).

This shows shellcheck warning SC2069 where the redirection
order needs to be fixed. Use "cmd > /dev/null 2>&1" to fix
the redirection of perf record output
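
The effect of the redirection order can be checked with a small
stand-alone sketch. noisy is a hypothetical stand-in for a command that
writes to both stdout and stderr; it is not part of the test script.

```shell
# Hypothetical stand-in for a command that writes to both streams.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# 2>&1 first: stderr is pointed at the current stdout (here the capture
# pipe), and only then is stdout sent to /dev/null, so stderr leaks.
leaked=$(noisy 2>&1 >/dev/null)

# Redirect stdout first, then duplicate it: both streams are discarded.
silent=$(noisy >/dev/null 2>&1)

echo "2>&1 >/dev/null  captured: '$leaked'"
echo ">/dev/null 2>&1  captured: '$silent'"
```

With the original order only stdout is discarded, which is why
shellcheck flags it when the intent is to silence both streams.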

Fixes: 23b97c7ee963 ("perf test: Add test case for record sideband events")
Signed-off-by: Athira Rajeev 
Reviewed-by: Kajol Jain 
---
changelog:
 v1 -> v2:
 Add Reviewed-by from Kajol
 Used "cmd > /dev/null 2>&1" to fix the redirection
 warning from shellcheck

 tools/perf/tests/shell/record_sideband.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/record_sideband.sh 
b/tools/perf/tests/shell/record_sideband.sh
index 5024a7ce0c51..ac70ac27d590 100755
--- a/tools/perf/tests/shell/record_sideband.sh
+++ b/tools/perf/tests/shell/record_sideband.sh
@@ -22,7 +22,7 @@ trap trap_cleanup EXIT TERM INT
 
 can_cpu_wide()
 {
-if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 
>/dev/null
+if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true > /dev/null 
2>&1
 then
 echo "record sideband test [Skipped cannot record cpu$1]"
 err=2
-- 
2.31.1



[PATCH V2 1/3] tools/perf/tests Ignore the shellcheck SC2046 warning in lock_contention

2023-10-13 Thread Athira Rajeev
Running shellcheck on lock_contention.sh generates below
warning

In tests/shell/lock_contention.sh line 36:
   if [ `nproc` -lt 4 ]; then
  ^-^ SC2046: Quote this to prevent word splitting.

Since nproc generates a single-word output, there is
no possibility of word splitting here, so this warning
can be ignored. Add an exception for it using the
"disable" directive in shellcheck. This warning is observed
after commit:
"commit 29441ab3a30a ("perf test lock_contention.sh: Skip test
if not enough CPUs")"
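
To illustrate the word-splitting argument: an unquoted command
substitution is split on whitespace, which only matters when the output
has more than one word. The sketch below uses hypothetical helpers
(count_args, two_words, one_word) rather than nproc itself, and assumes
the default IFS.

```shell
# Hypothetical helper: report how many arguments it received.
count_args() {
    echo "$#"
}

two_words() { echo "a b"; }   # stand-in for a command printing two words
one_word()  { echo "8"; }     # stand-in for nproc-like single-word output

unquoted_two=$(count_args $(two_words))   # split into 2 arguments
quoted_two=$(count_args "$(two_words)")   # kept as 1 argument
unquoted_one=$(count_args $(one_word))    # 1 argument even unquoted

echo "$unquoted_two $quoted_two $unquoted_one"
```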

Fixes: 29441ab3a30a ("perf test lock_contention.sh: Skip test if not enough 
CPUs")
Signed-off-by: Athira Rajeev 
Reviewed-by: Kajol Jain 
---
changelog:
 v1 -> v2:
 Add Reviewed-by from Kajol Jain

 tools/perf/tests/shell/lock_contention.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/tests/shell/lock_contention.sh 
b/tools/perf/tests/shell/lock_contention.sh
index d5a191d3d090..c1ec5762215b 100755
--- a/tools/perf/tests/shell/lock_contention.sh
+++ b/tools/perf/tests/shell/lock_contention.sh
@@ -33,6 +33,7 @@ check() {
exit
fi
 
+   # shellcheck disable=SC2046
if [ `nproc` -lt 4 ]; then
echo "[Skip] Low number of CPUs (`nproc`), lock event cannot be 
triggered certainly"
err=2
-- 
2.31.1
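
As a rough illustration of what SC2046 guards against, the sketch below counts the arguments produced by an unquoted and a quoted command substitution. The helpers one_word, two_words and count_args are hypothetical and are not part of lock_contention.sh.

```shell
#!/bin/sh
# Hypothetical stand-ins for a command such as nproc.
one_word() { echo 8; }
two_words() { echo "8 cores"; }

# Reports how many arguments it received.
count_args() { echo $#; }

# Unquoted: the output is split on whitespace into separate words.
unquoted=$(count_args $(two_words))
# Quoted: the output stays a single word regardless of its content.
quoted=$(count_args "$(two_words)")
# A single-word output yields one argument even unquoted, which is why the
# patch treats the warning as safe to ignore for nproc.
single=$(count_args $(one_word))

echo "unquoted=$unquoted quoted=$quoted single=$single"
```

An alternative to the "disable" directive would be to quote the substitution, for example [ "$(nproc)" -lt 4 ]; the patch above chose the directive instead.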



[PATCH V2 3/3] tools/perf/tests/shell: Fix shellcheck warning SC2112 with stat_all_metricgroups

2023-10-13 Thread Athira Rajeev
Running shellcheck on stat_all_metricgroups.sh reports
below warning:

 In ./tests/shell/stat_all_metricgroups.sh line 7:
 function ParanoidAndNotRoot()
 ^-- SC2112: 'function' keyword is non-standard. Delete it.

As per the format, "function" is a non-standard keyword that
can be used to declare functions. Fix this by removing the
"function" keyword from ParanoidAndNotRoot function

Signed-off-by: Athira Rajeev 
---
Changelog:
 This is a new patch added in V2

 tools/perf/tests/shell/stat_all_metricgroups.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh 
b/tools/perf/tests/shell/stat_all_metricgroups.sh
index f3e305649e2c..55ef9c9ded2d 100755
--- a/tools/perf/tests/shell/stat_all_metricgroups.sh
+++ b/tools/perf/tests/shell/stat_all_metricgroups.sh
@@ -4,7 +4,7 @@
 
 set -e
 
-function ParanoidAndNotRoot()
+ParanoidAndNotRoot()
 {
   [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt 
$1 ]
 }
-- 
2.31.1
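
A minimal sketch of the portable declaration form that SC2112 asks for, assuming a plain POSIX sh. The function is_greater is a hypothetical example and not taken from the test script.

```shell
#!/bin/sh
# POSIX form: the name followed by (), without the 'function' keyword.
# is_greater is a hypothetical helper used only for this illustration.
is_greater() {
    [ "$1" -gt "$2" ]
}

if is_greater 3 2; then
    result="greater"
else
    result="not greater"
fi
echo "$result"
```

The same definition is also accepted by bash, so dropping the keyword costs nothing there.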



Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh

2023-10-12 Thread Athira Rajeev



> On 12-Oct-2023, at 9:37 PM, Suzuki K Poulose  wrote:
> 
> Hi,
> 
> On 12/10/2023 16:56, Athira Rajeev wrote:
>>> On 05-Oct-2023, at 3:06 PM, Suzuki K Poulose  wrote:
>>> 
>>> On 05/10/2023 06:02, Namhyung Kim wrote:
>>>> On Thu, Sep 28, 2023 at 9:11 PM Athira Rajeev
>>>>  wrote:
>>>>> 
> 
> ...
> 
>>> Thanks for the fix.
>>> 
>>> Nothing to do with this patch, but I am wondering if the original patch
>>> is over engineered and may not be future proof.
>>> 
>>> e.g.,
>>> 
>>> cs_etm_dev_name() {
>>> + cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
>>> -print -quit)
>>> 
>>> Right there you got the device name and we can easily deduce the name of
>>> the "ETM" node.
>>> 
>>> e.g,:
>>> etm=$(basename $(readlink cs_etm_path) | sed "s/[0-9]\+$//")
>>> 
>>> And practically, nobody prevents an ETE mixed with an ETM on a "hybrid"
>>> system (hopefully, no one builds it ;-))
>>> 
>>> Also, instead of hardcoding "ete" and "etm" prefixes from the arch part,
>>> we should simply use the cpu nodes from :
>>> 
>>> /sys/bus/event_source/devices/cs_etm/
>>> 
>>> e.g.,
>>> 
>>> arm_cs_etm_traverse_path_test() {
>>> # Iterate for every ETM device
>>> for c in /sys/bus/event_source/devices/cs_etm/cpu*; do
>>> # Read the link to be on the safer side
>>> dev=`readlink $c`
>>> 
>>> # Find the ETM device belonging to which CPU
>>> cpu=`cat $dev/cpu`
>>> 
>>> # Use depth-first search (DFS) to iterate outputs
>>> arm_cs_iterate_devices $dev $cpu
>>> done;
>>> }
>>> 
>>> 
>>> 
>>>> You'd better add Coresight folks on this.
>>>> Maybe this file was missing in the MAINTAINERS file.
>>> 
>>> And the original author of the commit, that introduced the issue too.
>>> 
>>> Suzuki
>> Hi All,
>> Thanks for the discussion and feedback.
>> This patch fixes the shellcheck warning introduced in function 
>> "cs_etm_dev_name". But with the changes that Suzuki suggested, we won't need 
>> the function "cs_etm_dev_name" since the code will use 
>> "/sys/bus/event_source/devices/cs_etm/" .  In that case, can I drop this 
>> patch for now from this series ?
> 
> Yes, please. James will send out the proposed patch

Hi Suzuki,

Sure. Thanks! 

Athira
> 
> Suzuki
> 
> 



Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh

2023-10-12 Thread Athira Rajeev



> On 05-Oct-2023, at 3:06 PM, Suzuki K Poulose  wrote:
> 
> On 05/10/2023 06:02, Namhyung Kim wrote:
>> On Thu, Sep 28, 2023 at 9:11 PM Athira Rajeev
>>  wrote:
>>> 
>>> Running shellcheck on tests/shell/test_arm_coresight.sh
>>> throws below warnings:
>>> 
>>> In tests/shell/test_arm_coresight.sh line 15:
>>> cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 
>>> cpu* -print -quit)
>>>   ^--^ SC2061: Quote the parameter to -name so the shell 
>>> won't interpret it.
>>> 
>>> In tests/shell/test_arm_coresight.sh line 20:
>>> if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = 
>>> "0xA13" ] ; then
>>>  ^-- SC2166: Prefer [ p ] && [ q ] as [ 
>>> p -a q ] is not well defined
>>> 
>>> This warning is observed after commit:
>>> "commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")"
>>> 
>>> Fixed this issue by using quoting 'cpu*' for SC2061 and
>>> using "&&" in line number 20 for SC2166 warning
>>> 
>>> Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")
>>> Signed-off-by: Athira Rajeev 
> 
> Thanks for the fix.
> 
> Nothing to do with this patch, but I am wondering if the original patch
> is over engineered and may not be future proof.
> 
> e.g.,
> 
> cs_etm_dev_name() {
> + cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* -print 
> -quit)
> 
> Right there you got the device name and we can easily deduce the name of
> the "ETM" node.
> 
> e.g,:
> etm=$(basename $(readlink cs_etm_path) | sed "s/[0-9]\+$//")
> 
> And practically, nobody prevents an ETE mixed with an ETM on a "hybrid"
> system (hopefully, no one builds it ;-))
> 
> Also, instead of hardcoding "ete" and "etm" prefixes from the arch part,
> we should simply use the cpu nodes from :
> 
> /sys/bus/event_source/devices/cs_etm/
> 
> e.g.,
> 
> arm_cs_etm_traverse_path_test() {
> # Iterate for every ETM device
> for c in /sys/bus/event_source/devices/cs_etm/cpu*; do
> # Read the link to be on the safer side
> dev=`readlink $c`
> 
> # Find the ETM device belonging to which CPU
> cpu=`cat $dev/cpu`
> 
> # Use depth-first search (DFS) to iterate outputs
> arm_cs_iterate_devices $dev $cpu
> done;
> }
> 
> 
> 
>> You'd better add Coresight folks on this.
>> Maybe this file was missing in the MAINTAINERS file.
> 
> And the original author of the commit, that introduced the issue too.
> 
> Suzuki

Hi All,
Thanks for the discussion and feedback.

This patch fixes the shellcheck warning introduced in function 
"cs_etm_dev_name". But with the changes that Suzuki suggested, we won't need 
the function "cs_etm_dev_name" since the code will use 
"/sys/bus/event_source/devices/cs_etm/" .  In that case, can I drop this patch 
for now from this series ?

Thanks
Athira

> 
>> Thanks,
>> Namhyung
>>> ---
>>>  tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/tools/perf/tests/shell/test_arm_coresight.sh 
>>> b/tools/perf/tests/shell/test_arm_coresight.sh
>>> index fe78c4626e45..f2115dfa24a5 100755
>>> --- a/tools/perf/tests/shell/test_arm_coresight.sh
>>> +++ b/tools/perf/tests/shell/test_arm_coresight.sh
>>> @@ -12,12 +12,12 @@
>>>  glb_err=0
>>> 
>>>  cs_etm_dev_name() {
>>> -   cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 
>>> cpu* -print -quit)
>>> +   cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 
>>> 'cpu*' -print -quit)
>>> trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch)
>>> archhver=$((($trcdevarch >> 12) & 0xf))
>>> archpart=$(($trcdevarch & 0xfff))
>>> 
>>> -   if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] 
>>> ; then
>>> +   if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = 
>>> "0xA13" ] ; then
>>> echo "ete"
>>> else
>>> echo "etm"
>>> --
>>> 2.31.1




Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh

2023-10-12 Thread Athira Rajeev



> On 05-Oct-2023, at 9:15 PM, David Laight  wrote:
> 
> From: David Laight
>> Sent: 05 October 2023 11:16
> ...
>>> - cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
>>> -print -quit)
>>> + cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' 
>>> -print -quit)
>> 
>> Isn't the intention to get the shell to expand "cpu* ?
>> So quoting it completely breaks the script.
> 
> Complete brain-fade :-(

Hi David,

Yeah, with the quotes the pattern still gets expanded, by find itself.

Thanks
Athira
> 
> 
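
The difference between a quoted and an unquoted -name pattern can be checked with a small sketch in a scratch directory. All file and directory names below are hypothetical.

```shell
#!/bin/sh
# Scratch layout (hypothetical names):
#   $tmp/cpu_here            a file in the current directory
#   $tmp/devs/cpu0, cpu1     the files we actually want to find
tmp=$(mktemp -d)
mkdir "$tmp/devs"
touch "$tmp/cpu_here" "$tmp/devs/cpu0" "$tmp/devs/cpu1"
cd "$tmp" || exit 1

# Quoted: find receives the literal pattern cpu* and matches it itself.
quoted=$(find devs -name 'cpu*' | wc -l | tr -d ' ')

# Unquoted: the shell expands cpu* against the current directory first,
# so find is asked for the name cpu_here and matches nothing under devs.
unquoted=$(find devs -name cpu* | wc -l | tr -d ' ')

echo "quoted matches: $quoted, unquoted matches: $unquoted"
cd / && rm -rf "$tmp"
```

When nothing in the current directory matches, the unquoted form happens to work, which is why the problem is easy to miss.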



Re: [PATCH V2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-10-12 Thread Athira Rajeev



> On 09-Oct-2023, at 10:38 AM, Namhyung Kim  wrote:
> 
> Hello,
> 
> Sorry for the late reply.
> 
> On Thu, Oct 5, 2023 at 8:27 AM Athira Rajeev
>  wrote:
>> 
>> 
>> 
>>> On 29-Sep-2023, at 12:19 PM, Athira Rajeev  
>>> wrote:
>>> 
>>> Add rule in new Makefile "tests/Makefile.tests" for running
>>> shellcheck on shell test scripts. This automates below shellcheck
>>> into the build.
>>> 
>>> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
>>> warning $F; done
>>> 
>>> Condition for shellcheck is added in Makefile.perf to avoid build
>>> breakage in the absence of shellcheck binary. Update Makefile.perf
>>> to contain new rule for "SHELLCHECK_TEST" which is for making
>>> shellcheck test as a dependency on perf binary. Added
>>> "tests/Makefile.tests" to run shellcheck on shellscripts in
>>> tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every
>>> time during make, shellcheck will be run only on modified files
>>> during subsequent invocations. By this, if any newly added shell
>>> scripts or fixes in existing scripts breaks coding/formatting
>>> style, it will get captured during the perf build.
>>> 
>>> Example build failure with present scripts in tests/shell:
>>> 
>>> INSTALL libsubcmd_headers
>>> INSTALL libperf_headers
>>> INSTALL libapi_headers
>>> INSTALL libsymbol_headers
>>> INSTALL libbpf_headers
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/record_sideband.dep] Error 1
>>> make[3]: *** Waiting for unfinished jobs
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/test_arm_coresight.dep] Error 1
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/lock_contention.dep] Error 1
>>> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
>>> make[1]: *** [Makefile.perf:238: sub-make] Error 2
>>> make: *** [Makefile:70: all] Error 2
>>> 
>>> After this, for testing, changed "tests/shell/record.sh" to
>>> break shellcheck format. In the next make run, it is
>>> also captured:
> 
> Where can I see the actual failure messages?
Hi Namhyung,
Thanks for the review comments.

For example, with the current tree, we have these format issues:

 GENSKEL /root/athira/linux/tools/perf/util/bpf_skel/kwork_trace.skel.h
CC bench/uprobe.o
CC util/header.o
LD bench/perf-in.o
make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/stat_all_metricgroups.dep] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/record_sideband.dep] Error 1
CC util/bpf_counter.o
CC util/bpf_counter_cgroup.o
CC util/bpf_ftrace.o
CC util/bpf_off_cpu.o
CC util/bpf-filter.o
make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:15: 
output/tests/shell/test_arm_coresight.dep] Error 1
make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:15: 
output/tests/shell/lock_contention.dep] Error 1
make[2]: *** [Makefile.perf:679: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs
LD util/perf-in.o
LD perf-in.o
make[1]: *** [Makefile.perf:242: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

The actual failure can be seen here:

# cat output/tests/shell/stat_all_metricgroups.dep.log 

In ./tests/shell/stat_all_metricgroups.sh line 7:
function ParanoidAndNotRoot()
^-- SC2112: 'function' keyword is non-standard. Delete it.

For more information:
https://www.shellcheck.net/wiki/SC2112 -- 'function' keyword is non-standar...
> 


>>> 
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/record_sideband.dep] Error 1
>>> make[3]: *** Waiting for unfinished jobs
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/record.dep] Error 1
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/test_arm_coresight.dep] Error 1
>>> make[3]: *** 
>>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
>>> output/tests/shell/lock_contention.dep] Error 1
>>> make[2]: *** [Makefile.perf:675: SHELLCHECK_TE

[PATCH] tools/perf/arch/powerpc: Fix the CPU ID const char* value by adding 0x prefix

2023-10-08 Thread Athira Rajeev
Simple expression parser test fails in powerpc as below:

4: Simple expression parser
test child forked, pid 170385
Using CPUID 004e2102
division by zero
syntax error
syntax error
FAILED tests/expr.c:65 parse test failed
test child finished with -1
Simple expression parser: FAILED!

This is observed after commit:
'commit 9d5da30e4ae9 ("perf jevents: Add a new expression builtin 
strcmp_cpuid_str()")'

With this commit, a new expression builtin strcmp_cpuid_str
got added. This function takes an 'ID' type value, which is
a string. So expression parse for strcmp_cpuid_str expects
const char * as cpuid value type. In case of powerpc, CPU IDs
are numbers. Hence it doesn't get interpreted correctly by
bison parser. Example in case of power9, cpuid string returns
as: 004e2102

cpuid of string type is expected in two cases:
1. char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);

   Testcase "tests/expr.c" uses "perf_pmu__getcpuid" which calls
   get_cpuid_str to get the cpuid string.

2. cpuid field in  :struct pmu_events_map

   struct pmu_events_map {
   const char *arch;
   const char *cpuid;

   Here cpuid field is used in "perf_pmu__find_events_table"
   function as "strcmp_cpuid_str(map->cpuid, cpuid)". The
   value for cpuid field is picked from mapfile.csv.

Fix the mapfile.csv and get_cpuid_str function to prefix
cpuid with 0x so that it gets correctly interpreted by
the bison parser

Signed-off-by: Athira Rajeev 
---
 tools/perf/arch/powerpc/util/header.c  | 2 +-
 tools/perf/pmu-events/arch/powerpc/mapfile.csv | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index c8d0dc775e5d..6b00efd53638 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -34,7 +34,7 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
char *bufp;
 
-   if (asprintf(&bufp, "%.8lx", mfspr(SPRN_PVR)) < 0)
+   if (asprintf(&bufp, "0x%.8lx", mfspr(SPRN_PVR)) < 0)
bufp = NULL;
 
return bufp;
diff --git a/tools/perf/pmu-events/arch/powerpc/mapfile.csv 
b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
index a534ff6db14b..f4908af7ad66 100644
--- a/tools/perf/pmu-events/arch/powerpc/mapfile.csv
+++ b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
@@ -13,6 +13,6 @@
 #
 
 # Power8 entries
-004[bcd][[:xdigit:]]{4},1,power8,core
-004e[[:xdigit:]]{4},1,power9,core
-0080[[:xdigit:]]{4},1,power10,core
+0x004[bcd][[:xdigit:]]{4},1,power8,core
+0x004e[[:xdigit:]]{4},1,power9,core
+0x0080[[:xdigit:]]{4},1,power10,core
-- 
2.39.3
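
The effect of the prefix on matching can be sketched in shell. This only mimics the idea with grep -E; the ^ and $ anchors and the helper "matches" are assumptions made for this sketch and do not describe how perf itself compares the strings.

```shell
#!/bin/sh
# Example PVR value taken from the failing test output above.
pvr=0x004e2102

# Format without and with the 0x prefix, following the format strings
# before and after the patch.
old_id=$(printf '%.8x' "$pvr")
new_id=$(printf '0x%.8x' "$pvr")

# Pattern copied from the updated power9 mapfile.csv entry; the anchors
# are added for this sketch only.
pattern='^0x004e[[:xdigit:]]{4}$'

# Hypothetical helper: succeeds when string $1 matches pattern $2.
matches() { printf '%s\n' "$1" | grep -Eq "$2"; }

if matches "$new_id" "$pattern"; then new_ok=yes; else new_ok=no; fi
if matches "$old_id" "$pattern"; then old_ok=yes; else old_ok=no; fi
echo "old=$old_id ($old_ok) new=$new_id ($new_ok)"
```

This also shows why header.c and mapfile.csv have to change together: an unprefixed ID no longer matches a prefixed map entry.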



Re: [PATCH V2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-10-05 Thread Athira Rajeev



> On 29-Sep-2023, at 12:19 PM, Athira Rajeev  
> wrote:
> 
> Add rule in new Makefile "tests/Makefile.tests" for running
> shellcheck on shell test scripts. This automates below shellcheck
> into the build.
> 
> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
> warning $F; done
> 
> Condition for shellcheck is added in Makefile.perf to avoid build
> breakage in the absence of shellcheck binary. Update Makefile.perf
> to contain new rule for "SHELLCHECK_TEST" which is for making
> shellcheck test as a dependency on perf binary. Added
> "tests/Makefile.tests" to run shellcheck on shellscripts in
> tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every
> time during make, shellcheck will be run only on modified files
> during subsequent invocations. By this, if any newly added shell
> scripts or fixes in existing scripts breaks coding/formatting
> style, it will get captured during the perf build.
> 
> Example build failure with present scripts in tests/shell:
> 
> INSTALL libsubcmd_headers
> INSTALL libperf_headers
> INSTALL libapi_headers
> INSTALL libsymbol_headers
> INSTALL libbpf_headers
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/record_sideband.dep] Error 1
> make[3]: *** Waiting for unfinished jobs
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/test_arm_coresight.dep] Error 1
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/lock_contention.dep] Error 1
> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
> make[1]: *** [Makefile.perf:238: sub-make] Error 2
> make: *** [Makefile:70: all] Error 2
> 
> After this, for testing, changed "tests/shell/record.sh" to
> break shellcheck format. In the next make run, it is
> also captured:
> 
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/record_sideband.dep] Error 1
> make[3]: *** Waiting for unfinished jobs
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/record.dep] Error 1
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/test_arm_coresight.dep] Error 1
> make[3]: *** 
> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
> output/tests/shell/lock_contention.dep] Error 1
> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
> make[1]: *** [Makefile.perf:238: sub-make] Error 2
> make: *** [Makefile:70: all] Error 2
> 
> Signed-off-by: Athira Rajeev 
> ---
> Changelog:
> v1 -> v2:
> Version1 had shellcheck in feature check which is not
> required since shellcheck is already a binary. Presence
> of binary can be checked using:
> $(shell command -v shellcheck)
> Addressed these changes as suggested by Namhyung in V2
> Feature test logic is removed in V2. Also added example
> for build breakage when shellcheck fails in commit message

Hi All,

Looking for feedback on this patch

Thanks
Athira
> 
> tools/perf/Makefile.perf| 14 +-
> tools/perf/tests/Makefile.tests | 24 
> 2 files changed, 37 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/tests/Makefile.tests
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 98604e396ac3..56a66ca253ab 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -667,7 +667,18 @@ $(PERF_IN): prepare FORCE
> $(PMU_EVENTS_IN): FORCE prepare
> $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events 
> obj=pmu-events
> 
> -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
> +# Runs shellcheck on perf test shell scripts
> +
> +SHELLCHECK := $(shell command -v shellcheck)
> +ifneq ($(SHELLCHECK),)
> +SHELLCHECK_TEST: FORCE prepare
> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests
> +else
> +SHELLCHECK_TEST:
> + @:
> +endif
> +
> +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST
> $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \
> $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
> 
> @@ -1130,6 +1141,7 @@ bpf-skel-clean:
> $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
> 
> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean 
> $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean 
> tests-coresight-targets-clean
> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
> $(

Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh

2023-10-05 Thread Athira Rajeev



> On 05-Oct-2023, at 1:50 PM, James Clark  wrote:
> 
> 
> 
> On 29/09/2023 05:11, Athira Rajeev wrote:
>> Running shellcheck on tests/shell/test_arm_coresight.sh
>> throws below warnings:
>> 
>> In tests/shell/test_arm_coresight.sh line 15:
>> cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* -print 
>> -quit)
>>  ^--^ SC2061: Quote the parameter to -name so the shell 
>> won't interpret it.
>> 
>> In tests/shell/test_arm_coresight.sh line 20:
>> if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
>>  ^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q ] 
>> is not well defined
>> 
>> This warning is observed after commit:
>> "commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")"
>> 
>> Fixed this issue by using quoting 'cpu*' for SC2061 and
>> using "&&" in line number 20 for SC2166 warning
>> 
>> Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")
>> Signed-off-by: Athira Rajeev 
>> ---
>> tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/perf/tests/shell/test_arm_coresight.sh 
>> b/tools/perf/tests/shell/test_arm_coresight.sh
>> index fe78c4626e45..f2115dfa24a5 100755
>> --- a/tools/perf/tests/shell/test_arm_coresight.sh
>> +++ b/tools/perf/tests/shell/test_arm_coresight.sh
>> @@ -12,12 +12,12 @@
>> glb_err=0
>> 
>> cs_etm_dev_name() {
>> - cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
>> -print -quit)
>> + cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' 
>> -print -quit)
>> trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch)
>> archhver=$((($trcdevarch >> 12) & 0xf))
>> archpart=$(($trcdevarch & 0xfff))
>> 
>> - if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
>> + if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; 
>> then
>> echo "ete"
>> else
>> echo "etm"
> 
> 
> Reviewed-by: James Clark 

Thanks James for checking

Athira




Re: [PATCH 3/3] tools/perf/tests: Fix shellcheck warning in record_sideband.sh test

2023-10-05 Thread Athira Rajeev



> On 05-Oct-2023, at 10:34 AM, Namhyung Kim  wrote:
> 
> On Thu, Sep 28, 2023 at 9:11 PM Athira Rajeev
>  wrote:
>> 
>> Running shellcheck on record_sideband.sh throws below
>> warning:
>> 
>>In tests/shell/record_sideband.sh line 25:
>>  if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 
>> >/dev/null
>>^--^ SC2069: To redirect stdout+stderr, 2>&1 must be last (or use 
>> '{ cmd > file; } 2>&1' to clarify).
>> 
>> This shows shellcheck warning SC2069 where the redirection
>> order needs to be fixed. Use { cmd > file; } 2>&1 to fix the
>> redirection of perf record output
>> 
>> Fixes: 23b97c7ee963 ("perf test: Add test case for record sideband events")
>> Signed-off-by: Athira Rajeev 
>> ---
>> tools/perf/tests/shell/record_sideband.sh | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/tools/perf/tests/shell/record_sideband.sh 
>> b/tools/perf/tests/shell/record_sideband.sh
>> index 5024a7ce0c51..7e036763a43c 100755
>> --- a/tools/perf/tests/shell/record_sideband.sh
>> +++ b/tools/perf/tests/shell/record_sideband.sh
>> @@ -22,7 +22,7 @@ trap trap_cleanup EXIT TERM INT
>> 
>> can_cpu_wide()
>> {
>> -if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 
>> >/dev/null
>> +if ! { perf record -o ${perfdata} -BN --no-bpf-event -C $1 true > 
>> /dev/null; } 2>&1
> 
> I think we usually go without braces.

Hi Namhyung

Thanks for reviewing. I will fix this in V2.

Thanks
Athira
> 
> Thanks,
> Namhyung
> 
> 
>> then
>> echo "record sideband test [Skipped cannot record cpu$1]"
>> err=2
>> --
>> 2.31.1




Re: [PATCH V5 1/3] tools/perf: Add text_end to "struct dso" to save .text section size

2023-10-03 Thread Athira Rajeev



> On 03-Oct-2023, at 9:58 AM, Namhyung Kim  wrote:
> 
> Hello,
> 
> On Thu, Sep 28, 2023 at 12:52 AM Athira Rajeev
>  wrote:
>> 
>> Update "struct dso" to include new member "text_end".
>> This new field will represent the offset for end of text
>> section for a dso. For elf, this value is derived as:
>> sh_size (Size of section in bytes) + sh_offset (Section file
>> offset) of the elf header for text.
>> 
>> For bfd, this value is derived as:
>> 1. For PE file,
>> section->size + ( section->vma - dso->text_offset)
>> 2. Other cases:
>> section->filepos (file position) + section->size (size of
>> section)
>> 
>> To resolve the address from a sample, perf looks at the
>> DSO maps. In case of address from a kernel module, there
>> were some address found to be not resolved. This was
>> observed while running perf test for "Object code reading".
>> Though the ip falls between the start address of the loaded
>> module (perf map->start ) and end address ( perf map->end),
>> it was unresolved.
>> 
>> Example:
>> 
>>Reading object code for memory address: 0xc00807f0142c
>>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>On file address is: 0x1114cc
>>Objdump command is: objdump -z -d --start-address=0x11142c 
>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>objdump read too few bytes: 128
>>test child finished with -1
>> 
>> Here, module is loaded at:
>># cat /proc/modules | grep xfs
>>xfs 2228224 3 - Live 0xc00807d0
>> 
>> From objdump for xfs module, text section is:
>>text 0010f7bc    00a0 2**4
>> 
>> Here the offset for 0xc00807f0142c, i.e. 0x112074, falls outside the
>> .text section, which extends up to 0x10f7bc.
>> 
>> In this case for module, the address 0xc00807e11fd4 is pointing
>> to stub instructions. This address range represents the module stubs
>> which is allocated on module load and hence is not part of DSO offset.
>> 
>> To identify such an address, which falls outside the text
>> section but within the module end, add the new field "text_end" to
>> "struct dso".
>> 
>> Reported-by: Disha Goel 
>> Signed-off-by: Athira Rajeev 
>> Reviewed-by: Adrian Hunter 
>> Reviewed-by: Kajol Jain 
> 
> For the series,
> Acked-by: Namhyung Kim 
> 
> Thanks,
> Namhyung

Thanks for checking Namhyung,

Athira
> 
> 
>> ---
>> Changelog:
>> v2 -> v3:
>> Added Reviewed-by from Adrian
>> 
>> v1 -> v2:
>> Added text_end for bfd also by updating dso__load_bfd_symbols
>> as suggested by Adrian.
>> 
>> tools/perf/util/dso.h| 1 +
>> tools/perf/util/symbol-elf.c | 4 +++-
>> tools/perf/util/symbol.c | 2 ++
>> 3 files changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index b41c9782c754..70fe0fe69bef 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -181,6 +181,7 @@ struct dso {
>>u8   rel;
>>struct build_id  bid;
>>u64  text_offset;
>> +   u64  text_end;
>>const char   *short_name;
>>const char   *long_name;
>>u16  long_name_len;
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 95e99c332d7e..9e7eeaf616b8 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map 
>> *map, struct symsrc *syms_ss,
>>}
>> 
>>if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
>> -   ".text", NULL))
>> +   ".text", NULL)) {
>>dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
>> +   dso->text_end = tshdr.sh_offset + tshdr.sh_size;
>> +   }
>> 
>>if (runtime_ss->opdsec)
>>opddata = elf_rawdata(runtime_ss->opdsec, NULL);
>> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
>> index 3f36675b7c8f..f25e4e62cf25 100644
>> --- a/tools/perf/util/symbol.c
>> +++ b/tools/perf/util/symbol.c
>> @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char 
>> *debugfile)
>>/* PE symbols can only have 4 bytes, so use .text 
>> high bits */
>>dso->text_offset = section->vma - (u32)section->vma;
>>dso->text_offset += 
>> (u32)bfd_asymbol_value(symbols[i]);
>> +   dso->text_end = (section->vma - dso->text_offset) + 
>> section->size;
>>} else {
>>dso->text_offset = section->vma - section->filepos;
>> +   dso->text_end = section->filepos + section->size;
>>}
>>}
>> 
>> --
>> 2.31.1




Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-30 Thread Athira Rajeev



> On 28-Sep-2023, at 9:24 AM, Namhyung Kim  wrote:
> 
> On Tue, Sep 26, 2023 at 9:29 PM Athira Rajeev
>  wrote:
>> 
>> 
>> 
>>> On 25-Sep-2023, at 1:34 PM, kajoljain  wrote:
>>> 
>>> 
>>> 
>>> On 9/7/23 22:45, Athira Rajeev wrote:
>>>> From: root 
>>>> 
>>>> shellcheck was run on perf tool shell scripts as a pre-requisite
>>>> to include a build option for shellcheck discussed here:
>>>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>>> 
>>>> And fixes were added for the coding/formatting issues in
>>>> two patchsets:
>>>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>>>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>>>> 
>>>> Three additional issues are observed with shellcheck "0.6" and
>>>> this patchset covers those. With this patchset,
>>>> 
>>>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
>>>> warning $F; done
>>>> # echo $?
>>>> 0
>>>> 
>>> 
>>> Patchset looks good to me.
>>> 
>>> Reviewed-by: Kajol Jain 
>>> 
>>> Thanks,
>>> Kajol Jain
>>> 
>> 
>> Hi Namhyung,
>> 
>> Can you please check for this patchset also
> 
> Sure, it's applied to perf-tools-next, thanks!

Thanks Namhyung

Athira




Re: [PATCH V3] perf test: Fix parse-events tests to skip parametrized events

2023-09-30 Thread Athira Rajeev



> On 30-Sep-2023, at 11:23 AM, Namhyung Kim  wrote:
> 
> On Wed, Sep 27, 2023 at 11:17 AM Athira Rajeev
>  wrote:
>> 
>> Testcase "Parsing of all PMU events from sysfs" parse events for
>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
>> environment supports events from hv_24x7 and hv_gpci PMU which
>> is of example format like below:
>> 
>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
>> - hv_gpci/event,partition_id=?/
>> 
>> The value for "?" needs to be filled in depending on system
>> configuration. It is better to skip these parametrized events
>> in this test as it is done in:
>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
>> parametrized events")' which handled a similar instance with
>> "all PMU test".
>> 
>> Fix parse-events test to skip parametrized events since
>> it needs proper setup of the parameters.
>> 
>> Signed-off-by: Athira Rajeev 
>> Tested-by: Ian Rogers 
>> Tested-by: Sachin Sant 
>> Reviewed-by: Kajol Jain 
> 
> Applied to perf-tools-next, thanks!

Thanks Namhyung,

Athira
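
The skipping idea can be sketched in shell as a filter on the "=?" placeholder. This is only an illustration of the criterion; the actual test is C code and is not reproduced here. The third event name is a hypothetical example without parameters.

```shell
#!/bin/sh
# Sample event strings; the first two come from the message above, the
# third is a hypothetical event with no placeholder parameters.
events='hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
hv_gpci/event,partition_id=?/
cpu/cycles/'

# Keep only events that carry no "=?" placeholder.
runnable=$(printf '%s\n' "$events" | grep -v -F '=?')
# Count the events that would be skipped.
skipped=$(printf '%s\n' "$events" | grep -c -F '=?')

echo "runnable: $runnable"
echo "skipped: $skipped"
```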




[PATCH 1/2] powerpc/platforms/pseries: Fix STK_PARAM access in the hcall tracing code

2023-09-29 Thread Athira Rajeev
In powerpc pseries system, below behaviour is observed while
enabling tracing on hcall:
# cd /sys/kernel/debug/tracing/
# cat events/powerpc/hcall_exit/enable
0
# echo 1 > events/powerpc/hcall_exit/enable

# ls
-bash: fork: Bad address

Above is from power9 lpar with latest kernel. Past this, softlockup
is observed. Initially while attempting via perf_event_open to
use "PERF_TYPE_TRACEPOINT", kernel panic was observed.

perf config used:

memset(&pe[1],0,sizeof(struct perf_event_attr));
pe[1].type=PERF_TYPE_TRACEPOINT;
pe[1].size=96;
pe[1].config=0x26ULL; /* 38 raw_syscalls/sys_exit */
pe[1].sample_type=0; /* 0 */

pe[1].read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP|0x10ULL;
 /* 1f */
pe[1].inherit=1;
pe[1].precise_ip=0; /* arbitrary skid */
pe[1].wakeup_events=0;
pe[1].bp_type=HW_BREAKPOINT_EMPTY;
pe[1].config1=0x1ULL;

Kernel panic logs:
==

Kernel attempted to read user page (8) - exploit attempt? (uid: 0)
 BUG: Kernel NULL pointer dereference on read at 0x0008
 Faulting instruction address: 0xc04c2814
 Oops: Kernel access of bad area, sig: 11 [#1]
 LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: nfnetlink bonding tls rfkill sunrpc dm_service_time 
dm_multipath pseries_rng xts vmx_crypto xfs libcrc32c sd_mod t10_pi 
crc64_rocksoft crc64 sg ibmvfc scsi_transport_fc ibmveth dm_mirror 
dm_region_hash dm_log dm_mod fuse
CPU: 0 PID: 1431 Comm: login Not tainted 6.4.0+ #1
Hardware name: IBM,8375-42A POWER9 (raw) 0x4e0202 0xf05 
of:IBM,FW950.30 (VL950_892) hv:phyp pSeries
NIP [c04c2814] page_remove_rmap+0x44/0x320
LR [c049c2a4] wp_page_copy+0x384/0xec0
Call Trace:
[c98c7ad0] [c0001416e400] 0xc0001416e400 (unreliable)
[c98c7b20] [c049c2a4] wp_page_copy+0x384/0xec0
[c98c7bf0] [c04a4f64] __handle_mm_fault+0x9d4/0xfb0
[c98c7cf0] [c04a5630] handle_mm_fault+0xf0/0x350
[c98c7d40] [c0094e8c] ___do_page_fault+0x48c/0xc90
[c98c7df0] [c00958a0] hash__do_page_fault+0x30/0x70
[c98c7e20] [c009e244] do_hash_fault+0x1a4/0x330
[c98c7e50] [c0008918] 
data_access_common_virt+0x198/0x1f0
 --- interrupt: 300 at 0x7fffae971abc

git bisect tracked this down to below commit:
'commit baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead")'

This commit changed STACK_FRAME_OVERHEAD (112) to
STACK_FRAME_MIN_SIZE (32), since 32 bytes is the minimum size
of an ELFv2 stack frame. With the latest kernel, when running on ELFv2,
STACK_FRAME_MIN_SIZE is used to allocate the stack frame.

During plpar_hcall_trace, the first call is made to HCALL_INST_PRECALL,
which saves the registers and allocates a new stack frame. In the
plpar_hcall_trace code, STK_PARAM is accessed in two places:
1. To save r4: std r4,STK_PARAM(R4)(r1)
2. To load it back: ld  r12,STK_PARAM(R4)(r1)

Because HCALL_INST_PRECALL allocates a new stack frame, every
stack parameter access after the precall needs an extra offset of
STACK_FRAME_MIN_SIZE. So the store instruction should be:
std r4,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1)

If the "std" is not updated with STACK_FRAME_MIN_SIZE, it
overwrites stack contents and causes corruption.
But instead of updating the "std", we can remove it, since
HCALL_INST_PRECALL already saves r4 to the correct location.

Similarly, the load instruction should be:
ld  r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1)

Fix the load instruction to correctly access the stack parameter
with +STACK_FRAME_MIN_SIZE and remove the store of r4 since the
precall saves it correctly.
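
The offset arithmetic described above can be checked with a small shell
sketch. Only the frame size of 32 comes from the text. The value 40 for
STK_PARAM(R4) and the starting stack pointer are assumptions chosen for
illustration, and the sketch assumes the precall moves r1 down by exactly
STACK_FRAME_MIN_SIZE.

```sh
#!/bin/sh
# Sketch of the stack offset arithmetic; all numbers are illustrative.
STACK_FRAME_MIN_SIZE=32   # ELFv2 minimum frame size, as stated above
STK_PARAM_R4=40           # hypothetical value of STK_PARAM(R4)
caller_r1=4096            # hypothetical stack pointer before the precall

# Address of the parameter slot, relative to the caller's r1.
slot=$((caller_r1 + STK_PARAM_R4))

# The precall allocates a new frame, so r1 moves down by the frame size.
new_r1=$((caller_r1 - STACK_FRAME_MIN_SIZE))

# Unadjusted access ends up STACK_FRAME_MIN_SIZE bytes below the slot.
wrong=$((new_r1 + STK_PARAM_R4))
# Adjusted access reaches the original slot again.
right=$((new_r1 + STACK_FRAME_MIN_SIZE + STK_PARAM_R4))

echo "slot=$slot wrong=$wrong right=$right"
```

With these numbers only the adjusted access matches the original slot,
which is why the "ld" needs the extra STACK_FRAME_MIN_SIZE.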

Cc: sta...@vger.kernel.org
Fixes: baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead")
Co-developed-by: Naveen N Rao 
Signed-off-by: Naveen N Rao 
Signed-off-by: Athira Rajeev 
---
 arch/powerpc/platforms/pseries/hvCall.S | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hvCall.S 
b/arch/powerpc/platforms/pseries/hvCall.S
index bae45b358a09..2addf2ea03f0 100644
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -184,7 +184,6 @@ _GLOBAL_TOC(plpar_hcall)
 plpar_hcall_trace:
HCALL_INST_PRECALL(R5)
 
-   std r4,STK_PARAM(R4)(r1)
mr  r0,r4
 
mr  r4,r5
@@ -196,7 +195,7 @@ plpar_hcall_trace:
 
HVSC
 
-   ld  r12,STK_PARAM(R4)(r1)
+   ld  r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1)
std r4,0(r12)
std r5,8(r12)
std r6,16(r12)
@

[PATCH 2/2] powerpc/platforms/pseries: Remove unused r0 in the hcall tracing code

2023-09-29 Thread Athira Rajeev
In the plpar_hcall trace code, currently we use r0
to store the ORed result of r4. But this value is not
used subsequently in the code. Hence remove this unused
save to r0 in plpar_hcall and plpar_hcall9

Suggested-by: Naveen N Rao 
Signed-off-by: Athira Rajeev 
---
 arch/powerpc/platforms/pseries/hvCall.S | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hvCall.S 
b/arch/powerpc/platforms/pseries/hvCall.S
index 2addf2ea03f0..2b0cac6fb61f 100644
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -184,8 +184,6 @@ _GLOBAL_TOC(plpar_hcall)
 plpar_hcall_trace:
HCALL_INST_PRECALL(R5)
 
-   mr  r0,r4
-
mr  r4,r5
mr  r5,r6
mr  r6,r7
@@ -295,8 +293,6 @@ _GLOBAL_TOC(plpar_hcall9)
 plpar_hcall9_trace:
HCALL_INST_PRECALL(R5)
 
-   mr  r0,r4
-
mr  r4,r5
mr  r5,r6
mr  r6,r7
-- 
2.39.3



[PATCH V2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-09-29 Thread Athira Rajeev
Add rule in new Makefile "tests/Makefile.tests" for running
shellcheck on shell test scripts. This automates below shellcheck
into the build.

$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck 
-S warning $F; done

A check for the shellcheck binary is added in Makefile.perf to avoid
build breakage when shellcheck is not installed. Makefile.perf gets a
new "SHELLCHECK_TEST" rule, which makes the shellcheck run a
dependency of the perf binary. The new "tests/Makefile.tests" runs
shellcheck on the shell scripts in tests/shell. Its make rule
"SHELLCHECK_RUN" ensures that subsequent make invocations rerun
shellcheck only on modified files. This way, if a newly added shell
script or a fix to an existing script breaks coding/formatting
style, it is caught during the perf build.
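
The same idea can be sketched in plain shell. This is not the Makefile
rule itself, only an illustration of "run shellcheck if it is present,
and only on scripts that changed". The helper names have_tool,
needs_check and check_script, and the stamp file argument, are
hypothetical.

```sh
#!/bin/sh
# Sketch only: helper names and the stamp file layout are hypothetical.

# Succeeds when the named tool is found in PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds when the script must be (re)checked: either no stamp file
# exists yet, or the script is newer than its stamp.
needs_check() {
    script=$1
    stamp=$2
    [ ! -e "$stamp" ] || [ -n "$(find "$script" -newer "$stamp")" ]
}

# Runs shellcheck on one script and refreshes its stamp on success.
check_script() {
    script=$1
    stamp=$2
    have_tool shellcheck || return 0            # no shellcheck: skip quietly
    needs_check "$script" "$stamp" || return 0  # unchanged: nothing to do
    shellcheck -S warning "$script" && touch "$stamp"
}
```

A caller would loop over the scripts under tests/shell and call
check_script with one stamp file per script, which plays the role of
the ".dep" files in the make rule.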

Example build failure with present scripts in tests/shell:

INSTALL libsubcmd_headers
INSTALL libperf_headers
INSTALL libapi_headers
INSTALL libsymbol_headers
INSTALL libbpf_headers
make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/record_sideband.dep] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/test_arm_coresight.dep] Error 1
make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/lock_contention.dep] Error 1
make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
make[1]: *** [Makefile.perf:238: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

After this, for testing, changed "tests/shell/record.sh" to
break shellcheck format. In the next make run, it is
also captured:

make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/record_sideband.dep] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/record.dep] Error 1
make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/test_arm_coresight.dep] Error 1
make[3]: *** 
[/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: 
output/tests/shell/lock_contention.dep] Error 1
make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
make[1]: *** [Makefile.perf:238: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

Signed-off-by: Athira Rajeev 
---
Changelog:
 v1 -> v2:
 Version1 had shellcheck in feature check which is not
 required since shellcheck is already a binary. Presence
 of binary can be checked using:
 $(shell command -v shellcheck)
 Addressed these changes as suggested by Namhyung in V2
 Feature test logic is removed in V2. Also added example
 for build breakage when shellcheck fails in commit message

 tools/perf/Makefile.perf| 14 +-
 tools/perf/tests/Makefile.tests | 24 
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/Makefile.tests

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 98604e396ac3..56a66ca253ab 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -667,7 +667,18 @@ $(PERF_IN): prepare FORCE
 $(PMU_EVENTS_IN): FORCE prepare
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events 
obj=pmu-events
 
-$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
+# Runs shellcheck on perf test shell scripts
+
+SHELLCHECK := $(shell command -v shellcheck)
+ifneq ($(SHELLCHECK),)
+SHELLCHECK_TEST: FORCE prepare
+   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests
+else
+SHELLCHECK_TEST:
+   @:
+endif
+
+$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST
$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \
$(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
 
@@ -1130,6 +1141,7 @@ bpf-skel-clean:
$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
 
 clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean 
$(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean 
tests-coresight-targets-clean
+   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) 
$(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
$(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' 
-delete -o -name '\.*.d' -delete
$(Q)$(RM) $(OUTPUT).config-detected
diff --git a/tools/perf/tests/Makefile.tests b/tools/perf/tests/Makefile.tests
new file mode 100644
index ..8011e99768a3
--- /dev/null
+++ b/tools/perf/

Re: [PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-28 Thread Athira Rajeev



> On 27-Sep-2023, at 8:25 PM, Athira Rajeev  wrote:
> 
> 
> 
>> On 27-Sep-2023, at 5:45 AM, Namhyung Kim  wrote:
>> 
>> On Thu, Sep 14, 2023 at 10:40 PM Athira Rajeev
>>  wrote:
>>> 
>>> The testcase "Object code reading" fails in some cases
>>> for "fs_something" sub test as below:
>>> 
>>>   Reading object code for memory address: 0xc00807f0142c
>>>   File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>>   On file address is: 0x1114cc
>>>   Objdump command is: objdump -z -d --start-address=0x11142c 
>>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>>   objdump read too few bytes: 128
>>>   test child finished with -1
>>> 
>>> This can also be reproduced when running perf record with
>>> workload that exercises fs_something() code. In the test
>>> setup, this is exercising xfs code since root is xfs.
>>> 
>>>   # perf record ./a.out
>>>   # perf report -v |grep "xfs.ko"
>>> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>>> 0xc00807de5efc B [k] xlog_cil_commit
>>> 0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>>> 0xc00807d5ae18 B [k] xfs_btree_key_offset
>>> 0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>>> 0xc00807e11fd4 B [k] 0x00112074
>>> 
>>> Here addr "0xc00807e11fd4" is not resolved. since this is a
>>> kernel module, its offset is from the DSO. Xfs module is loaded
>>> at 0xc00807d0
>>> 
>>>  # cat /proc/modules | grep xfs
>>>   xfs 2228224 3 - Live 0xc00807d0
>>> 
>>> And size is 0x22. So its loaded between  0xc00807d0
>>> and 0xc00807f2. From objdump, text section is:
>>>   text 0010f7bc    00a0 2**4
>>> 
>>> Hence perf captured ip maps to 0x112074 which is:
>>> ( ip - start of module ) + a0
>>> 
>>> This offset 0x112074 falls out .text section which is up to 0x10f7bc
>>> In this case for module, the address 0xc00807e11fd4 is pointing
>>> to stub instructions. This address range represents the module stubs
>>> which is allocated on module load and hence is not part of DSO offset.
>>> 
>>> To address this issue in "object code reading", skip the sample if
>>> address falls out of text section and is within the module end.
>>> Use the "text_end" member of "struct dso" to do this check.
>>> 
>>> To address this issue in "perf report", exploring an option of
>>> having stubs range as part of the /proc/kallsyms, so that perf
>>> report can resolve addresses in stubs range
>>> 
>>> However this patch uses text_end to skip the stub range for
>>> Object code reading testcase.
>>> 
>>> Reported-by: Disha Goel 
>>> Signed-off-by: Athira Rajeev 
>>> Tested-by: Disha Goel
>>> Reviewed-by: Adrian Hunter 
>>> ---
>>> Changelog:
>>> v3 -> v4:
>>> Fixed indent in V3
>>> 
>>> v2 -> v3:
>>> Used strtailcmp in comparison for module check and added Reviewed-by
>>> from Adrian, Tested-by from Disha.
>>> 
>>> v1 -> v2:
>>> Updated comment to add description on which arch has stub and
>>> reason for skipping as suggested by Adrian
>>> 
>>> tools/perf/tests/code-reading.c | 10 ++
>>> 1 file changed, 10 insertions(+)
>>> 
>>> diff --git a/tools/perf/tests/code-reading.c 
>>> b/tools/perf/tests/code-reading.c
>>> index ed3815163d1b..9e6e6c985840 100644
>>> --- a/tools/perf/tests/code-reading.c
>>> +++ b/tools/perf/tests/code-reading.c
>>> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
>>> cpumode,
>>>   if (addr + len > map__end(al.map))
>>>   len = map__end(al.map) - addr;
>>> 
>>> +   /*
>>> +* Some architectures (ex: powerpc) have stubs (trampolines) in 
>>> kernel
>>> +* modules to manage long jumps. Check if the ip offset falls in 
>>> stubs
>>> +* sections for kernel modules. And skip module address after text 
>>> end
>>> +*/
>>> +   if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) {
>> 
>> There's a is_kernel_module() that can check compressed modules
>> too but I think we need a simpler way to check it like dso->kernel.
>> 
>> Thanks,
>> Namhyung
> 
> Thanks for the comment Namhyung. I will add similar to dso->kernel, another 
> field check in next version of patchset
> 
> Athira

Hi Namhyung,

I have posted a V5 for this:
https://lore.kernel.org/linux-perf-users/20230928075213.84392-1-atraj...@linux.vnet.ibm.com/T/#t

Thanks
Athira
>> 
>> 
>>> +   pr_debug("skipping the module address %#"PRIx64" after text 
>>> end\n", al.addr);
>>> +   goto out;
>>> +   }
>>> +
>>>   /* Read the object code using perf */
>>>   ret_len = dso__data_read_offset(dso, 
>>> maps__machine(thread__maps(thread)),
>>>   al.addr, buf1, len);
>>> --
>>> 2.31.1




Re: [PATCH 2/2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-09-28 Thread Athira Rajeev



> On 27-Sep-2023, at 9:55 AM, Athira Rajeev  wrote:
> 
> 
> 
>> On 27-Sep-2023, at 5:25 AM, Namhyung Kim  wrote:
>> 
>> On Thu, Sep 14, 2023 at 10:18 AM Athira Rajeev
>>  wrote:
>>> 
>>> Add rule in new Makefile "tests/Makefile.tests" for running
>>> shellcheck on shell test scripts. This automates below shellcheck
>>> into the build.
>>> 
>>>   $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do 
>>> shellcheck -S warning $F; done
>> 
>> I think you can do it if $(shell command -v shellcheck) returns
>> non-empty string (the path to the shellcheck).  Then the feature
>> test logic can be gone.
> 
> Ok, I will try this.
>> 
>>> 
>>> CONFIG_SHELLCHECK check is added to avoid build breakage in
>>> the absence of shellcheck binary. Update Makefile.perf to contain
>>> new rule for "SHELLCHECK_TEST" which is for making shellcheck
>>> test as a dependency on perf binary. Added "tests/Makefile.tests"
>>> to run shellcheck on shellscripts in tests/shell. The make rule
>>> "SHLLCHECK_RUN" ensures that, every time during make, shellcheck
>>> will be run only on modified files during subsequent invocations.
>>> By this, if any newly added shell scripts or fixes in existing
>>> scripts breaks coding/formatting style, it will get captured
>>> during the perf build.
>> 
>> Can you show me the example output?
> 
> Sure, I will add it.
>> 
>>> 
>>> Signed-off-by: Athira Rajeev 
>>> ---
>>> tools/perf/Makefile.perf| 12 +++-
>>> tools/perf/tests/Makefile.tests | 24 
>>> 2 files changed, 35 insertions(+), 1 deletion(-)
>>> create mode 100644 tools/perf/tests/Makefile.tests
>>> 
>>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>>> index f6fdc2d5a92f..c27f54771e90 100644
>>> --- a/tools/perf/Makefile.perf
>>> +++ b/tools/perf/Makefile.perf
>>> @@ -667,7 +667,16 @@ $(PERF_IN): prepare FORCE
>>> $(PMU_EVENTS_IN): FORCE prepare
>>>   $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events 
>>> obj=pmu-events
>>> 
>>> -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
>>> +# Runs shellcheck on perf test shell scripts
>>> +ifeq ($(CONFIG_SHELLCHECK),y)
>>> +SHELLCHECK_TEST: FORCE prepare
>>> +   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests
>>> +else
>>> +SHELLCHECK_TEST:
>>> +   @:
>>> +endif
>>> +
>>> +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST
>>>   $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \
>>>   $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
>>> 
>>> @@ -1129,6 +1138,7 @@ bpf-skel-clean:
>>>   $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>>> 
>>> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean 
>>> $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean 
>>> bpf-skel-clean tests-coresight-targets-clean
>>> +   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
>>>   $(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) 
>>> $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>>   $(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' 
>>> -delete -o -name '\.*.d' -delete
>>>   $(Q)$(RM) $(OUTPUT).config-detected
>>> diff --git a/tools/perf/tests/Makefile.tests 
>>> b/tools/perf/tests/Makefile.tests
>>> new file mode 100644
>>> index ..e74575559e83
>>> --- /dev/null
>>> +++ b/tools/perf/tests/Makefile.tests
>>> @@ -0,0 +1,24 @@
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Athira Rajeev , 2023
>>> +-include $(OUTPUT).config-detected
>>> +
>>> +log_file = $(OUTPUT)shellcheck_test.log
>>> +PROGS = $(subst ./,,$(shell find tests/shell -perm -o=x -type f -name 
>>> '*.sh'))
>>> +DEPS = $(addprefix output/,$(addsuffix .dep,$(basename $(PROGS))))
>>> +DIRS = $(shell echo $(dir $(DEPS)) | xargs -n1 | sort -u | xargs)
>>> +
>>> +.PHONY: all
>>> +all: SHELLCHECK_RUN
>>> +   @:
>>> +
>>> +SHELLCHECK_RUN: $(DEPS) $(DIRS)
>>> +
>>> +output/%.dep: %.sh | $(DIRS)
>>> +   $(call rule_mkdir)
>>> +   $(Q)$(call frecho-cmd,test)@touch $@
>>> +   $(Q)$(call frecho-cmd,test)@shellcheck -S warning $(subst 
>>> output/,./,$(patsubst %.dep, %.sh, $@)) 1> ${log_file} && ([[ ! -s 
>>> ${log_file} ]])
>> 
>> This line is too long, please wrap it with some backslashes.
> Ok
> 
> I will address all the comments in next version


Hi Namhyung,

While working on V2 for the Makefile changes and testing, came across three 
issues with latest scripts in perf-tools-next.
I have addressed those in below patchset:

https://lore.kernel.org/linux-perf-users/20230929041133.95355-1-atraj...@linux.vnet.ibm.com/T/#m7b3dc8a96467058e1b392183190baed47ae0eb75
[PATCH 0/3] Fix for shellcheck issues with latest scripts in tests/shell

For the Makefile.perf changes, I will send V2 separately addressing review 
comments

Thanks
Athira

> 
> Thanks
> Athira
>> 
>> Thanks,
>> Namhyung
>> 
>> 
>>> +$(DIRS):
>>> +   @mkdir -p $@
>>> +
>>> +clean:
>>> +   @rm -rf $(log_file) output
>>> --
>>> 2.31.1




[PATCH 3/3] tools/perf/tests: Fix shellcheck warning in record_sideband.sh test

2023-09-28 Thread Athira Rajeev
Running shellcheck on record_sideband.sh throws below
warning:

In tests/shell/record_sideband.sh line 25:
  if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 
>/dev/null
^--^ SC2069: To redirect stdout+stderr, 2>&1 must be last (or use 
'{ cmd > file; } 2>&1' to clarify).

This shows shellcheck warning SC2069 where the redirection
order needs to be fixed. Use { cmd > file; } 2>&1 to fix the
redirection of perf record output
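
A small sketch of how the redirection order behaves, using a
hypothetical emit function in place of perf record. Redirections are
applied left to right, so the original form and the braces form both
send stderr to the caller's stdout and drop stdout. The braces form
only makes that intent explicit.

```sh
#!/bin/sh
# emit is a hypothetical stand-in: one word to stdout, one to stderr.
emit() {
    echo out
    echo err >&2
}

# Redirections apply left to right: stderr is pointed at the current
# stdout (the capture), then stdout alone goes to /dev/null.
a=$(emit 2>&1 >/dev/null)

# Same effect, written the way SC2069 suggests to make it explicit.
b=$({ emit >/dev/null; } 2>&1)

# Discard both streams: redirect stdout first, then stderr to it.
c=$(emit >/dev/null 2>&1)

echo "a=$a b=$b c=$c"
```

Here a and b both capture "err", while c stays empty.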

Fixes: 23b97c7ee963 ("perf test: Add test case for record sideband events")
Signed-off-by: Athira Rajeev 
---
 tools/perf/tests/shell/record_sideband.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/record_sideband.sh 
b/tools/perf/tests/shell/record_sideband.sh
index 5024a7ce0c51..7e036763a43c 100755
--- a/tools/perf/tests/shell/record_sideband.sh
+++ b/tools/perf/tests/shell/record_sideband.sh
@@ -22,7 +22,7 @@ trap trap_cleanup EXIT TERM INT
 
 can_cpu_wide()
 {
-if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 
>/dev/null
+if ! { perf record -o ${perfdata} -BN --no-bpf-event -C $1 true > 
/dev/null; } 2>&1
 then
 echo "record sideband test [Skipped cannot record cpu$1]"
 err=2
-- 
2.31.1



[PATCH 2/3] tools/perf/tests Ignore the shellcheck SC2046 warning in lock_contentio

2023-09-28 Thread Athira Rajeev
Running shellcheck on lock_contention.sh generates below
warning

In tests/shell/lock_contention.sh line 36:
   if [ `nproc` -lt 4 ]; then
  ^-^ SC2046: Quote this to prevent word splitting.

Here since nproc will generate a single word output
and there is no possibility of word splitting, this
warning can be ignored. Use exception for this with
"disable" option in shellcheck. This warning is observed
after commit:
"commit 29441ab3a30a ("perf test lock_contention.sh: Skip test
if not enough CPUs")"
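
To illustrate why single-word output is safe here, a minimal sketch
with two hypothetical functions standing in for commands like nproc.
Only the multi-word case breaks the test expression.

```sh
#!/bin/sh
# Hypothetical stand-ins for commands with one-word and two-word output.
one_word() { echo 4; }
two_words() { echo "4 5"; }

# Unquoted single-word output: the test sees exactly three arguments.
if [ $(one_word) -lt 8 ]; then single=ok; else single=fail; fi

# Unquoted multi-word output is split, so the test is malformed.
if [ $(two_words) -lt 8 ] 2>/dev/null; then split=ok; else split=fail; fi

echo "single=$single split=$split"
```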

Fixes: 29441ab3a30a ("perf test lock_contention.sh: Skip test if not enough 
CPUs")
Signed-off-by: Athira Rajeev 
---
 tools/perf/tests/shell/lock_contention.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/tests/shell/lock_contention.sh 
b/tools/perf/tests/shell/lock_contention.sh
index d5a191d3d090..c1ec5762215b 100755
--- a/tools/perf/tests/shell/lock_contention.sh
+++ b/tools/perf/tests/shell/lock_contention.sh
@@ -33,6 +33,7 @@ check() {
exit
fi
 
+   # shellcheck disable=SC2046
if [ `nproc` -lt 4 ]; then
echo "[Skip] Low number of CPUs (`nproc`), lock event cannot be 
triggered certainly"
err=2
-- 
2.31.1



[PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh

2023-09-28 Thread Athira Rajeev
Running shellcheck on tests/shell/test_arm_coresight.sh
throws below warnings:

In tests/shell/test_arm_coresight.sh line 15:
cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
-print -quit)
  ^--^ SC2061: Quote the parameter to -name so the shell won't 
interpret it.

In tests/shell/test_arm_coresight.sh line 20:
if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = 
"0xA13" ] ; then
 ^-- SC2166: Prefer [ p ] && [ q ] as [ p 
-a q ] is not well defined

This warning is observed after commit:
"commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")"

Fix these by quoting 'cpu*' for SC2061 and by joining two
separate tests with "&&" on line 20 for SC2166.
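
Both warnings can be reproduced outside the test with a short sketch.
The directory layout and file names below are made up for illustration.
The archpart value 2579 is 0xA13 in decimal.

```sh
#!/bin/sh
# Sketch with a throwaway directory; all file names are hypothetical.
tmp=$(mktemp -d)
mkdir "$tmp/devices"
touch "$tmp/devices/cpu0" "$tmp/devices/cpu1" "$tmp/cpu_local"

# Unquoted: the shell expands cpu* against the current directory first,
# so find is asked for the name "cpu_local" and matches nothing.
unquoted=$(cd "$tmp" && find devices -name cpu* | wc -l)

# Quoted: find receives the pattern itself and matches both entries.
quoted=$(cd "$tmp" && find devices -name 'cpu*' | wc -l)

# SC2166: two separate tests joined by && instead of "-a".
archhver=5
archpart=2579   # 0xA13
if [ "$archhver" -eq 5 ] && [ "$(printf '0x%X' "$archpart")" = "0xA13" ]; then
    kind=ete
else
    kind=etm
fi

rm -rf "$tmp"
echo "unquoted=$((unquoted)) quoted=$((quoted)) kind=$kind"
```

The unquoted pattern silently gives a different result whenever a
matching file exists in the current directory, which is the case SC2061
warns about.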

Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")
Signed-off-by: Athira Rajeev 
---
 tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/shell/test_arm_coresight.sh 
b/tools/perf/tests/shell/test_arm_coresight.sh
index fe78c4626e45..f2115dfa24a5 100755
--- a/tools/perf/tests/shell/test_arm_coresight.sh
+++ b/tools/perf/tests/shell/test_arm_coresight.sh
@@ -12,12 +12,12 @@
 glb_err=0
 
 cs_etm_dev_name() {
-   cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name cpu* 
-print -quit)
+   cs_etm_path=$(find  /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' 
-print -quit)
trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch)
archhver=$((($trcdevarch >> 12) & 0xf))
archpart=$(($trcdevarch & 0xfff))
 
-   if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; 
then
+   if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = "0xA13" ] 
; then
echo "ete"
else
echo "etm"
-- 
2.31.1



[PATCH 0/3] Fix for shellcheck issues with latest scripts in tests/shell

2023-09-28 Thread Athira Rajeev
shellcheck was run on perf tool shell scripts as a pre-requisite
to include a build option for shellcheck discussed here:
https://www.spinics.net/lists/linux-perf-users/msg25553.html

And fixes were added for the coding/formatting issues in
two patchsets:
https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/

Three additional issues were observed and fixes are part of:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next

With recent commits in perf, three more issues are observed.
shellcheck version: 0.6.0

With this patchset:

for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning 
$F; done
echo $?
0

The changes relate to recent commits (mentioned in each patch)
for the lock_contention, record_sideband and test_arm_coresight testcases.
The changes are made on top of:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next

Athira Rajeev (3):
  perf tests test_arm_coresight: Fix the shellcheck warning in latest
test_arm_coresight.sh
  tools/perf/tests Ignore the shellcheck SC2046 warning in
lock_contentio
  tools/perf/tests: Fix shellcheck warning in record_sideband.sh test

 tools/perf/tests/shell/lock_contention.sh| 1 +
 tools/perf/tests/shell/record_sideband.sh| 2 +-
 tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
 3 files changed, 4 insertions(+), 3 deletions(-)

-- 
2.31.1



[PATCH V5 3/3] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-28 Thread Athira Rajeev
The testcase "Object code reading" fails in some cases
for "fs_something" sub test as below:

Reading object code for memory address: 0xc00807f0142c
File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
On file address is: 0x1114cc
Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
objdump read too few bytes: 128
test child finished with -1

This can also be reproduced when running perf record with
workload that exercises fs_something() code. In the test
setup, this is exercising xfs code since root is xfs.

# perf record ./a.out
# perf report -v |grep "xfs.ko"
  0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807de5efc B [k] xlog_cil_commit
  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807d5ae18 B [k] xfs_btree_key_offset
  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807e11fd4 B [k] 0x00112074

Here the address "0xc00807e11fd4" is not resolved. Since this is a
kernel module, its offset is relative to the DSO. The xfs module is loaded
at 0xc00807d0

   # cat /proc/modules | grep xfs
xfs 2228224 3 - Live 0xc00807d0

And size is 0x22. So its loaded between  0xc00807d0
and 0xc00807f2. From objdump, text section is:
text 0010f7bc    00a0 2**4

Hence perf captured ip maps to 0x112074 which is:
( ip - start of module ) + a0

This offset 0x112074 falls outside the .text section, whose size is 0x10f7bc.
In this case the module address 0xc00807e11fd4 points
to stub instructions. This address range holds the module stubs,
which are allocated on module load and hence are not part of the DSO offset.

To address this issue in "object code reading", skip the sample if
address falls out of text section and is within the module end.
Use the "text_end" member of "struct dso" to do this check.
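
The check can be reproduced with shell arithmetic. The text size and
file offset come from the objdump line above. The value of rel
(ip - start of module) is an assumption derived from the reported
offset, that is 0x112074 - 0xa0.

```sh
#!/bin/sh
# Values from the objdump line above; rel is derived, not measured.
text_size=$((0x10f7bc))
text_file_off=$((0xa0))
rel=$((0x111fd4))          # assumed: ip - module start address

# Offset of the sample within the DSO file.
file_off=$((rel + text_file_off))
# End of .text as a file offset: sh_offset + sh_size.
text_end=$((text_file_off + text_size))

# Skip the sample when it lies beyond the end of the text section.
if [ "$file_off" -gt "$text_end" ]; then skip=yes; else skip=no; fi

printf 'file_off=0x%x text_end=0x%x skip=%s\n' "$file_off" "$text_end" "$skip"
```

With these values the sample offset is past text_end, so the sample is
skipped.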

To address this issue in "perf report", we are exploring the option of
exposing the stubs range in /proc/kallsyms, so that perf
report can resolve addresses in the stubs range.

However this patch uses text_end to skip the stub range for
Object code reading testcase.

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Tested-by: Disha Goel
Reviewed-by: Adrian Hunter 
Reviewed-by: Kajol Jain 
---
Changelog:
 v4 -> v5:
 Used dso->is_kmod to check if the dso is a kernel module

 v3 -> v4:
 Fixed indent in V3

 v2 -> v3:
 Used strtailcmp in comparison for module check and added Reviewed-by
 from Adrian, Tested-by from Disha.

 v1 -> v2:
 Updated comment to add description on which arch has stub and
 reason for skipping as suggested by Adrian

 tools/perf/tests/code-reading.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index ed3815163d1b..3af81012014e 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
cpumode,
if (addr + len > map__end(al.map))
len = map__end(al.map) - addr;
 
+   /*
+* Some architectures (ex: powerpc) have stubs (trampolines) in kernel
+* modules to manage long jumps. Check if the ip offset falls in stubs
+* sections for kernel modules. And skip module address after text end
+*/
+   if (dso->is_kmod && al.addr > dso->text_end) {
+   pr_debug("skipping the module address %#"PRIx64" after text 
end\n", al.addr);
+   goto out;
+   }
+
/* Read the object code using perf */
ret_len = dso__data_read_offset(dso, 
maps__machine(thread__maps(thread)),
al.addr, buf1, len);
-- 
2.31.1



[PATCH V5 2/3] tools/perf: Add "is_kmod" to struct dso to check if it is kernel module

2023-09-28 Thread Athira Rajeev
Update "struct dso" to include new member "is_kmod".
This new field will determine if the file is a kernel
module or not.

To resolve the address from a sample, perf looks at the
DSO maps. In case of address from a kernel module, there
were some address found to be not resolved. This was
observed while running perf test for "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start ) and end address ( perf map->end),
it was unresolved.

This was happening because in some cases for kernel
modules, address from sample points to stub instructions.
To identify if the DSO is a kernel module, the new field
"is_kmod" is added to "struct dso".

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
---
Changelog:
 v5:
 This patch adds is_kmod field to dso to detect if
 the dso is a kernel module

 tools/perf/util/dso.c | 2 ++
 tools/perf/util/dso.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index bdfead36b83a..1f629b6fb7cf 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -477,6 +477,7 @@ void dso__set_module_info(struct dso *dso, struct kmod_path 
*m,
dso->comp = m->comp;
}
 
+   dso->is_kmod = 1;
dso__set_short_name(dso, strdup(m->name), true);
 }
 
@@ -1338,6 +1339,7 @@ struct dso *dso__new_id(const char *name, struct dso_id 
*id)
dso->has_srcline = 1;
dso->a2l_fails = 1;
dso->kernel = DSO_SPACE__USER;
+   dso->is_kmod = 0;
dso->needs_swap = DSO_SWAP__UNSET;
dso->comp = COMP_ID__NONE;
RB_CLEAR_NODE(&dso->rb_node);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 70fe0fe69bef..3759de8c2267 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -162,6 +162,7 @@ struct dso {
char *symsrc_filename;
unsigned int a2l_fails;
enum dso_space_type kernel;
+   bool                is_kmod;
enum dso_swap_type  needs_swap;
enum dso_binary_type symtab_type;
enum dso_binary_type binary_type;
-- 
2.31.1



[PATCH V5 1/3] tools/perf: Add text_end to "struct dso" to save .text section size

2023-09-28 Thread Athira Rajeev
Update "struct dso" to include new member "text_end".
This new field will represent the offset for end of text
section for a dso. For elf, this value is derived as:
sh_size (size of section in bytes) + sh_offset (section file
offset) of the ELF section header for .text.

For bfd, this value is derived as:
1. For PE file,
section->size + ( section->vma - dso->text_offset)
2. Other cases:
section->filepos (file position) + section->size (size of
section)

To resolve the address from a sample, perf looks at the
DSO maps. In case of address from a kernel module, there
were some address found to be not resolved. This was
observed while running perf test for "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start ) and end address ( perf map->end),
it was unresolved.

Example:

Reading object code for memory address: 0xc00807f0142c
File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
On file address is: 0x1114cc
Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
objdump read too few bytes: 128
test child finished with -1

Here, module is loaded at:
# cat /proc/modules | grep xfs
xfs 2228224 3 - Live 0xc00807d0

From objdump for xfs module, text section is:
text 0010f7bc    00a0 2**4

Here the offset for 0xc00807f0142c, i.e. 0x112074, falls outside
the .text section, which extends up to 0x10f7bc.

In this case for module, the address 0xc00807e11fd4 is pointing
to stub instructions. This address range represents the module stubs
which is allocated on module load and hence is not part of DSO offset.

To identify such an address, which falls outside the text
section but within the module end, add the new field "text_end" to
"struct dso".

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Reviewed-by: Adrian Hunter 
Reviewed-by: Kajol Jain 
---
Changelog:
v2 -> v3:
 Added Reviewed-by from Adrian

 v1 -> v2:
 Added text_end for bfd also by updating dso__load_bfd_symbols
 as suggested by Adrian.

 tools/perf/util/dso.h| 1 +
 tools/perf/util/symbol-elf.c | 4 +++-
 tools/perf/util/symbol.c | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index b41c9782c754..70fe0fe69bef 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -181,6 +181,7 @@ struct dso {
u8   rel;
struct build_id  bid;
u64  text_offset;
+   u64  text_end;
const char   *short_name;
const char   *long_name;
u16  long_name_len;
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 95e99c332d7e..9e7eeaf616b8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, 
struct symsrc *syms_ss,
}
 
if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
-   ".text", NULL))
+   ".text", NULL)) {
dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
+   dso->text_end = tshdr.sh_offset + tshdr.sh_size;
+   }
 
if (runtime_ss->opdsec)
opddata = elf_rawdata(runtime_ss->opdsec, NULL);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3f36675b7c8f..f25e4e62cf25 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char 
*debugfile)
/* PE symbols can only have 4 bytes, so use .text high 
bits */
dso->text_offset = section->vma - (u32)section->vma;
dso->text_offset += (u32)bfd_asymbol_value(symbols[i]);
+   dso->text_end = (section->vma - dso->text_offset) + 
section->size;
} else {
dso->text_offset = section->vma - section->filepos;
+   dso->text_end = section->filepos + section->size;
}
}
 
-- 
2.31.1
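
For readers following the arithmetic in the commit message, here is a minimal
sketch of the ELF case. It is not the perf code itself: "struct text_shdr" and
both helper names are hypothetical, and the sample values assume the objdump
line above means a .text size of 0x10f7bc at file offset 0xa0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two ELF section header fields used here. */
struct text_shdr {
	uint64_t sh_offset;	/* file offset of .text */
	uint64_t sh_size;	/* size of .text in bytes */
};

/* File offset where .text ends, mirroring the ELF branch of the patch. */
static uint64_t text_end_from_shdr(const struct text_shdr *sh)
{
	return sh->sh_offset + sh->sh_size;
}

/* True when a DSO file offset lies beyond the end of .text. */
static bool offset_after_text(uint64_t file_off, uint64_t text_end)
{
	return file_off > text_end;
}
```

With the assumed values, text_end is 0xa0 + 0x10f7bc = 0x10f85c, so the
unresolved offset 0x112074 is classified as lying after the text section.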



[PATCH V3] perf test: Fix parse-events tests to skip parametrized events

2023-09-27 Thread Athira Rajeev
Testcase "Parsing of all PMU events from sysfs" parses events for
all PMUs, and not just cpu. On powerpc, the PowerVM
environment supports events from the hv_24x7 and hv_gpci PMUs, which
have a format like the examples below:

- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/

The value for "?" needs to be filled in depending on system
configuration. It is better to skip these parametrized events
in this test as it is done in:
'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
parametrized events")' which handled a similar instance with
"all PMU test".

Fix parse-events test to skip parametrized events since
it needs proper setup of the parameters.

Signed-off-by: Athira Rajeev 
Tested-by: Ian Rogers 
Tested-by: Sachin Sant 
Reviewed-by: Kajol Jain 
---
Changelog:
v2 -> v3:
 Addressed review comments from Namhyung by closing the
 file if getline fails.

v1 -> v2:
 Addressed review comments from Ian. Updated size of
 pmu event name variable and changed bool name which is
 used to skip the test.

 tools/perf/tests/parse-events.c | 39 +
 1 file changed, 39 insertions(+)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index d47f1f871164..2b66ffba3bb0 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -2514,9 +2514,14 @@ static int test__pmu_events(struct test_suite *test 
__maybe_unused, int subtest
while ((pmu = perf_pmus__scan(pmu)) != NULL) {
struct stat st;
char path[PATH_MAX];
+   char pmu_event[PATH_MAX];
+   char *buf = NULL;
+   FILE *file;
struct dirent *ent;
+   size_t len = 0;
DIR *dir;
int err;
+   int n;
 
snprintf(path, PATH_MAX, 
"%s/bus/event_source/devices/%s/events/",
sysfs__mountpoint(), pmu->name);
@@ -2538,11 +2543,45 @@ static int test__pmu_events(struct test_suite *test 
__maybe_unused, int subtest
struct evlist_test e = { .name = NULL, };
char name[2 * NAME_MAX + 1 + 12 + 3];
int test_ret;
+   bool is_event_parameterized = 0;
 
/* Names containing . are special and cannot be used 
directly */
if (strchr(ent->d_name, '.'))
continue;
 
+   /* exclude parametrized ones (name contains '?') */
+   n = snprintf(pmu_event, sizeof(pmu_event), "%s%s", 
path, ent->d_name);
+   if (n >= PATH_MAX) {
+   pr_err("pmu event name crossed PATH_MAX(%d) 
size\n", PATH_MAX);
+   continue;
+   }
+
+   file = fopen(pmu_event, "r");
+   if (!file) {
+   pr_debug("can't open pmu event file for 
'%s'\n", ent->d_name);
+   ret = combine_test_results(ret, TEST_FAIL);
+   continue;
+   }
+
+   if (getline(&buf, &len, file) < 0) {
+   pr_debug(" pmu event: %s is a null event\n", 
ent->d_name);
+   ret = combine_test_results(ret, TEST_FAIL);
+   fclose(file);
+   continue;
+   }
+
+   if (strchr(buf, '?'))
+   is_event_parameterized = 1;
+
+   free(buf);
+   buf = NULL;
+   fclose(file);
+
+   if (is_event_parameterized == 1) {
+   pr_debug("skipping parametrized PMU event: %s 
which contains ?\n", pmu_event);
+   continue;
+   }
+
snprintf(name, sizeof(name), "%s/event=%s/u", 
pmu->name, ent->d_name);
 
e.name  = name;
-- 
2.31.1



Re: [PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-27 Thread Athira Rajeev



> On 27-Sep-2023, at 5:45 AM, Namhyung Kim  wrote:
> 
> On Thu, Sep 14, 2023 at 10:40 PM Athira Rajeev
>  wrote:
>> 
>> The testcase "Object code reading" fails in some cases
>> for "fs_something" sub test as below:
>> 
>>Reading object code for memory address: 0xc00807f0142c
>>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>On file address is: 0x1114cc
>>Objdump command is: objdump -z -d --start-address=0x11142c 
>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>objdump read too few bytes: 128
>>test child finished with -1
>> 
>> This can also be reproduced when running perf record with
>> workload that exercises fs_something() code. In the test
>> setup, this is exercising xfs code since root is xfs.
>> 
>># perf record ./a.out
>># perf report -v |grep "xfs.ko"
>>  0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807de5efc B [k] xlog_cil_commit
>>  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807d5ae18 B [k] xfs_btree_key_offset
>>  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807e11fd4 B [k] 0x00112074
>> 
>> Here addr "0xc00807e11fd4" is not resolved. since this is a
>> kernel module, its offset is from the DSO. Xfs module is loaded
>> at 0xc00807d0
>> 
>>   # cat /proc/modules | grep xfs
>>xfs 2228224 3 - Live 0xc00807d0
>> 
>> And size is 0x22. So its loaded between  0xc00807d0
>> and 0xc00807f2. From objdump, text section is:
>>text 0010f7bc    00a0 2**4
>> 
>> Hence perf captured ip maps to 0x112074 which is:
>> ( ip - start of module ) + a0
>> 
>> This offset 0x112074 falls out .text section which is up to 0x10f7bc
>> In this case for module, the address 0xc00807e11fd4 is pointing
>> to stub instructions. This address range represents the module stubs
>> which is allocated on module load and hence is not part of DSO offset.
>> 
>> To address this issue in "object code reading", skip the sample if
>> address falls out of text section and is within the module end.
>> Use the "text_end" member of "struct dso" to do this check.
>> 
>> To address this issue in "perf report", exploring an option of
>> having stubs range as part of the /proc/kallsyms, so that perf
>> report can resolve addresses in stubs range
>> 
>> However this patch uses text_end to skip the stub range for
>> Object code reading testcase.
>> 
>> Reported-by: Disha Goel 
>> Signed-off-by: Athira Rajeev 
>> Tested-by: Disha Goel
>> Reviewed-by: Adrian Hunter 
>> ---
>> Changelog:
>> v3 -> v4:
>> Fixed indent in V3
>> 
>> v2 -> v3:
>> Used strtailcmp in comparison for module check and added Reviewed-by
>> from Adrian, Tested-by from Disha.
>> 
>> v1 -> v2:
>> Updated comment to add description on which arch has stub and
>> reason for skipping as suggested by Adrian
>> 
>> tools/perf/tests/code-reading.c | 10 ++
>> 1 file changed, 10 insertions(+)
>> 
>> diff --git a/tools/perf/tests/code-reading.c 
>> b/tools/perf/tests/code-reading.c
>> index ed3815163d1b..9e6e6c985840 100644
>> --- a/tools/perf/tests/code-reading.c
>> +++ b/tools/perf/tests/code-reading.c
>> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
>> cpumode,
>>if (addr + len > map__end(al.map))
>>len = map__end(al.map) - addr;
>> 
>> +   /*
>> +* Some architectures (ex: powerpc) have stubs (trampolines) in 
>> kernel
>> +* modules to manage long jumps. Check if the ip offset falls in 
>> stubs
>> +* sections for kernel modules. And skip module address after text 
>> end
>> +*/
>> +   if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) {
> 
> There's a is_kernel_module() that can check compressed modules
> too but I think we need a simpler way to check it like dso->kernel.
> 
> Thanks,
> Namhyung

Thanks for the comment, Namhyung. I will add another field, similar to 
dso->kernel, for this check in the next version of the patchset.

Athira
> 
> 
>> +   pr_debug("skipping the module address %#"PRIx64" after text 
>> end\n", al.addr);
>> +   goto out;
>> +   }
>> +
>>/* Read the object code using perf */
>>ret_len = dso__data_read_offset(dso, 
>> maps__machine(thread__maps(thread)),
>>al.addr, buf1, len);
>> --
>> 2.31.1
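
To make the condition under discussion concrete, here is a minimal
standalone sketch of the skip check. It is not the perf implementation:
"has_suffix" is a hypothetical stand-in for the suffix comparison done with
strtailcmp in the patch, and "skip_module_stub_addr" is an invented name.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when 's' ends with 'suffix'. */
static bool has_suffix(const char *s, const char *suffix)
{
	size_t slen = strlen(s), xlen = strlen(suffix);

	return slen >= xlen && strcmp(s + slen - xlen, suffix) == 0;
}

/*
 * True when the sample should be skipped: the DSO name looks like a
 * kernel module and the address lies after the end of its text section.
 */
static bool skip_module_stub_addr(const char *long_name, uint64_t addr,
				  uint64_t text_end)
{
	return has_suffix(long_name, ".ko") && addr > text_end;
}
```

A name-based test like this does not match a compressed module name such as
"xfs.ko.xz", which is the limitation raised in the review and the motivation
for a dedicated field on the dso.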




Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events

2023-09-26 Thread Athira Rajeev



> On 27-Sep-2023, at 4:07 AM, Namhyung Kim  wrote:
> 
> Hello,
> 
> On Mon, Sep 25, 2023 at 10:37 AM Arnaldo Carvalho de Melo
>  wrote:
>> 
>> 
>> 
>> On Wed, Sep 13, 2023, 7:40 AM Athira Rajeev  
>> wrote:
>>> 
>>> 
>>> 
>>>> On 08-Sep-2023, at 7:48 PM, Athira Rajeev  
>>>> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On 08-Sep-2023, at 11:04 AM, Sachin Sant  wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev  
>>>>>> wrote:
>>>>>> 
>>>>>> Testcase "Parsing of all PMU events from sysfs" parse events for
>>>>>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
>>>>>> environment supports events from hv_24x7 and hv_gpci PMU which
>>>>>> is of example format like below:
>>>>>> 
>>>>>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
>>>>>> - hv_gpci/event,partition_id=?/
>>>>>> 
>>>>>> The value for "?" needs to be filled in depending on system
>>>>>> configuration. It is better to skip these parametrized events
>>>>>> in this test as it is done in:
>>>>>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
>>>>>> parametrized events")' which handled a simialr instance with
>>>>>> "all PMU test".
>>>>>> 
>>>>>> Fix parse-events test to skip parametrized events since
>>>>>> it needs proper setup of the parameters.
>>>>>> 
>>>>>> Signed-off-by: Athira Rajeev 
>>>>>> ---
>>>>>> Changelog:
>>>>>> v1 -> v2:
>>>>>> Addressed review comments from Ian. Updated size of
>>>>>> pmu event name variable and changed bool name which is
>>>>>> used to skip the test.
>>>>>> 
>>>>> 
>>>>> The patch fixes the reported issue.
>>>>> 
>>>>> 6.2: Parsing of all PMU events from sysfs  : Ok
>>>>> 6.3: Parsing of given PMU events from sysfs: Ok
>>>>> 
>>>>> Tested-by: Sachin Sant 
>>>>> 
>>>>> - Sachin
>>>> 
>>>> Hi Sachin, Ian
>>>> 
>>>> Thanks for testing the patch
>>> 
>>> Hi Arnaldo
>>> 
>>> Can you please check and pull this if it looks good to go .
>> 
>> 
>> Namhyung, can you please take a look?
> 
> Yep sure.  I think it needs to close the file when getline() fails.
> 
> Athira, can you please send v3 with that?

Sure, I will post V3 with this change

Athira
> 
> Thanks,
> Namhyung



Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-26 Thread Athira Rajeev



> On 25-Sep-2023, at 1:34 PM, kajoljain  wrote:
> 
> 
> 
> On 9/7/23 22:45, Athira Rajeev wrote:
>> From: root 
>> 
>> shellcheck was run on perf tool shell scripts as a prerequisite
>> to include a build option for shellcheck discussed here:
>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>> 
>> And fixes were added for the coding/formatting issues in
>> two patchsets:
>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>> 
>> Three additional issues are observed with shellcheck "0.6" and
>> this patchset covers those. With this patchset,
>> 
>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
>> warning $F; done
>> # echo $?
>> 0
>> 
> 
> Patchset looks good to me.
> 
> Reviewed-by: Kajol Jain 
> 
> Thanks,
> Kajol Jain
> 

Hi Namhyung,

Can you please check this patchset as well?

Thanks
Athira

>> Athira Rajeev (3):
>>  tests/shell: Fix shellcheck SC1090 to handle the location of sourced
>>files
>>  tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
>>tetscase
>>  tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
>> 
>> tools/perf/tests/shell/coresight/asm_pure_loop.sh| 4 
>> tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4 
>> tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 
>> tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 
>> tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 4 
>> tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
>> tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
>> tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
>> tools/perf/tests/shell/record.sh | 1 +
>> tools/perf/tests/shell/stat+csv_output.sh| 1 +
>> tools/perf/tests/shell/stat+csv_summary.sh   | 4 ++--
>> tools/perf/tests/shell/stat+shadow_stat.sh   | 4 ++--
>> tools/perf/tests/shell/stat+std_output.sh| 1 +
>> tools/perf/tests/shell/test_intel_pt.sh  | 1 +
>> tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
>> 15 files changed, 35 insertions(+), 4 deletions(-)




Re: [PATCH 2/2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-09-26 Thread Athira Rajeev



> On 27-Sep-2023, at 5:25 AM, Namhyung Kim  wrote:
> 
> On Thu, Sep 14, 2023 at 10:18 AM Athira Rajeev
>  wrote:
>> 
>> Add rule in new Makefile "tests/Makefile.tests" for running
>> shellcheck on shell test scripts. This automates below shellcheck
>> into the build.
>> 
>>$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do 
>> shellcheck -S warning $F; done
> 
> I think you can do it if $(shell command -v shellcheck) returns
> non-empty string (the path to the shellcheck).  Then the feature
> test logic can be gone.

Ok, I will try this.
> 
>> 
>> CONFIG_SHELLCHECK check is added to avoid build breakage in
>> the absence of shellcheck binary. Update Makefile.perf to contain
>> new rule for "SHELLCHECK_TEST" which is for making shellcheck
>> test as a dependency on perf binary. Added "tests/Makefile.tests"
>> to run shellcheck on shellscripts in tests/shell. The make rule
>> "SHELLCHECK_RUN" ensures that, every time during make, shellcheck
>> will be run only on modified files during subsequent invocations.
>> By this, if any newly added shell scripts or fixes in existing
>> scripts breaks coding/formatting style, it will get captured
>> during the perf build.
> 
> Can you show me the example output?

Sure, I will add it.
> 
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>> tools/perf/Makefile.perf| 12 +++-
>> tools/perf/tests/Makefile.tests | 24 
>> 2 files changed, 35 insertions(+), 1 deletion(-)
>> create mode 100644 tools/perf/tests/Makefile.tests
>> 
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index f6fdc2d5a92f..c27f54771e90 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -667,7 +667,16 @@ $(PERF_IN): prepare FORCE
>> $(PMU_EVENTS_IN): FORCE prepare
>>$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events 
>> obj=pmu-events
>> 
>> -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
>> +# Runs shellcheck on perf test shell scripts
>> +ifeq ($(CONFIG_SHELLCHECK),y)
>> +SHELLCHECK_TEST: FORCE prepare
>> +   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests
>> +else
>> +SHELLCHECK_TEST:
>> +   @:
>> +endif
>> +
>> +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST
>>$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \
>>$(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
>> 
>> @@ -1129,6 +1138,7 @@ bpf-skel-clean:
>>$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>> 
>> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean 
>> $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean 
>> tests-coresight-targets-clean
>> +   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
>>$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) 
>> $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>    $(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' 
>> -delete -o -name '\.*.d' -delete
>>$(Q)$(RM) $(OUTPUT).config-detected
>> diff --git a/tools/perf/tests/Makefile.tests 
>> b/tools/perf/tests/Makefile.tests
>> new file mode 100644
>> index ..e74575559e83
>> --- /dev/null
>> +++ b/tools/perf/tests/Makefile.tests
>> @@ -0,0 +1,24 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Athira Rajeev , 2023
>> +-include $(OUTPUT).config-detected
>> +
>> +log_file = $(OUTPUT)shellcheck_test.log
>> +PROGS = $(subst ./,,$(shell find tests/shell -perm -o=x -type f -name 
>> '*.sh'))
>> +DEPS = $(addprefix output/,$(addsuffix .dep,$(basename $(PROGS))))
>> +DIRS = $(shell echo $(dir $(DEPS)) | xargs -n1 | sort -u | xargs)
>> +
>> +.PHONY: all
>> +all: SHELLCHECK_RUN
>> +   @:
>> +
>> +SHELLCHECK_RUN: $(DEPS) $(DIRS)
>> +
>> +output/%.dep: %.sh | $(DIRS)
>> +   $(call rule_mkdir)
>> +   $(Q)$(call frecho-cmd,test)@touch $@
>> +   $(Q)$(call frecho-cmd,test)@shellcheck -S warning $(subst 
>> output/,./,$(patsubst %.dep, %.sh, $@)) 1> ${log_file} && ([[ ! -s 
>> ${log_file} ]])
> 
> This line is too long, please wrap it with some backslashes.
Ok

I will address all the comments in next version

Thanks
Athira
> 
> Thanks,
> Namhyung
> 
> 
>> +$(DIRS):
>> +   @mkdir -p $@
>> +
>> +clean:
>> +   @rm -rf $(log_file) output
>> --
>> 2.31.1




Re: [PATCH 1/2] tools/perf: Add new CONFIG_SHELLCHECK for detecting shellcheck binary

2023-09-26 Thread Athira Rajeev



> On 27-Sep-2023, at 5:21 AM, Namhyung Kim  wrote:
> 
> Hello,
> 
> On Thu, Sep 14, 2023 at 10:18 AM Athira Rajeev
>  wrote:
>> 
>> shellcheck tool can detect coding/formatting issues on
>> shell scripts. In perf directory "tests/shell", there are lot
>> of shell test scripts and this tool can detect coding/formatting
>> issues on these scripts.
>> 
>> Example to use shellcheck for severity level for
>> errors and warnings, below command is used:
>> 
>>   # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
>> warning $F; done
>>   # echo $?
>> 0
>> 
>> This testing needs to be automated into the build so that it
>> can avoid regressions and also run the check for newly added
>> during build test itself. Add a new feature check to detect
>> presence of shellcheck. Add CONFIG_SHELLCHECK feature check in
>> the build to avoid not having shellcheck breaking the build.
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>> tools/build/Makefile.feature |  6 --
>> tools/build/feature/Makefile |  8 +++-
>> tools/perf/Makefile.config   | 10 ++
>> 3 files changed, 21 insertions(+), 3 deletions(-)
>> 
>> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
>> index 934e2777a2db..23f56b95babf 100644
>> --- a/tools/build/Makefile.feature
>> +++ b/tools/build/Makefile.feature
>> @@ -72,7 +72,8 @@ FEATURE_TESTS_BASIC :=  \
>> libzstd\
>> disassembler-four-args \
>> disassembler-init-styled   \
>> -file-handle
>> +file-handle\
>> +shellcheck
>> 
>> # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
>> # of all feature tests
>> @@ -138,7 +139,8 @@ FEATURE_DISPLAY ?=  \
>>  get_cpuid  \
>>  bpf   \
>>  libaio\
>> - libzstd
>> + libzstd   \
>> + shellcheck
>> 
>> #
>> # Declare group members of a feature to display the logical OR of the 
>> detection
>> diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
>> index 3184f387990a..44ba6d0c98d0 100644
>> --- a/tools/build/feature/Makefile
>> +++ b/tools/build/feature/Makefile
>> @@ -76,7 +76,8 @@ FILES=  \
>>  test-libzstd.bin  \
>>  test-clang-bpf-co-re.bin  \
>>  test-file-handle.bin  \
>> - test-libpfm4.bin
>> + test-libpfm4.bin  \
>> + test-shellcheck.bin
>> 
>> FILES := $(addprefix $(OUTPUT),$(FILES))
>> 
>> @@ -92,6 +93,8 @@ __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ 
>> $(patsubst %.bin,%.c,$(@F)) $(
>> __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst 
>> %.bin,%.cpp,$(@F)) $(LDFLAGS)
>>   BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1
>> 
>> +  BUILD_BINARY = sh -c $1 > $(@:.bin=.make.output) 2>&1
>> +
>> ###
>> 
>> $(OUTPUT)test-all.bin:
>> @@ -207,6 +210,9 @@ $(OUTPUT)test-libslang-include-subdir.bin:
>> $(OUTPUT)test-libtraceevent.bin:
>>$(BUILD) -ltraceevent
>> 
>> +$(OUTPUT)test-shellcheck.bin:
>> +   $(BUILD_BINARY) "shellcheck --version"
> 
> I don't think it'd generate the .bin file.
> 
> Anyway, it's a binary file already.  Can we check it with
> `command -v` and get rid of the feature test?
> 
> Thanks,
> Namhyung

Hi Namhyung,

Thanks for the review. Sure, I will check on this

Athira
> 
> 
>> +
>> $(OUTPUT)test-libtracefs.bin:
>> $(BUILD) $(shell $(PKG_CONFIG) --cflags libtraceevent 2>/dev/null) 
>> -ltracefs
>> 
>> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
>> index d66b52407e19..e71fe95ad865 100644
>> --- a/tools/perf/Makefile.config
>> +++ b/tools/perf/Makefile.config
>> @@ -779,6 +779,16 @@ ifndef NO_SLANG
>>   endif
>> endif
>> 
>> +ifneq ($(NO_SHELLCHECK),1)
>> +  $(call feature_check,shellcheck)
>> +  ifneq ($(feature-shellcheck), 1)
>> +msg := $(warning No shellcheck found. please install ShellCheck);
>> +  else
>> +$(call detected,CONFIG_SHELLCHECK)
>> +NO_SHELLCHECK := 0
>> +  endif
>> +endif
>> +
>> ifdef GTK2
>>   FLAGS_GTK2=$(CFLAGS) $(LDFLAGS) $(EXTLIBS) $(shell $(PKG_CONFIG) --libs 
>> --cflags gtk+-2.0 2>/dev/null)
>>   $(call feature_check,gtk2)
>> --
>> 2.31.1




Re: [PATCH 1/3] core/device: Add function to return child node using name at substring "@"

2023-09-24 Thread Athira Rajeev



> On 18-Sep-2023, at 7:42 PM, Reza Arbab  wrote:
> 
> On Thu, Sep 14, 2023 at 10:02:04PM +0530, Athira Rajeev wrote:
>> Add a function dt_find_by_name_before_addr() that returns the child node if
>> it matches till first occurrence at "@" of a given name, otherwise NULL.
>> This is helpful for cases with node name like: "name@addr". In
>> scenarios where nodes are added with "name@addr" format and if the
>> value of "addr" is not known, that node can't be matched with node
>> name or addr. Hence matching with substring as node name will return
>> the expected result. Patch adds dt_find_by_name_before_addr() function
>> and testcase for the same in core/test/run-device.c
> 
> Series applied to skiboot master with the fixup we discussed.
> 
> -- 
> Reza Arbab

Thanks Reza for picking up the patchset

Athira



Re: [PATCH 1/3] core/device: Add function to return child node using name at substring "@"

2023-09-15 Thread Athira Rajeev



> On 15-Sep-2023, at 8:00 PM, Reza Arbab  wrote:
> 
> Hi Athira,
> 
> On Thu, Sep 14, 2023 at 10:02:04PM +0530, Athira Rajeev wrote: 
>> +struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const 
>> char *name)
>> +{
>> + struct dt_node *child, *match;
>> + char *child_node = NULL;
>> +
>> + list_for_each(&root->children, child, list) {
>> + child_node = strdup(child->name);
>> + if (!child_node)
>> + goto err;
>> + child_node = strtok(child_node, "@");
>> + if (!strcmp(child_node, name)) {
>> + free(child_node);
>> + return child;
>> + }
>> +
>> + match = dt_find_by_name_before_addr(child, name);
>> + if (match)
>> + return match;
> 
> When the function returns on this line, child_node is not freed.
> 
>> + }
>> +
>> + free(child_node);
>> +err:
>> + return NULL;
>> +}
> 
> I took a stab at moving free(child_node) inside the loop, and ended up with 
> this:
> 
> struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char 
> *name)
> {
> struct dt_node *child, *match = NULL;
> char *child_name = NULL;
> 
> list_for_each(&root->children, child, list) {
> child_name = strdup(child->name);
> if (!child_name)
> return NULL;
> 
> child_name = strtok(child_name, "@");
> if (!strcmp(child_name, name))
> match = child;
> else
> match = dt_find_by_name_before_addr(child, name);
> 
> free(child_name);
> if (match)
> return match;
> }
> 
> return NULL;
> }
> 
> Does this seem okay to you? If you agree, no need to send another revision, I 
> can just fixup during commit. Let me know.

Hi Reza,

Sure, the change looks good. Thanks for the change and fixup.

Thanks
Athira
> 
>> diff --git a/core/test/run-device.c b/core/test/run-device.c
>> index 4a12382bb..fb7a7d2c0 100644
>> --- a/core/test/run-device.c
>> +++ b/core/test/run-device.c
>> @@ -466,6 +466,20 @@ int main(void)
>> new_prop_ph = dt_prop_get_u32(ut2, "something");
>> assert(!(new_prop_ph == ev1_ph));
>> dt_free(subtree);
>> +
>> + /* Test dt_find_by_name_before_addr */
>> + root = dt_new_root("");
>> + addr1 = dt_new_addr(root, "node", 0x1);
>> + addr2 = dt_new_addr(root, "node0_1", 0x2);
>> + assert(dt_find_by_name(root, "node@1") == addr1);
>> + assert(dt_find_by_name(root, "node0_1@2") == addr2);
>> + assert(dt_find_by_name_before_addr(root, "node") == addr1);
>> + assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
> 
> This line appears twice. As above, can fix during commit, so no need for a 
> new patch.
> 
>> + assert(dt_find_by_name_before_addr(root, "node0_1") == addr2);
>> + assert(dt_find_by_name_before_addr(root, "node0") == NULL);
>> + assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
>> + dt_free(root);
>> +
>> return 0;
>> }
>> 
> 
> -- 
> Reza Arbab
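
As a side note on the matching rule itself, the comparison "name equals the
part of the node name before '@'" can also be expressed without allocating a
copy. This is a different technique from the strdup/strtok approach in the
patch, shown only as a sketch; "name_matches_before_addr" is a hypothetical
helper and is not part of skiboot.

```c
#include <stdbool.h>
#include <string.h>

/*
 * True when the part of 'node_name' before the first '@' (or the whole
 * string if there is no '@') is exactly equal to 'name'.
 */
static bool name_matches_before_addr(const char *node_name, const char *name)
{
	size_t n = strcspn(node_name, "@");

	return strlen(name) == n && strncmp(node_name, name, n) == 0;
}
```

For the nodes used in the test case, "node@1" matches "node" and "node0_1@2"
matches "node0_1", while "node0" and "node0_" match neither, which is the
same outcome the asserts in run-device.c expect.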



Re: [PATCH V3 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-14 Thread Athira Rajeev



> On 15-Sep-2023, at 10:56 AM, Adrian Hunter  wrote:
> 
> On 15/09/23 08:24, Athira Rajeev wrote:
>> The testcase "Object code reading" fails in some cases
>> for "fs_something" sub test as below:
>> 
>>Reading object code for memory address: 0xc00807f0142c
>>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>On file address is: 0x1114cc
>>Objdump command is: objdump -z -d --start-address=0x11142c 
>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>objdump read too few bytes: 128
>>test child finished with -1
>> 
>> This can also be reproduced when running perf record with
>> workload that exercises fs_something() code. In the test
>> setup, this is exercising xfs code since root is xfs.
>> 
>># perf record ./a.out
>># perf report -v |grep "xfs.ko"
>>  0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807de5efc B [k] xlog_cil_commit
>>  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807d5ae18 B [k] xfs_btree_key_offset
>>  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807e11fd4 B [k] 0x00112074
>> 
>> Here addr "0xc00807e11fd4" is not resolved. since this is a
>> kernel module, its offset is from the DSO. Xfs module is loaded
>> at 0xc00807d0
>> 
>>   # cat /proc/modules | grep xfs
>>xfs 2228224 3 - Live 0xc00807d0
>> 
>> And size is 0x22. So its loaded between  0xc00807d0
>> and 0xc00807f2. From objdump, text section is:
>>text 0010f7bc    00a0 2**4
>> 
>> Hence perf captured ip maps to 0x112074 which is:
>> ( ip - start of module ) + a0
>> 
>> This offset 0x112074 falls out .text section which is up to 0x10f7bc
>> In this case for module, the address 0xc00807e11fd4 is pointing
>> to stub instructions. This address range represents the module stubs
>> which is allocated on module load and hence is not part of DSO offset.
>> 
>> To address this issue in "object code reading", skip the sample if
>> address falls out of text section and is within the module end.
>> Use the "text_end" member of "struct dso" to do this check.
>> 
>> To address this issue in "perf report", exploring an option of
>> having stubs range as part of the /proc/kallsyms, so that perf
>> report can resolve addresses in stubs range
>> 
>> However this patch uses text_end to skip the stub range for
>> Object code reading testcase.
>> 
>> Reported-by: Disha Goel 
>> Signed-off-by: Athira Rajeev 
>> Tested-by: Disha Goel
>> Reviewed-by: Adrian Hunter 
>> ---
>> Changelog:
>> v2 -> v3:
>> Used strtailcmp in comparison for module check and added Reviewed-by
>> from Adrian, Tested-by from Disha.
>> 
>> v1 -> v2:
>> Updated comment to add description on which arch has stub and
>> reason for skipping as suggested by Adrian
>> 
>> tools/perf/tests/code-reading.c | 10 ++
>> 1 file changed, 10 insertions(+)
>> 
>> diff --git a/tools/perf/tests/code-reading.c 
>> b/tools/perf/tests/code-reading.c
>> index ed3815163d1b..45334d26058e 100644
>> --- a/tools/perf/tests/code-reading.c
>> +++ b/tools/perf/tests/code-reading.c
>> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
>> cpumode,
>> if (addr + len > map__end(al.map))
>> len = map__end(al.map) - addr;
>> 
>> + /*
>> +  * Some architectures (ex: powerpc) have stubs (trampolines) in kernel
>> +  * modules to manage long jumps. Check if the ip offset falls in stubs
>> +  * sections for kernel modules. And skip module address after text end
>> +  */
>> + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) {
>> + pr_debug("skipping the module address %#"PRIx64" after text end\n", 
>> al.addr);
>> + goto out;
> 
> Double indent

My bad, addressed in V4

Athira
> 
>> + }
>> +
>> /* Read the object code using perf */
>> ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)),
>> al.addr, buf1, len);




[PATCH V4 1/2] tools/perf: Add text_end to "struct dso" to save .text section size

2023-09-14 Thread Athira Rajeev
Update "struct dso" to include new member "text_end".
This new field will represent the offset for end of text
section for a dso. For elf, this value is derived as:
sh_size (Size of section in bytes) + sh_offset (Section file
offset) of the elf header for text.

For bfd, this value is derived as:
1. For PE file,
section->size + ( section->vma - dso->text_offset)
2. Other cases:
section->filepos (file position) + section->size (size of
section)

To resolve the address from a sample, perf looks at the
DSO maps. For addresses from a kernel module, some
addresses were found to be unresolved. This was
observed while running perf test for "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start) and the end address (perf map->end),
it was unresolved.

Example:

Reading object code for memory address: 0xc00807f0142c
File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
On file address is: 0x1114cc
Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
objdump read too few bytes: 128
test child finished with -1

Here, module is loaded at:
# cat /proc/modules | grep xfs
xfs 2228224 3 - Live 0xc00807d0

From objdump for xfs module, text section is:
text 0010f7bc    00a0 2**4

Here the offset for 0xc00807f0142c ie  0x112074 falls out
.text section which is up to 0x10f7bc.

In this case for module, the address 0xc00807e11fd4 is pointing
to stub instructions. This address range represents the module stubs
which is allocated on module load and hence is not part of DSO offset.

To identify such  address, which falls out of text
section and within module end, added the new field "text_end" to
"struct dso".

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Reviewed-by: Adrian Hunter 
---
Changelog:
v2 -> v3:
 Added Reviewed-by from Adrian

 v1 -> v2:
 Added text_end for bfd also by updating dso__load_bfd_symbols
 as suggested by Adrian.

 tools/perf/util/dso.h| 1 +
 tools/perf/util/symbol-elf.c | 4 +++-
 tools/perf/util/symbol.c | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index b41c9782c754..70fe0fe69bef 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -181,6 +181,7 @@ struct dso {
u8   rel;
struct build_id  bid;
u64  text_offset;
+   u64  text_end;
const char   *short_name;
const char   *long_name;
u16  long_name_len;
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 95e99c332d7e..9e7eeaf616b8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, 
struct symsrc *syms_ss,
}
 
if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
-   ".text", NULL))
+   ".text", NULL)) {
dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
+   dso->text_end = tshdr.sh_offset + tshdr.sh_size;
+   }
 
if (runtime_ss->opdsec)
opddata = elf_rawdata(runtime_ss->opdsec, NULL);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3f36675b7c8f..f25e4e62cf25 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char 
*debugfile)
/* PE symbols can only have 4 bytes, so use .text high bits */
dso->text_offset = section->vma - (u32)section->vma;
dso->text_offset += (u32)bfd_asymbol_value(symbols[i]);
+   dso->text_end = (section->vma - dso->text_offset) + section->size;
} else {
dso->text_offset = section->vma - section->filepos;
+   dso->text_end = section->filepos + section->size;
}
}
 
-- 
2.31.1
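
As a rough illustration of the text_end derivations described in this patch, the sketch below recomputes the value outside of perf. The function names are hypothetical stand-ins and are not perf, libelf or bfd APIs. The ELF numbers used in the note (file offset 0xa0, size 0x10f7bc) are read from the objdump line quoted in the commit message as it appears in this archive; any PE numbers would be made up.

```c
#include <stdint.h>

/*
 * Hypothetical helpers mirroring the three derivations in the patch.
 * They only do the arithmetic; nothing here touches perf's struct dso.
 */

/* ELF: text_end = sh_offset + sh_size */
static uint64_t elf_text_end(uint64_t sh_offset, uint64_t sh_size)
{
	return sh_offset + sh_size;
}

/* bfd, non-PE: text_end = section->filepos + section->size */
static uint64_t bfd_text_end(uint64_t filepos, uint64_t size)
{
	return filepos + size;
}

/* bfd, PE: text_end = (section->vma - dso->text_offset) + section->size */
static uint64_t pe_text_end(uint64_t vma, uint64_t text_offset, uint64_t size)
{
	return (vma - text_offset) + size;
}
```

With the xfs numbers, elf_text_end(0xa0, 0x10f7bc) gives 0x10f85c, so the sampled offset 0x112074 lies past the end of .text.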



[PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-14 Thread Athira Rajeev
The testcase "Object code reading" fails in some cases
for the "fs_something" sub test, as below:

Reading object code for memory address: 0xc00807f0142c
File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
On file address is: 0x1114cc
Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
objdump read too few bytes: 128
test child finished with -1

This can also be reproduced when running perf record with
a workload that exercises fs_something() code. In the test
setup, this is exercising xfs code since root is xfs.

# perf record ./a.out
# perf report -v |grep "xfs.ko"
  0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807de5efc B [k] xlog_cil_commit
  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807d5ae18 B [k] xfs_btree_key_offset
  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807e11fd4 B [k] 0x00112074

Here addr "0xc00807e11fd4" is not resolved. Since this is a
kernel module, its offset is from the DSO. The xfs module is loaded
at 0xc00807d0

   # cat /proc/modules | grep xfs
xfs 2228224 3 - Live 0xc00807d0

The size is 0x22, so it is loaded between 0xc00807d0
and 0xc00807f2. From objdump, text section is:
text 0010f7bc    00a0 2**4

Hence perf captured ip maps to 0x112074 which is:
( ip - start of module ) + a0

This offset 0x112074 falls outside the .text section, which extends up to
0x10f7bc. In this case, for the module, the address 0xc00807e11fd4 points
to stub instructions. This address range represents the module stubs,
which are allocated on module load and hence are not part of the DSO offset.

To address this issue in "object code reading", skip the sample if the
address falls outside the text section and is within the module end.
Use the "text_end" member of "struct dso" to do this check.

To address this issue in "perf report", we are exploring the option of
having the stubs range as part of /proc/kallsyms, so that perf
report can resolve addresses in the stubs range.

This patch, however, only uses text_end to skip the stub range for the
"Object code reading" testcase.

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Tested-by: Disha Goel
Reviewed-by: Adrian Hunter 
---
Changelog:
 v3 -> v4:
 Fixed indent in V3

 v2 -> v3:
 Used strtailcmp in comparison for module check and added Reviewed-by
 from Adrian, Tested-by from Disha.

 v1 -> v2:
 Updated comment to add description on which arch has stub and
 reason for skipping as suggested by Adrian

 tools/perf/tests/code-reading.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index ed3815163d1b..9e6e6c985840 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
cpumode,
if (addr + len > map__end(al.map))
len = map__end(al.map) - addr;
 
+   /*
+* Some architectures (ex: powerpc) have stubs (trampolines) in kernel
+* modules to manage long jumps. Check if the ip offset falls in stubs
+* sections for kernel modules. And skip module address after text end
+*/
+   if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) {
+   pr_debug("skipping the module address %#"PRIx64" after text end\n", al.addr);
+   goto out;
+   }
+
/* Read the object code using perf */
ret_len = dso__data_read_offset(dso, 
maps__machine(thread__maps(thread)),
al.addr, buf1, len);
-- 
2.31.1
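
The arithmetic in the commit message, "( ip - start of module ) + a0", and the resulting skip decision can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, and the module-relative value 0x111fd4 mentioned in the note is derived here as 0x112074 - 0xa0 rather than taken from the (truncated) addresses in this archive.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a sampled address inside a loaded module to an
 * offset in the .ko file, following the formula from the commit message:
 *   (ip - start of module) + file offset of .text
 */
static uint64_t module_ip_to_file_offset(uint64_t ip, uint64_t module_start,
					 uint64_t text_file_offset)
{
	return (ip - module_start) + text_file_offset;
}

/*
 * Hypothetical helper: skip the sample only for kernel modules whose
 * offset lies past the end of .text (the stub range in the message).
 */
static bool should_skip_sample(bool is_kernel_module, uint64_t file_offset,
			       uint64_t text_end)
{
	return is_kernel_module && file_offset > text_end;
}
```

For a module-relative ip of 0x111fd4 and a .text file offset of 0xa0, the mapped offset is 0x112074. Taking text_end as 0xa0 + 0x10f7bc = 0x10f85c, that offset is beyond the text section, so the sample would be skipped for a module and kept for any other dso.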



[PATCH V3 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-14 Thread Athira Rajeev
The testcase "Object code reading" fails in some cases
for the "fs_something" sub test, as below:

Reading object code for memory address: 0xc00807f0142c
File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
On file address is: 0x1114cc
Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
objdump read too few bytes: 128
test child finished with -1

This can also be reproduced when running perf record with
a workload that exercises fs_something() code. In the test
setup, this is exercising xfs code since root is xfs.

# perf record ./a.out
# perf report -v |grep "xfs.ko"
  0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807de5efc B [k] xlog_cil_commit
  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807d5ae18 B [k] xfs_btree_key_offset
  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
0xc00807e11fd4 B [k] 0x00112074

Here addr "0xc00807e11fd4" is not resolved. Since this is a
kernel module, its offset is from the DSO. The xfs module is loaded
at 0xc00807d0

   # cat /proc/modules | grep xfs
xfs 2228224 3 - Live 0xc00807d0

The size is 0x22, so it is loaded between 0xc00807d0
and 0xc00807f2. From objdump, text section is:
text 0010f7bc    00a0 2**4

Hence perf captured ip maps to 0x112074 which is:
( ip - start of module ) + a0

This offset 0x112074 falls outside the .text section, which extends up to
0x10f7bc. In this case, for the module, the address 0xc00807e11fd4 points
to stub instructions. This address range represents the module stubs,
which are allocated on module load and hence are not part of the DSO offset.

To address this issue in "object code reading", skip the sample if the
address falls outside the text section and is within the module end.
Use the "text_end" member of "struct dso" to do this check.

To address this issue in "perf report", we are exploring the option of
having the stubs range as part of /proc/kallsyms, so that perf
report can resolve addresses in the stubs range.

This patch, however, only uses text_end to skip the stub range for the
"Object code reading" testcase.

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Tested-by: Disha Goel
Reviewed-by: Adrian Hunter 
---
Changelog:
 v2 -> v3:
 Used strtailcmp in comparison for module check and added Reviewed-by
 from Adrian, Tested-by from Disha.

 v1 -> v2:
 Updated comment to add description on which arch has stub and
 reason for skipping as suggested by Adrian

 tools/perf/tests/code-reading.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index ed3815163d1b..45334d26058e 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
cpumode,
if (addr + len > map__end(al.map))
len = map__end(al.map) - addr;
 
+   /*
+* Some architectures (ex: powerpc) have stubs (trampolines) in kernel
+* modules to manage long jumps. Check if the ip offset falls in stubs
+* sections for kernel modules. And skip module address after text end
+*/
+   if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) {
+   pr_debug("skipping the module address %#"PRIx64" after text end\n", al.addr);
+   goto out;
+   }
+
/* Read the object code using perf */
ret_len = dso__data_read_offset(dso, 
maps__machine(thread__maps(thread)),
al.addr, buf1, len);
-- 
2.31.1



[PATCH V3 1/2] tools/perf: Add text_end to "struct dso" to save .text section size

2023-09-14 Thread Athira Rajeev
Update "struct dso" to include a new member, "text_end".
This new field represents the offset of the end of the text
section for a dso. For elf, this value is derived as:
sh_size (size of section in bytes) + sh_offset (section file
offset) of the elf section header for .text.

For bfd, this value is derived as:
1. For PE file,
section->size + ( section->vma - dso->text_offset)
2. Other cases:
section->filepos (file position) + section->size (size of
section)

To resolve the address from a sample, perf looks at the
DSO maps. For addresses from a kernel module, some
addresses were found to be unresolved. This was
observed while running the perf test "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start) and its end address (perf map->end),
it was unresolved.

Example:

Reading object code for memory address: 0xc00807f0142c
File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
On file address is: 0x1114cc
Objdump command is: objdump -z -d --start-address=0x11142c 
--stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
objdump read too few bytes: 128
test child finished with -1

Here, module is loaded at:
# cat /proc/modules | grep xfs
xfs 2228224 3 - Live 0xc00807d0

From objdump for xfs module, text section is:
text 0010f7bc    00a0 2**4

Here the offset for 0xc00807f0142c, i.e. 0x112074, falls outside the
.text section, which extends up to 0x10f7bc.

In this case, for the module, the address 0xc00807e11fd4 points
to stub instructions. This address range represents the module stubs,
which are allocated on module load and hence are not part of the DSO offset.

To identify such an address, which falls outside the text
section but within the module end, add the new field "text_end" to
"struct dso".

Reported-by: Disha Goel 
Signed-off-by: Athira Rajeev 
Reviewed-by: Adrian Hunter 
---
Changelog:
v2 -> v3:
 Added Reviewed-by from Adrian

 v1 -> v2:
 Added text_end for bfd also by updating dso__load_bfd_symbols
 as suggested by Adrian.

 tools/perf/util/dso.h| 1 +
 tools/perf/util/symbol-elf.c | 4 +++-
 tools/perf/util/symbol.c | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index b41c9782c754..70fe0fe69bef 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -181,6 +181,7 @@ struct dso {
u8   rel;
struct build_id  bid;
u64  text_offset;
+   u64  text_end;
const char   *short_name;
const char   *long_name;
u16  long_name_len;
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 95e99c332d7e..9e7eeaf616b8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, 
struct symsrc *syms_ss,
}
 
if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
-   ".text", NULL))
+   ".text", NULL)) {
dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
+   dso->text_end = tshdr.sh_offset + tshdr.sh_size;
+   }
 
if (runtime_ss->opdsec)
opddata = elf_rawdata(runtime_ss->opdsec, NULL);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3f36675b7c8f..f25e4e62cf25 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char 
*debugfile)
/* PE symbols can only have 4 bytes, so use .text high bits */
dso->text_offset = section->vma - (u32)section->vma;
dso->text_offset += (u32)bfd_asymbol_value(symbols[i]);
+   dso->text_end = (section->vma - dso->text_offset) + section->size;
} else {
dso->text_offset = section->vma - section->filepos;
+   dso->text_end = section->filepos + section->size;
}
}
 
-- 
2.31.1



Re: [V2 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-14 Thread Athira Rajeev



> On 14-Sep-2023, at 11:54 PM, Adrian Hunter  wrote:
> 
> On 7/09/23 19:45, Athira Rajeev wrote:
>> The testcase "Object code reading" fails in somecases
>> for "fs_something" sub test as below:
>> 
>>Reading object code for memory address: 0xc00807f0142c
>>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>On file address is: 0x1114cc
>>Objdump command is: objdump -z -d --start-address=0x11142c 
>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>objdump read too few bytes: 128
>>test child finished with -1
>> 
>> This can alo be reproduced when running perf record with
>> workload that exercises fs_something() code. In the test
>> setup, this is exercising xfs code since root is xfs.
>> 
>># perf record ./a.out
>># perf report -v |grep "xfs.ko"
>>  0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807de5efc B [k] xlog_cil_commit
>>  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807d5ae18 B [k] xfs_btree_key_offset
>>  0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
>> 0xc00807e11fd4 B [k] 0x00112074
>> 
>> Here addr "0xc00807e11fd4" is not resolved. since this is a
>> kernel module, its offset is from the DSO. Xfs module is loaded
>> at 0xc00807d0
>> 
>>   # cat /proc/modules | grep xfs
>>xfs 2228224 3 - Live 0xc00807d0
>> 
>> And size is 0x22. So its loaded between  0xc00807d0
>> and 0xc00807f2. From objdump, text section is:
>>text 0010f7bc    00a0 2**4
>> 
>> Hence perf captured ip maps to 0x112074 which is:
>> ( ip - start of module ) + a0
>> 
>> This offset 0x112074 falls out .text section which is up to 0x10f7bc
>> In this case for module, the address 0xc00807e11fd4 is pointing
>> to stub instructions. This address range represents the module stubs
>> which is allocated on module load and hence is not part of DSO offset.
>> 
>> To address this issue in "object code reading", skip the sample if
>> address falls out of text section and is within the module end.
>> Use the "text_end" member of "struct dso" to do this check.
>> 
>> To address this issue in "perf report", exploring an option of
>> having stubs range as part of the /proc/kallsyms, so that perf
>> report can resolve addresses in stubs range
>> 
>> However this patch uses text_end to skip the stub range for
>> Object code reading testcase.
>> 
>> Reported-by: Disha Goel 
>> Signed-off-by: Athira Rajeev 
>> ---
>> Changelog:
>> v1 -> v2:
>> Updated comment to add description on which arch has stub and
>> reason for skipping as suggested by Adrian
>> 
>> tools/perf/tests/code-reading.c | 12 
>> 1 file changed, 12 insertions(+)
>> 
>> diff --git a/tools/perf/tests/code-reading.c 
>> b/tools/perf/tests/code-reading.c
>> index ed3815163d1b..3cf6c2d42416 100644
>> --- a/tools/perf/tests/code-reading.c
>> +++ b/tools/perf/tests/code-reading.c
>> @@ -269,6 +269,18 @@ static int read_object_code(u64 addr, size_t len, u8 
>> cpumode,
>> if (addr + len > map__end(al.map))
>> len = map__end(al.map) - addr;
>> 
>> + /*
>> +  * Some architectures (ex: powerpc) have stubs (trampolines) in kernel
>> +  * modules to manage long jumps. Check if the ip offset falls in stubs
>> +  * sections for kernel modules. And skip module address after text end
>> +  */
>> + if (strstr(dso->long_name, ".ko")) {
> 
> Sorry for slow reply
> 
> !strtailcmp() is slightly better here
> 
>> + if (al.addr > dso->text_end) {
> 
> We normally avoid nesting if-statements e.g.
> 
> if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end)
> 
> Make those changes and you can add:
> 
> Reviewed-by: Adrian Hunter 

Sure, will post a V3 with this change

Athira
> 
> 
>> + pr_debug("skipping the module address %#"PRIx64" after text end\n", 
>> al.addr);
>> + goto out;
>> + }
>> + }
>> +
>> /* Read the object code using perf */
>> ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)),
>> al.addr, buf1, len);
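
To illustrate why a tail comparison was preferred over strstr() in the review above, here is a minimal standalone sketch. Both helper names are hypothetical and this is not perf's strtailcmp(); the patch negates that function's result, so it is assumed to return 0 when the tails match. The path "/tmp/xfs.ko.debug" is a made-up example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true only when 'name' ends with 'suffix'. */
static bool ends_with(const char *name, const char *suffix)
{
	size_t nlen = strlen(name);
	size_t slen = strlen(suffix);

	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/* Substring check in the style of the V2 code, shown for comparison. */
static bool contains(const char *name, const char *needle)
{
	return strstr(name, needle) != NULL;
}
```

A substring match accepts ".ko" anywhere in the path, so a name such as "/tmp/xfs.ko.debug" passes contains() but not ends_with(). Only the tail comparison restricts the check to names that actually end in ".ko".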




Re: [V2 1/2] tools/perf: Add text_end to "struct dso" to save .text section size

2023-09-14 Thread Athira Rajeev



> On 14-Sep-2023, at 11:49 PM, Adrian Hunter  wrote:
> 
> On 7/09/23 19:45, Athira Rajeev wrote:
>> Update "struct dso" to include new member "text_end".
>> This new field will represent the offset for end of text
>> section for a dso. For elf, this value is derived as:
>> sh_size (Size of section in byes) + sh_offset (Section file
>> offst) of the elf header for text.
>> 
>> For bfd, this value is derived as:
>> 1. For PE file,
>> section->size + ( section->vma - dso->text_offset)
>> 2. Other cases:
>> section->filepos (file position) + section->size (size of
>> section)
>> 
>> To resolve the address from a sample, perf looks at the
>> DSO maps. In case of address from a kernel module, there
>> were some address found to be not resolved. This was
>> observed while running perf test for "Object code reading".
>> Though the ip falls beteen the start address of the loaded
>> module (perf map->start ) and end address ( perf map->end),
>> it was unresolved.
>> 
>> Example:
>> 
>>Reading object code for memory address: 0xc00807f0142c
>>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>On file address is: 0x1114cc
>>Objdump command is: objdump -z -d --start-address=0x11142c 
>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>>objdump read too few bytes: 128
>>test child finished with -1
>> 
>> Here, module is loaded at:
>># cat /proc/modules | grep xfs
>>xfs 2228224 3 - Live 0xc00807d0
>> 
>> From objdump for xfs module, text section is:
>>text 0010f7bc    00a0 2**4
>> 
>> Here the offset for 0xc00807f0142c ie  0x112074 falls out
>> .text section which is up to 0x10f7bc.
>> 
>> In this case for module, the address 0xc00807e11fd4 is pointing
>> to stub instructions. This address range represents the module stubs
>> which is allocated on module load and hence is not part of DSO offset.
>> 
>> To identify such  address, which falls out of text
>> section and within module end, added the new field "text_end" to
>> "struct dso".
>> 
>> Reported-by: Disha Goel 
>> Signed-off-by: Athira Rajeev 
> 
> Reviewed-by: Adrian Hunter 

Hi Adrian

Thanks for the review

> 
>> ---
>> Changelog:
>> v1 -> v2:
>> Added text_end for bfd also by updating dso__load_bfd_symbols
>> as suggested by Adrian.
>> 
>> tools/perf/util/dso.h| 1 +
>> tools/perf/util/symbol-elf.c | 4 +++-
>> tools/perf/util/symbol.c | 2 ++
>> 3 files changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index b41c9782c754..70fe0fe69bef 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -181,6 +181,7 @@ struct dso {
>> u8  rel;
>> struct build_id  bid;
>> u64  text_offset;
>> + u64  text_end;
>> const char  *short_name;
>> const char  *long_name;
>> u16  long_name_len;
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 95e99c332d7e..9e7eeaf616b8 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map 
>> *map, struct symsrc *syms_ss,
>> }
>> 
>> if (elf_section_by_name(runtime_ss->elf, _ss->ehdr, ,
>> - ".text", NULL))
>> + ".text", NULL)) {
>> dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
>> + dso->text_end = tshdr.sh_offset + tshdr.sh_size;
>> + }
>> 
>> if (runtime_ss->opdsec)
>> opddata = elf_rawdata(runtime_ss->opdsec, NULL);
>> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
>> index 3f36675b7c8f..f25e4e62cf25 100644
>> --- a/tools/perf/util/symbol.c
>> +++ b/tools/perf/util/symbol.c
>> @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char 
>> *debugfile)
>> /* PE symbols can only have 4 bytes, so use .text high bits */
>> dso->text_offset = section->vma - (u32)section->vma;
>> dso->text_offset += (u32)bfd_asymbol_value(symbols[i]);
>> + dso->text_end = (section->vma - dso->text_offset) + section->size;
>> } else {
>> dso->text_offset = section->vma - section->filepos;
>> + dso->text_end = section->filepos + section->size;
>> }
>> }




Re: [V2 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-14 Thread Athira Rajeev



> On 14-Sep-2023, at 5:47 PM, Disha Goel  wrote:
> 
> On 07/09/23 10:15 pm, Athira Rajeev wrote:
>> The testcase "Object code reading" fails in somecases
>> for "fs_something" sub test as below:
>> 
>> Reading object code for memory address: 0xc00807f0142c
>> File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>> On file address is: 0x1114cc
>> Objdump command is: objdump -z -d --start-address=0x11142c 
>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
>> objdump read too few bytes: 128
>> test child finished with -1
>> 
>> This can alo be reproduced when running perf record with
>> workload that exercises fs_something() code. In the test
>> setup, this is exercising xfs code since root is xfs.
>> 
>> # perf record ./a.out
>> # perf report -v |grep "xfs.ko"
>> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807de5efc 
>> B [k] xlog_cil_commit
>> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807d5ae18 
>> B [k] xfs_btree_key_offset
>> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807e11fd4 
>> B [k] 0x00112074
>> 
>> Here addr "0xc00807e11fd4" is not resolved. since this is a
>> kernel module, its offset is from the DSO. Xfs module is loaded
>> at 0xc00807d0
>> 
>> # cat /proc/modules | grep xfs
>> xfs 2228224 3 - Live 0xc00807d0
>> 
>> And size is 0x22. So its loaded between 0xc00807d0
>> and 0xc00807f2. From objdump, text section is:
>> text 0010f7bc   00a0 2**4
>> 
>> Hence perf captured ip maps to 0x112074 which is:
>> ( ip - start of module ) + a0
>> 
>> This offset 0x112074 falls out .text section which is up to 0x10f7bc
>> In this case for module, the address 0xc00807e11fd4 is pointing
>> to stub instructions. This address range represents the module stubs
>> which is allocated on module load and hence is not part of DSO offset.
>> 
>> To address this issue in "object code reading", skip the sample if
>> address falls out of text section and is within the module end.
>> Use the "text_end" member of "struct dso" to do this check.
>> 
>> To address this issue in "perf report", exploring an option of
>> having stubs range as part of the /proc/kallsyms, so that perf
>> report can resolve addresses in stubs range
>> 
>> However this patch uses text_end to skip the stub range for
>> Object code reading testcase.
>> 
>> Reported-by: Disha Goel 
>> Signed-off-by: Athira Rajeev 
>> ---
>> Changelog:
>> v1 -> v2:
>> Updated comment to add description on which arch has stub and
>> reason for skipping as suggested by Adrian

Thanks for testing Disha.

Hi Adrian,

Can you please review and share feedback on this version.

Thanks
Athira

> With this patch applied perf Object code reading test works correctly.
> 
> 26: Object code reading : Ok
> 
> Tested-by: Disha Goel 
> 
>> tools/perf/tests/code-reading.c | 12 
>> 1 file changed, 12 insertions(+)
>> 
>> diff --git a/tools/perf/tests/code-reading.c 
>> b/tools/perf/tests/code-reading.c
>> index ed3815163d1b..3cf6c2d42416 100644
>> --- a/tools/perf/tests/code-reading.c
>> +++ b/tools/perf/tests/code-reading.c
>> @@ -269,6 +269,18 @@ static int read_object_code(u64 addr, size_t len, u8 
>> cpumode,
>> if (addr + len > map__end(al.map))
>> len = map__end(al.map) - addr;
>> 
>> + /*
>> + * Some architectures (ex: powerpc) have stubs (trampolines) in kernel
>> + * modules to manage long jumps. Check if the ip offset falls in stubs
>> + * sections for kernel modules. And skip module address after text end
>> + */
>> + if (strstr(dso->long_name, ".ko")) {
>> + if (al.addr > dso->text_end) {
>> + pr_debug("skipping the module address %#"PRIx64" after text end\n", 
>> al.addr);
>> + goto out;
>> + }
>> + }
>> +
>> /* Read the object code using perf */
>> ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)),
>> al.addr, buf1, len);
>> 



Re: [PATCH V3] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-14 Thread Athira Rajeev



> On 13-Sep-2023, at 1:06 AM, Arnaldo Carvalho de Melo  wrote:
> 
> Em Tue, Sep 12, 2023 at 07:00:00AM -0700, Ian Rogers escreveu:
>> On Mon, Sep 11, 2023 at 11:38 PM Athira Rajeev
>>  wrote:
>>> 
>>> Makefile.perf uses "CONFIG_*" checks in the code. Example the config
>>> for libtraceevent is used to set PYTHON_EXT_SRCS
>>> 
>>>ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>>  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>>>else
>>>  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
>>> util/python-ext-sources)
>>>endif
>>> 
>>> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
>>> set using the settings in Makefile.config. Include the file
>>> ".config-detected" so that make will use the system detected
>>> configuration in the CONFIG checks. This will fix issues that
>>> could arise when other "CONFIG_*" checks are added to Makefile.perf
>>> in future as well.
>>> 
>>> Signed-off-by: Athira Rajeev 
>> 
>> Reviewed-by: Ian Rogers 
> 
> Thanks, applied.
> 
> - Arnaldo
> 

Thanks Ian for the review and thanks Arnaldo for picking this fix

Athira
> 
>> Thanks,
>> Ian
>> 
>>> ---
>>> Changelog:
>>> v2 -> v3:
>>> Added -include since in some cases make clean or make
>>> will fail when config is not included and if config-detected
>>> file is not present.
>>> 
>>> v1 -> v2:
>>> Added $(OUTPUT) prefix to config-detected as pointed
>>> out by Ian
>>> 
>>> tools/perf/Makefile.perf | 3 +++
>>> 1 file changed, 3 insertions(+)
>>> 
>>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>>> index 37af6df7b978..f6fdc2d5a92f 100644
>>> --- a/tools/perf/Makefile.perf
>>> +++ b/tools/perf/Makefile.perf
>>> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>>> 
>>> python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
>>> $(OUTPUT)python/perf*.so
>>> 
>>> +# Use the detected configuration
>>> +-include $(OUTPUT).config-detected
>>> +
>>> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>>> else
>>> --
>>> 2.31.1
>>> 
> 
> -- 
> 
> - Arnaldo




[PATCH 2/2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf

2023-09-14 Thread Athira Rajeev
Add a rule in a new Makefile, "tests/Makefile.tests", for running
shellcheck on shell test scripts. This automates the shellcheck
invocation below as part of the build.

$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done

The CONFIG_SHELLCHECK check is added to avoid build breakage in
the absence of the shellcheck binary. Update Makefile.perf to contain
a new rule, "SHELLCHECK_TEST", which makes the shellcheck
test a dependency of the perf binary. Add "tests/Makefile.tests"
to run shellcheck on the shell scripts in tests/shell. The make rule
"SHELLCHECK_RUN" ensures that, on subsequent invocations of make,
shellcheck is run only on modified files.
This way, if any newly added shell script or a fix to an existing
script breaks the coding/formatting style, it gets caught
during the perf build.

Signed-off-by: Athira Rajeev 
---
 tools/perf/Makefile.perf| 12 +++-
 tools/perf/tests/Makefile.tests | 24 
 2 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/Makefile.tests

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index f6fdc2d5a92f..c27f54771e90 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -667,7 +667,16 @@ $(PERF_IN): prepare FORCE
 $(PMU_EVENTS_IN): FORCE prepare
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events 
obj=pmu-events
 
-$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
+# Runs shellcheck on perf test shell scripts
+ifeq ($(CONFIG_SHELLCHECK),y)
+SHELLCHECK_TEST: FORCE prepare
+   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests
+else
+SHELLCHECK_TEST:
+   @:
+endif
+
+$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST
$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \
$(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
 
@@ -1129,6 +1138,7 @@ bpf-skel-clean:
$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
 
 clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean 
$(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean 
tests-coresight-targets-clean
+   $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
$(call QUIET_CLEAN, core-objs)  $(RM) $(LIBPERF_A) 
$(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
$(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' 
-delete -o -name '\.*.d' -delete
$(Q)$(RM) $(OUTPUT).config-detected
diff --git a/tools/perf/tests/Makefile.tests b/tools/perf/tests/Makefile.tests
new file mode 100644
index ..e74575559e83
--- /dev/null
+++ b/tools/perf/tests/Makefile.tests
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0
+# Athira Rajeev , 2023
+-include $(OUTPUT).config-detected
+
+log_file = $(OUTPUT)shellcheck_test.log
+PROGS = $(subst ./,,$(shell find tests/shell -perm -o=x -type f -name '*.sh'))
+DEPS = $(addprefix output/,$(addsuffix .dep,$(basename $(PROGS))))
+DIRS = $(shell echo $(dir $(DEPS)) | xargs -n1 | sort -u | xargs)
+
+.PHONY: all
+all: SHELLCHECK_RUN
+   @:
+
+SHELLCHECK_RUN: $(DEPS) $(DIRS)
+
+output/%.dep: %.sh | $(DIRS)
+   $(call rule_mkdir)
+   $(Q)$(call frecho-cmd,test)@touch $@
+   $(Q)$(call frecho-cmd,test)@shellcheck -S warning $(subst output/,./,$(patsubst %.dep, %.sh, $@)) 1> ${log_file} && ([[ ! -s ${log_file} ]])
+$(DIRS):
+   @mkdir -p $@
+
+clean:
+   @rm -rf $(log_file) output
-- 
2.31.1



[PATCH 1/2] tools/perf: Add new CONFIG_SHELLCHECK for detecting shellcheck binary

2023-09-14 Thread Athira Rajeev
The shellcheck tool can detect coding/formatting issues in
shell scripts. The perf directory "tests/shell" contains a lot
of shell test scripts, and this tool can detect coding/formatting
issues in these scripts.

For example, to run shellcheck at the severity level for
errors and warnings, the command below is used:

   # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
   # echo $?
 0

This testing needs to be automated into the build so that it
can avoid regressions and also run the check on newly added
scripts during the build itself. Add a new feature check to detect
the presence of shellcheck. Add the CONFIG_SHELLCHECK feature check to
the build so that a missing shellcheck binary does not break the build.

Signed-off-by: Athira Rajeev 
---
 tools/build/Makefile.feature |  6 --
 tools/build/feature/Makefile |  8 +++-
 tools/perf/Makefile.config   | 10 ++
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 934e2777a2db..23f56b95babf 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -72,7 +72,8 @@ FEATURE_TESTS_BASIC :=  \
 libzstd\
 disassembler-four-args \
 disassembler-init-styled   \
-file-handle
+file-handle\
+shellcheck
 
 # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
 # of all feature tests
@@ -138,7 +139,8 @@ FEATURE_DISPLAY ?=  \
  get_cpuid  \
  bpf   \
  libaio\
- libzstd
+ libzstd   \
+ shellcheck
 
 #
 # Declare group members of a feature to display the logical OR of the detection
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 3184f387990a..44ba6d0c98d0 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -76,7 +76,8 @@ FILES=  \
  test-libzstd.bin  \
  test-clang-bpf-co-re.bin  \
  test-file-handle.bin  \
- test-libpfm4.bin
+ test-libpfm4.bin  \
+ test-shellcheck.bin
 
 FILES := $(addprefix $(OUTPUT),$(FILES))
 
@@ -92,6 +93,8 @@ __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ $(patsubst 
%.bin,%.c,$(@F)) $(
 __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst 
%.bin,%.cpp,$(@F)) $(LDFLAGS)
   BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1
 
+  BUILD_BINARY = sh -c $1 > $(@:.bin=.make.output) 2>&1
+
 ###
 
 $(OUTPUT)test-all.bin:
@@ -207,6 +210,9 @@ $(OUTPUT)test-libslang-include-subdir.bin:
 $(OUTPUT)test-libtraceevent.bin:
$(BUILD) -ltraceevent
 
+$(OUTPUT)test-shellcheck.bin:
+   $(BUILD_BINARY) "shellcheck --version"
+
 $(OUTPUT)test-libtracefs.bin:
 $(BUILD) $(shell $(PKG_CONFIG) --cflags libtraceevent 2>/dev/null) 
-ltracefs
 
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index d66b52407e19..e71fe95ad865 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -779,6 +779,16 @@ ifndef NO_SLANG
   endif
 endif
 
+ifneq ($(NO_SHELLCHECK),1)
+  $(call feature_check,shellcheck)
+  ifneq ($(feature-shellcheck), 1)
+msg := $(warning No shellcheck found. please install ShellCheck);
+  else
+$(call detected,CONFIG_SHELLCHECK)
+NO_SHELLCHECK := 0
+  endif
+endif
+
 ifdef GTK2
   FLAGS_GTK2=$(CFLAGS) $(LDFLAGS) $(EXTLIBS) $(shell $(PKG_CONFIG) --libs 
--cflags gtk+-2.0 2>/dev/null)
   $(call feature_check,gtk2)
-- 
2.31.1



[PATCH 3/3] skiboot: Update IMC PMU node names for power10

2023-09-14 Thread Athira Rajeev
The nest IMC (In Memory Collection) Performance Monitoring
Unit (PMU) node names are saved as "struct nest_pmus_struct"
in the "hw/imc.c" IMC code. Not all the IMC PMUs listed in
the device tree may be available. Nest IMC PMU names along with
their bit values are represented in the IMC availability vector.
This struct is used to remove the unavailable nodes by checking
this vector.

For power10, the imc_chip_avl_vector, i.e. the IMC availability vector
(which is part of the IMC control block structure), has a
change in the mapping of units to bit positions. Hence rename the
existing nest_pmus array to nest_pmus_p9 and add an entry for power10
as nest_pmus_p10.

The avl_vector also has another change in bit positions 11:34. These
bit positions tell the availability of Xlink/Alink/CAPI. There
are 8 links in total, and a three-bit field combination says which link
is available. The patch implements all these changes to handle
nest_pmus_p10.

Signed-off-by: Athira Rajeev 
---
Changelog:
v5 -> v6:
- Addressed review comment from Reza by using PPC_BIT
  instead of PPC_BITMASK

v4 -> v5:
- Addressed review comment from Reza and renamed
  dt_find_by_name_substr to dt_find_by_name_before_addr

v3 -> v4:
- Addressed review comment from Mahesh and added his Reviewed-by
  for patch 1.

v2 -> v3:
- After review comments from Mahesh, fixed the code
  to consider string upto "@" for both input node name
  as well as child node name. V2 version was comparing
  input node name and child node name upto string length
  of child name. But this will return wrong node if input
  name is larger than child name. Because it will match
  as substring for child name.
  https://lists.ozlabs.org/pipermail/skiboot/2023-January/018596.html

v1 -> v2:
- Addressed review comment from Dan to update
  the utility function to search and compare
  up to "@". Renamed it as dt_find_by_name_substr.

 hw/imc.c | 196 ---
 1 file changed, 186 insertions(+), 10 deletions(-)

diff --git a/hw/imc.c b/hw/imc.c
index 73f25dae8..9f59348ad 100644
--- a/hw/imc.c
+++ b/hw/imc.c
@@ -49,7 +49,7 @@ static unsigned int *htm_scom_index;
  * imc_chip_avl_vector(in struct imc_chip_cb, look at include/imc.h).
  * nest_pmus[] is an array containing all the possible nest IMC PMU node names.
  */
-static char const *nest_pmus[] = {
+static const char *nest_pmus_p9[] = {
"powerbus0",
"mcs0",
"mcs1",
@@ -104,6 +104,67 @@ static char const *nest_pmus[] = {
/* reserved bits : 51 - 63 */
 };
 
+static const char *nest_pmus_p10[] = {
+   "pb",
+   "mcs0",
+   "mcs1",
+   "mcs2",
+   "mcs3",
+   "mcs4",
+   "mcs5",
+   "mcs6",
+   "mcs7",
+   "pec0",
+   "pec1",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "NA",
+   "phb0",
+   "phb1",
+   "phb2",
+   "phb3",
+   "phb4",
+   "phb5",
+   "ocmb0",
+   "ocmb1",
+   "ocmb2",
+   "ocmb3",
+   "ocmb4",
+   "ocmb5",
+   "ocmb6",
+   "ocmb7",
+   "ocmb8",
+   "ocmb9",
+   "ocmb10",
+   "ocmb11",
+   "ocmb12",
+   "ocmb13",
+   "ocmb14",
+   "ocmb15",
+   "nx",
+};
+
 /*
  * Due to Nest HW/OCC restriction, microcode will not support individual unit
  * events for these nest units mcs0, mcs1 ... mcs7 in the accumulation mode.
@@ -371,7 +432,7 @@ static void disable_unavailable_units(struct dt_node *dev)
uint64_t avl_vec;
struct imc_chip_cb *cb;
struct dt_node *target;
-   int i;
+   int i, j;
bool disable_all_nests = false;
struct proc_chip *chip;
 
@@ -409,14 +470,129 @@ static void disable_unavailable_units(struct dt_node 
*dev)
avl_vec = (0xffULL) << 56;
}
 
-   for (i = 0; i < ARRAY_SIZE(nest_pmus); i++) {
-   if (!(PPC_BITMASK(i, i) & avl_vec)) {
-   /* Check if the device node exists */
-   target = dt_find_by_name_before_addr(dev, nest_pmus[i]);
-   

[PATCH 2/3] skiboot: Update IMC code to use dt_find_by_name_before_addr for checking dt nodes

2023-09-14 Thread Athira Rajeev
The nest IMC (In Memory Collection) Performance Monitoring
Unit(PMU) node names are saved in nest_pmus[] array in the
"hw/imc.c" IMC code. Not all the IMC PMUs listed in the device
tree may be available. Nest IMC PMU names along with their
bit values are represented in the IMC availability vector.
The nest_pmus[] array is used to remove the unavailable nodes
by checking this vector.

To check node availability, the code was using "dt_find_by_name".
But since the node names have a format like "name@offset",
dt_find_by_name doesn't return the expected result. Fix this
by using dt_find_by_name_before_addr. Also, update the char array
to use the correct node names.

Signed-off-by: Athira Rajeev 
---
Changelog:
v4 -> v5:
- Addressed review comment from Reza and renamed
  dt_find_by_name_substr to dt_find_by_name_before_addr

v3 -> v4:
- Addressed review comment from Mahesh and added his Reviewed-by
  for patch 1.

v2 -> v3:
- After review comments from Mahesh, fixed the code
  to consider string upto "@" for both input node name
  as well as child node name. V2 version was comparing
  input node name and child node name upto string length
  of child name. But this will return wrong node if input
  name is larger than child name. Because it will match
  as substring for child name.
  https://lists.ozlabs.org/pipermail/skiboot/2023-January/018596.html

v1 -> v2:
- Addressed review comment from Dan to update
  the utility function to search and compare
  up to "@". Renamed it as dt_find_by_name_substr.

 hw/imc.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/imc.c b/hw/imc.c
index 97e0809f0..73f25dae8 100644
--- a/hw/imc.c
+++ b/hw/imc.c
@@ -67,14 +67,14 @@ static char const *nest_pmus[] = {
"mba5",
"mba6",
"mba7",
-   "cen0",
-   "cen1",
-   "cen2",
-   "cen3",
-   "cen4",
-   "cen5",
-   "cen6",
-   "cen7",
+   "centaur0",
+   "centaur1",
+   "centaur2",
+   "centaur3",
+   "centaur4",
+   "centaur5",
+   "centaur6",
+   "centaur7",
"xlink0",
"xlink1",
"xlink2",
@@ -412,7 +412,7 @@ static void disable_unavailable_units(struct dt_node *dev)
for (i = 0; i < ARRAY_SIZE(nest_pmus); i++) {
if (!(PPC_BITMASK(i, i) & avl_vec)) {
/* Check if the device node exists */
-   target = dt_find_by_name(dev, nest_pmus[i]);
+   target = dt_find_by_name_before_addr(dev, nest_pmus[i]);
if (!target)
continue;
/* Remove the device node */
-- 
2.31.1



[PATCH 1/3] core/device: Add function to return child node using name at substring "@"

2023-09-14 Thread Athira Rajeev
Add a function dt_find_by_name_before_addr() that returns the child node
whose name, up to the first occurrence of "@", matches a given name, and
NULL otherwise. This is helpful for node names like "name@addr". When
nodes are added in "name@addr" format and the value of "addr" is not
known, the node can't be matched by its full node name or by addr. Matching
on the part of the node name before "@" returns the expected result. The
patch adds the dt_find_by_name_before_addr() function and a testcase for it
in core/test/run-device.c.

Signed-off-by: Athira Rajeev 
Reviewed-by: Mahesh Salgaonkar 
---
Changelog:
v5 -> v6:
- Addressed review comment from Reza. Instead of using new
  variable for "node", use the node "name" as-is since the
  utility is to check the name before addr. Updated the
  test/run-device.c accordingly

v4 -> v5:
- Addressed review comment from Reza and renamed
  dt_find_by_name_substr to dt_find_by_name_before_addr

v3 -> v4:
- Addressed review comment from Mahesh and added his Reviewed-by.

v2 -> v3:
- After review comments from Mahesh, fixed the code
  to consider string upto "@" for both input node name
  as well as child node name. V2 version was comparing
  input node name and child node name upto string length
  of child name. But this will return wrong node if input
  name is larger than child name. Because it will match
  as substring for child name.
  https://lists.ozlabs.org/pipermail/skiboot/2023-January/018596.html

v1 -> v2:
- Addressed review comment from Dan to update
  the utility function to search and compare
  up to "@". Renamed it as dt_find_by_name_substr.

 core/device.c  | 25 +
 core/test/run-device.c | 14 ++
 include/device.h   |  3 +++
 3 files changed, 42 insertions(+)

diff --git a/core/device.c b/core/device.c
index 2de37c741..c22b6b3c3 100644
--- a/core/device.c
+++ b/core/device.c
@@ -395,6 +395,31 @@ struct dt_node *dt_find_by_name(struct dt_node *root, 
const char *name)
 }
 
 
+struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char 
*name)
+{
+   struct dt_node *child, *match;
+   char *child_node = NULL;
+
+   list_for_each(&root->children, child, list) {
+   child_node = strdup(child->name);
+   if (!child_node)
+   goto err;
+   child_node = strtok(child_node, "@");
+   if (!strcmp(child_node, name)) {
+   free(child_node);
+   return child;
+   }
+
+   match = dt_find_by_name_before_addr(child, name);
+   if (match)
+   return match;
+   }
+
+   free(child_node);
+err:
+   return NULL;
+}
+
 struct dt_node *dt_new_check(struct dt_node *parent, const char *name)
 {
struct dt_node *node = dt_find_by_name(parent, name);
diff --git a/core/test/run-device.c b/core/test/run-device.c
index 4a12382bb..fb7a7d2c0 100644
--- a/core/test/run-device.c
+++ b/core/test/run-device.c
@@ -466,6 +466,20 @@ int main(void)
new_prop_ph = dt_prop_get_u32(ut2, "something");
assert(!(new_prop_ph == ev1_ph));
dt_free(subtree);
+
+   /* Test dt_find_by_name_before_addr */
+   root = dt_new_root("");
+   addr1 = dt_new_addr(root, "node", 0x1);
+   addr2 = dt_new_addr(root, "node0_1", 0x2);
+   assert(dt_find_by_name(root, "node@1") == addr1);
+   assert(dt_find_by_name(root, "node0_1@2") == addr2);
+   assert(dt_find_by_name_before_addr(root, "node") == addr1);
+   assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
+   assert(dt_find_by_name_before_addr(root, "node0_1") == addr2);
+   assert(dt_find_by_name_before_addr(root, "node0") == NULL);
+   assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
+   dt_free(root);
+
return 0;
 }
 
diff --git a/include/device.h b/include/device.h
index 93fb90ff4..f2402cc4d 100644
--- a/include/device.h
+++ b/include/device.h
@@ -184,6 +184,9 @@ struct dt_node *dt_find_by_path(struct dt_node *root, const 
char *path);
 /* Find a child node by name */
 struct dt_node *dt_find_by_name(struct dt_node *root, const char *name);
 
+/* Find a child node by name and substring */
+struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char 
*name);
+
 /* Find a node by phandle */
 struct dt_node *dt_find_by_phandle(struct dt_node *root, u32 phandle);
 
-- 
2.31.1
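
The matching rule described in the commit message above (compare only the part of the child name before the first "@") can be sketched without the strdup()/strtok() pair. This is a minimal illustration, not the skiboot implementation: name_matches_before_addr() is a hypothetical helper, and it works on plain strings rather than on struct dt_node.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of skiboot): returns true when the part
 * of node_name before the first '@' is exactly equal to name.
 * strcspn() gives the length of that leading part without copying it.
 */
bool name_matches_before_addr(const char *node_name, const char *name)
{
	size_t base_len = strcspn(node_name, "@");

	return strlen(name) == base_len &&
	       strncmp(node_name, name, base_len) == 0;
}
```

With this rule, "node@1" matches "node", while "node0_1@2" matches "node0_1" but not "node0" or "node0_", which is the behaviour the testcase in core/test/run-device.c checks for. A name that itself contains "@" never matches.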



Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events

2023-09-12 Thread Athira Rajeev



> On 08-Sep-2023, at 7:48 PM, Athira Rajeev  wrote:
> 
> 
> 
>> On 08-Sep-2023, at 11:04 AM, Sachin Sant  wrote:
>> 
>> 
>> 
>>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev  
>>> wrote:
>>> 
>>> Testcase "Parsing of all PMU events from sysfs" parses events for
>>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
>>> environment supports events from the hv_24x7 and hv_gpci PMUs, which
>>> have a format like the examples below:
>>> 
>>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
>>> - hv_gpci/event,partition_id=?/
>>> 
>>> The value for "?" needs to be filled in depending on system
>>> configuration. It is better to skip these parametrized events
>>> in this test as it is done in:
>>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
>>> parametrized events")' which handled a similar instance with
>>> "all PMU test".
>>> 
>>> Fix parse-events test to skip parametrized events since
>>> it needs proper setup of the parameters.
>>> 
>>> Signed-off-by: Athira Rajeev 
>>> ---
>>> Changelog:
>>> v1 -> v2:
>>> Addressed review comments from Ian. Updated size of
>>> pmu event name variable and changed bool name which is
>>> used to skip the test.
>>> 
>> 
>> The patch fixes the reported issue.
>> 
>> 6.2: Parsing of all PMU events from sysfs  : Ok
>> 6.3: Parsing of given PMU events from sysfs: Ok
>> 
>> Tested-by: Sachin Sant 
>> 
>> - Sachin
> 
> Hi Sachin, Ian
> 
> Thanks for testing the patch

Hi Arnaldo

Can you please check and pull this if it looks good to go?

Thanks
Athira
> 
> Athira
> 
> 



Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-12 Thread Athira Rajeev



> On 08-Sep-2023, at 7:47 PM, Athira Rajeev  wrote:
> 
> 
> 
>> On 08-Sep-2023, at 5:20 AM, Ian Rogers  wrote:
>> 
>> On Thu, Sep 7, 2023 at 10:17 AM Athira Rajeev
>>  wrote:
>>> 
>>> From: root 
>>> 
>>> shellcheck was run on perf tool shell scripts as a prerequisite
>>> to include a build option for shellcheck discussed here:
>>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>> 
>>> And fixes were added for the coding/formatting issues in
>>> two patchsets:
>>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>>> 
>>> Three additional issues are observed with shellcheck "0.6" and
>>> this patchset covers those. With this patchset,
>>> 
>>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
>>> warning $F; done
>>> # echo $?
>>> 0
>>> 
>>> Athira Rajeev (3):
>>> tests/shell: Fix shellcheck SC1090 to handle the location of sourced
>>>   files
>>> tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
>>>   tetscase
>>> tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
>> 
>> Series:
>> Tested-by: Ian Rogers 
>> 
>> Thanks,
>> Ian
> 
> Thanks Ian for checking the patch series
> 
> Athira

Hi Arnaldo

Can you please check and pull this if it looks good to go?

Thanks
Athira

>> 
>>> tools/perf/tests/shell/coresight/asm_pure_loop.sh| 4 
>>> tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4 
>>> tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 
>>> tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 
>>> tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 4 
>>> tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
>>> tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
>>> tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
>>> tools/perf/tests/shell/record.sh | 1 +
>>> tools/perf/tests/shell/stat+csv_output.sh| 1 +
>>> tools/perf/tests/shell/stat+csv_summary.sh   | 4 ++--
>>> tools/perf/tests/shell/stat+shadow_stat.sh   | 4 ++--
>>> tools/perf/tests/shell/stat+std_output.sh| 1 +
>>> tools/perf/tests/shell/test_intel_pt.sh  | 1 +
>>> tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
>>> 15 files changed, 35 insertions(+), 4 deletions(-)
>>> 
>>> --
>>> 2.31.1




[PATCH V3] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-12 Thread Athira Rajeev
Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
for libtraceevent is used to set PYTHON_EXT_SRCS:

ifeq ($(CONFIG_LIBTRACEEVENT),y)
  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
else
  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
util/python-ext-sources)
endif

But this does not pick up the value for CONFIG_LIBTRACEEVENT that is
set using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system-detected
configuration in the CONFIG checks. This will also fix issues that
could arise when other "CONFIG_*" checks are added to Makefile.perf
in the future.

Signed-off-by: Athira Rajeev 
---
Changelog:
v2 -> v3:
Used "-include" instead of "include", since "make clean" or "make"
can fail when Makefile.config is not included and the
.config-detected file is not present.

v1 -> v2:
Added $(OUTPUT) prefix to config-detected as pointed
out by Ian

 tools/perf/Makefile.perf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 37af6df7b978..f6fdc2d5a92f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
$(OUTPUT)python/perf*.so
 
+# Use the detected configuration
+-include $(OUTPUT).config-detected
+
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
 else
-- 
2.31.1



Re: [PATCH V2] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-12 Thread Athira Rajeev



> On 08-Sep-2023, at 9:45 PM, Ian Rogers  wrote:
> 
> On Fri, Sep 8, 2023 at 7:51 AM Athira Rajeev
>  wrote:
>> 
>> Makefile.perf uses "CONFIG_*" checks in the code. Example the config
>> for libtraceevent is used to set PYTHON_EXT_SRCS
>> 
>>ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>>else
>>  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
>> util/python-ext-sources)
>>endif
>> 
>> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
>> set using the settings in Makefile.config. Include the file
>> ".config-detected" so that make will use the system detected
>> configuration in the CONFIG checks. This will fix issues that
>> could arise when other "CONFIG_*" checks are added to Makefile.perf
>> in future as well.
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>> Changelog:
>> v1 -> v2:
>> Added $(OUTPUT) prefix to config-detected as pointed
>> out by Ian
>> 
>> tools/perf/Makefile.perf | 3 +++
>> 1 file changed, 3 insertions(+)
>> 
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index 37af6df7b978..66b9dc61c32f 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>> 
>> python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
>> $(OUTPUT)python/perf*.so
>> 
>> +# Use the detected configuration
>> +include $(OUTPUT).config-detected
> 
> The Makefile.build version also has a "-include" rather than "include"
> in case the .config-detected file is missing. In Makefile.perf
> including Makefile.config is optional:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n253
> 
> and there are certain targets where we don't include it:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n200
> 
> So playing devil's advocate, if we ran "make clean" we'd remove
> .config-detected:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n1131
> 
> If we then ran "make tags" then we wouldn't include Makefile.config
> and so .config-detected wouldn't be generated and I think the build
> would fail due to a missing include here. So I think this should be
> -include or perhaps:

Hi Ian

Thanks for checking in detail. Yes, make clean in perf fails with just “include”

# make clean
Makefile.perf:355: .config-detected: No such file or directory
make[1]: *** No rule to make target '.config-detected'.  Stop.
make: *** [Makefile:90: clean] Error 2


The change below is correct, as you pointed out:

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 66b9dc61c32f..f6fdc2d5a92f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -352,7 +352,7 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
$(OUTPUT)python/perf*.so
   # Use the detected configuration
-include $(OUTPUT).config-detected
+-include $(OUTPUT).config-detected
   ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)


With this change, I tested that the file is included when it is present and
that the detected configs are picked up correctly as well.
I will add this change in V3.

Thanks
Athira
> 
> ifeq ($(config),1)
> include $(OUTPUT).config-detected
> endif
> 
> Thanks,
> Ian
> 
>> +
>> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>> else
>> --
>> 2.31.1




Re: [PATCH V5 3/3] skiboot: Update IMC PMU node names for power10

2023-09-10 Thread Athira Rajeev



> On 10-Aug-2023, at 3:28 AM, Reza Arbab  wrote:
> 
> On Mon, Jul 17, 2023 at 08:54:31AM +0530, Athira Rajeev wrote:
>> @@ -408,14 +469,129 @@ static void disable_unavailable_units(struct dt_node 
>> *dev)
>> avl_vec = (0xffULL) << 56;
>> }
>> 
>> - for (i = 0; i < ARRAY_SIZE(nest_pmus); i++) {
>> - if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> - /* Check if the device node exists */
>> - target = dt_find_by_name_before_addr(dev, nest_pmus[i]);
>> - if (!target)
>> - continue;
>> - /* Remove the device node */
>> - dt_free(target);
>> + if (proc_gen == proc_gen_p9) {
>> + for (i = 0; i < ARRAY_SIZE(nest_pmus_p9); i++) {
>> + if (!(PPC_BITMASK(i, i) & avl_vec)) {
> 
> I think all these PPC_BITMASK(i, i) can be changed to PPC_BIT(i).

Hi Reza,

Thanks for reviewing the changes.
Yes. I will add the change in next version

Thanks
Athira
> 
>> + /* Check if the device node exists */
>> + target = dt_find_by_name_before_addr(dev, nest_pmus_p9[i]);
>> + if (!target)
>> + continue;
>> + /* Remove the device node */
>> + dt_free(target);
>> + }
>> + }
>> + } else if (proc_gen == proc_gen_p10) {
>> + int val;
>> + char name[8];
>> +
>> + for (i = 0; i < 11; i++) {
>> + if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> + /* Check if the device node exists */
>> + target = dt_find_by_name_before_addr(dev, nest_pmus_p10[i]);
>> + if (!target)
>> + continue;
>> + /* Remove the device node */
>> + dt_free(target);
>> + }
>> + }
>> +
>> + for (i = 35; i < 41; i++) {
>> + if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> + /* Check if the device node exists for phb */
>> + for (j = 0; j < 3; j++) {
>> + snprintf(name, sizeof(name), "phb%d_%d", (i-35), j);
>> + target = dt_find_by_name_before_addr(dev, name);
>> + if (!target)
>> + continue;
>> + /* Remove the device node */
>> + dt_free(target);
>> + }
>> + }
>> + }
>> +
>> + for (i = 41; i < 58; i++) {
>> + if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> + /* Check if the device node exists */
>> + target = dt_find_by_name_before_addr(dev, nest_pmus_p10[i]);
>> + if (!target)
>> + continue;
>> + /* Remove the device node */
>> + dt_free(target);
>> + }
>> + }
> 
> -- 
> Reza Arbab
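
The review point above, that PPC_BITMASK(i, i) can be written as PPC_BIT(i), can be checked with a small sketch. The two macros below are local definitions written to mirror the skiboot macros of the same names (an assumption, using IBM bit numbering where bit 0 is the most significant bit), and unit_available() is a hypothetical helper, not a function from hw/imc.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * IBM bit numbering: bit 0 is the most significant bit of a 64-bit word.
 * Local definitions, assumed to mirror the skiboot macros of the same names.
 */
#define PPC_BIT(bit)		(0x8000000000000000ULL >> (bit))
#define PPC_BITMASK(bs, be)	((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))

/*
 * Hypothetical helper: true when the unit at IBM bit position 'bit'
 * is marked available in the availability vector.
 */
bool unit_available(uint64_t avl_vec, unsigned int bit)
{
	return (avl_vec & PPC_BIT(bit)) != 0;
}
```

For a single-bit range the subtraction in PPC_BITMASK is zero, so only the PPC_BIT term remains. With avl_vec = (0xffULL) << 56, as used in disable_unavailable_units(), IBM bit positions 0 to 7 are reported as available and position 8 is not.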



Re: [PATCH V5 1/3] core/device: Add function to return child node using name at substring "@"

2023-09-10 Thread Athira Rajeev



> On 10-Aug-2023, at 3:21 AM, Reza Arbab  wrote:
> 
> Hi Athira,
> 
> I still have a couple of the same questions I asked in v4.
> 
> On Mon, Jul 17, 2023 at 08:54:29AM +0530, Athira Rajeev wrote:
>> Add a function dt_find_by_name_before_addr() that returns the child node if
>> it matches till first occurrence at "@" of a given name, otherwise NULL.
> 
> Given this summary, I don't understand the following:
> 
>> + assert(dt_find_by_name(root, "node@1") == addr1);
>> + assert(dt_find_by_name(root, "node0_1@2") == addr2);
> 
> Is this behavior required? I don't think it makes sense to call this function 
> with a second argument containing '@', so I wouldn't expect it to match 
> anything in these cases. The function seems to specifically enable it:

Hi Reza,

Yes makes sense. dt_find_by_name can be removed in this test since its 
intention is to find device by name.
I will remove these two checks.

> 
>> +struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const 
>> char *name)
>> +{
> [snip]
>> + node = strdup(name);
>> + if (!node)
>> + return NULL;
>> + node = strtok(node, "@");
> 
> Seems like you could get rid of this and just use name as-is.

Ok Reza
> 
> I was curious about something else; say we have 'node@1' and 'node@2'.  Is 
> there an expectation of which it should match?
> 
>addr1 = dt_new_addr(root, "node", 0x1);
>addr2 = dt_new_addr(root, "node", 0x2);
>assert(dt_find_by_name_substr(root, "node") == ???);
>   ^^^

In this case, dt_find_by_name_before_addr is not the right function to use.
We have other functions like dt_find_by_name_addr that can be made use of.

I will address other changes in next version

Thanks
Athira
> 
> -- 
> Reza Arbab



[PATCH V2] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-08 Thread Athira Rajeev
Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
for libtraceevent is used to set PYTHON_EXT_SRCS:

ifeq ($(CONFIG_LIBTRACEEVENT),y)
  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
else
  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
util/python-ext-sources)
endif

But this does not pick up the value for CONFIG_LIBTRACEEVENT that is
set using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system-detected
configuration in the CONFIG checks. This will also fix issues that
could arise when other "CONFIG_*" checks are added to Makefile.perf
in the future.

Signed-off-by: Athira Rajeev 
---
Changelog:
 v1 -> v2:
 Added $(OUTPUT) prefix to config-detected as pointed
 out by Ian

 tools/perf/Makefile.perf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 37af6df7b978..66b9dc61c32f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
$(OUTPUT)python/perf*.so
 
+# Use the detected configuration
+include $(OUTPUT).config-detected
+
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
 else
-- 
2.31.1



Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events

2023-09-08 Thread Athira Rajeev



> On 08-Sep-2023, at 11:04 AM, Sachin Sant  wrote:
> 
> 
> 
>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev  
>> wrote:
>> 
>> Testcase "Parsing of all PMU events from sysfs" parses events for
>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
>> environment supports events from the hv_24x7 and hv_gpci PMUs, which
>> have a format like the examples below:
>> 
>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
>> - hv_gpci/event,partition_id=?/
>> 
>> The value for "?" needs to be filled in depending on system
>> configuration. It is better to skip these parametrized events
>> in this test as it is done in:
>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
>> parametrized events")' which handled a similar instance with
>> "all PMU test".
>> 
>> Fix parse-events test to skip parametrized events since
>> it needs proper setup of the parameters.
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>> Changelog:
>> v1 -> v2:
>> Addressed review comments from Ian. Updated size of
>> pmu event name variable and changed bool name which is
>> used to skip the test.
>> 
> 
> The patch fixes the reported issue.
> 
> 6.2: Parsing of all PMU events from sysfs  : Ok
> 6.3: Parsing of given PMU events from sysfs: Ok
> 
> Tested-by: Sachin Sant 
> 
> - Sachin

Hi Sachin, Ian

Thanks for testing the patch

Athira
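
As an illustration of the skip condition discussed in this thread, a parametrized event can be recognised by an unfilled "=?" placeholder in its string. This is only a sketch: event_is_parametrized() is a hypothetical helper, and the actual check in the perf parse-events test may be written differently.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true when the event string still carries an
 * unfilled "=?" parameter placeholder, e.g. "domain=?" or "core=?".
 */
bool event_is_parametrized(const char *event)
{
	return strstr(event, "=?") != NULL;
}
```

Both example events from the commit message, hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/ and hv_gpci/event,partition_id=?/, contain "=?", so a test loop using such a check would skip them instead of trying to parse them without the needed parameter values.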




Re: [PATCH] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-08 Thread Athira Rajeev



> On 08-Sep-2023, at 4:41 AM, Ian Rogers  wrote:
> 
> On Thu, Sep 7, 2023 at 10:19 AM Athira Rajeev
>  wrote:
>> 
>> Makefile.perf uses "CONFIG_*" checks in the code. Example the config
>> for libtraceevent is used to set PYTHON_EXT_SRCS
>> 
>>ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>>else
>>  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' 
>> util/python-ext-sources)
>>endif
>> 
>> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
>> set using the settings in Makefile.config. Include the file
>> ".config-detected" so that make will use the system detected
>> configuration in the CONFIG checks. This will fix issues that
>> could arise when other "CONFIG_*" checks are added to Makefile.perf
>> in future as well.
>> 
>> Signed-off-by: Athira Rajeev 
>> ---
>> tools/perf/Makefile.perf | 3 +++
>> 1 file changed, 3 insertions(+)
>> 
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index 37af6df7b978..6764b0e156f4 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>> 
>> python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) 
>> $(OUTPUT)python/perf*.so
>> 
>> +# Use the detected configuration
>> +include .config-detected
> 
> Good catch! I think it should look like:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/build/Makefile.build?h=perf-tools-next#n40
> 
> Thanks,
> Ian

Thanks for the review Ian.

Yes, I missed the $(OUTPUT) prefix. I will send a V2 with this change.

Athira
> 
>> +
>> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>> else
>> --
>> 2.31.1




Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-08 Thread Athira Rajeev



> On 08-Sep-2023, at 5:20 AM, Ian Rogers  wrote:
> 
> On Thu, Sep 7, 2023 at 10:17 AM Athira Rajeev
>  wrote:
>> 
>> From: root 
>> 
>> shellcheck was run on perf tool shell scripts as a prerequisite
>> to include a build option for shellcheck discussed here:
>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>> 
>> And fixes were added for the coding/formatting issues in
>> two patchsets:
>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>> 
>> Three additional issues are observed with shellcheck "0.6" and
>> this patchset covers those. With this patchset,
>> 
>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
>> # echo $?
>> 0
>> 
>> Athira Rajeev (3):
>>  tests/shell: Fix shellcheck SC1090 to handle the location of sourced
>>files
>>  tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
>>tetscase
>>  tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
> 
> Series:
> Tested-by: Ian Rogers 
> 
> Thanks,
> Ian

Thanks Ian for checking the patch series

Athira
> 
>> tools/perf/tests/shell/coresight/asm_pure_loop.sh| 4 ++++
>> tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4 ++++
>> tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 ++++
>> tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 ++++
>> tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 4 ++++
>> tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
>> tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
>> tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
>> tools/perf/tests/shell/record.sh | 1 +
>> tools/perf/tests/shell/stat+csv_output.sh| 1 +
>> tools/perf/tests/shell/stat+csv_summary.sh   | 4 ++--
>> tools/perf/tests/shell/stat+shadow_stat.sh   | 4 ++--
>> tools/perf/tests/shell/stat+std_output.sh| 1 +
>> tools/perf/tests/shell/test_intel_pt.sh  | 1 +
>> tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
>> 15 files changed, 35 insertions(+), 4 deletions(-)
>> 
>> --
>> 2.31.1




[PATCH] tools/perf: Add includes for detected configs in Makefile.perf

2023-09-07 Thread Athira Rajeev
Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
for libtraceevent is used to set PYTHON_EXT_SRCS:

ifeq ($(CONFIG_LIBTRACEEVENT),y)
  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
else
  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
endif

But this does not pick up the value for CONFIG_LIBTRACEEVENT that is
set using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system detected
configuration in the CONFIG checks. This will also fix issues that
could arise when other "CONFIG_*" checks are added to Makefile.perf
in future.

Signed-off-by: Athira Rajeev 
---
 tools/perf/Makefile.perf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 37af6df7b978..6764b0e156f4 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
 
+# Use the detected configuration
+include .config-detected
+
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
 else
-- 
2.31.1



[PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-07 Thread Athira Rajeev
From: root 

shellcheck was run on perf tool shell scripts as a pre-requisite
to include a build option for shellcheck discussed here:
https://www.spinics.net/lists/linux-perf-users/msg25553.html

And fixes were added for the coding/formatting issues in
two patchsets:
https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/

Three additional issues are observed with shellcheck "0.6" and
this patchset covers those. With this patchset,

# for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
# echo $?
0

Athira Rajeev (3):
  tests/shell: Fix shellcheck SC1090 to handle the location of sourced
files
  tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
tetscase
  tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts

 tools/perf/tests/shell/coresight/asm_pure_loop.sh| 4 ++++
 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4 ++++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 ++++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 ++++
 tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 4 ++++
 tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
 tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
 tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
 tools/perf/tests/shell/record.sh | 1 +
 tools/perf/tests/shell/stat+csv_output.sh| 1 +
 tools/perf/tests/shell/stat+csv_summary.sh   | 4 ++--
 tools/perf/tests/shell/stat+shadow_stat.sh   | 4 ++--
 tools/perf/tests/shell/stat+std_output.sh| 1 +
 tools/perf/tests/shell/test_intel_pt.sh  | 1 +
 tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
 15 files changed, 35 insertions(+), 4 deletions(-)

-- 
2.31.1
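
A note on the check quoted in the cover letter: "echo $?" after a "for"
loop reports only the status of the last command the loop ran, so a
failure on an earlier script can be hidden by a clean result on a later
one. Below is a minimal sketch that returns non-zero if any file fails.
The function name check_scripts and the CHECKER override are hypothetical,
added here only for illustration; they are not part of perf.

```shell
#!/bin/sh
# Sketch: run a checker on every executable *.sh under a directory and
# return non-zero if ANY file fails, not only the last one.
# CHECKER defaults to shellcheck; the override exists only for illustration.
check_scripts() {
	dir=$1
	rc=0
	# Like the loop in the cover letter, this assumes paths without whitespace.
	for f in $(find "$dir" -perm -o=x -name '*.sh'); do
		${CHECKER:-shellcheck} -S warning "$f" || rc=1
	done
	return "$rc"
}
```

Possible usage from tools/perf: check_scripts tests/shell; echo $?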



[PATCH 3/3] tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts

2023-09-07 Thread Athira Rajeev
Running shellcheck v0.6 on some of the shell scripts throws
the warning below. Example:

   In tests/shell/coresight/asm_pure_loop.sh line 14:
   DATA="$DATD/perf-$TEST-$DATV.data"
         ^---^ SC2153: Possible misspelling: DATD may not be assigned, but DATA is.

Here, DATD is exported from "lib/coresight.sh", so this
warning can be ignored. Use a "shellcheck disable=SC2153"
directive to suppress this check.

Signed-off-by:  Athira Rajeev 
---
 tools/perf/tests/shell/coresight/asm_pure_loop.sh| 1 +
 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 1 +
 tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 1 +
 tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 1 +
 tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 1 +
 5 files changed, 5 insertions(+)

diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop.sh b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
index 04387061e9f3..2d65defb7e0f 100755
--- a/tools/perf/tests/shell/coresight/asm_pure_loop.sh
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
@@ -11,6 +11,7 @@ TEST="asm_pure_loop"
 
 ARGS=""
 DATV="out"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 
 perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
index c17e442ac741..ddcc9bb850f5 100755
--- a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
+++ b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
@@ -11,6 +11,7 @@ TEST="memcpy_thread"
 
 ARGS="16 10 1"
 DATV="16k_10"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 
 perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
index e47c4e955d0e..2ce5e139b2fd 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
@@ -11,6 +11,7 @@ TEST="thread_loop"
 
 ARGS="10 1"
 DATV="check-tid-10th"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 STDO="$DATD/perf-$TEST-$DATV.stdout"
 
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
index 8bf94a02e384..3ad9498753d7 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
@@ -11,6 +11,7 @@ TEST="thread_loop"
 
 ARGS="2 20"
 DATV="check-tid-2th"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 STDO="$DATD/perf-$TEST-$DATV.stdout"
 
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
index 0dc9ef424233..4fbb4a29aad3 100755
--- a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
@@ -11,6 +11,7 @@ TEST="unroll_loop_thread"
 
 ARGS="10"
 DATV="10"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 
 perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
-- 
2.31.1
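
To illustrate the directive used in this patch: a "shellcheck disable="
comment applies to the command that immediately follows it, which is why
it is placed directly above the DATA assignment. Below is a minimal
sketch, assuming DATD is provided by the caller's environment (in the
patch it comes from lib/coresight.sh). The function name data_path is
hypothetical.

```shell
#!/bin/sh
# Sketch: build the perf data file path from a test name and a variant.
# DATD is expected to be set by the environment or a sourced library,
# so shellcheck cannot see its assignment in this file.
data_path() {
	# Suppress SC2153 for the next command only.
	# shellcheck disable=SC2153
	DATA="$DATD/perf-$1-$2.data"
	printf '%s\n' "$DATA"
}
```

Possible usage: DATD=/tmp/perf-data; data_path asm_pure_loop out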



[PATCH 1/3] tests/shell: Fix shellcheck SC1090 to handle the location of sourced files

2023-09-07 Thread Athira Rajeev
Running shellcheck on some of the shell scripts throws
the error below:

In tests/shell/coresight/unroll_loop_thread_10.sh line 8:
. "$(dirname $0)"/../lib/coresight.sh
  ^-- SC1090: Can't follow non-constant source. Use a directive to specify location.

This happens with shellcheck version "0.6.0". Fix the shellcheck
warning for SC1090 by using the "shellcheck source=" directive to
specify the location of the sourced files.

Signed-off-by: Athira Rajeev 
---
 tools/perf/tests/shell/coresight/asm_pure_loop.sh| 3 +++
 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 3 +++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 3 +++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 3 +++
 tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 3 +++
 tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
 tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
 tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
 tools/perf/tests/shell/record.sh | 1 +
 tools/perf/tests/shell/stat+csv_output.sh| 1 +
 tools/perf/tests/shell/stat+std_output.sh| 1 +
 tools/perf/tests/shell/test_intel_pt.sh  | 1 +
 tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
 13 files changed, 26 insertions(+)

diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop.sh b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
index 779bc8608e1e..04387061e9f3 100755
--- a/tools/perf/tests/shell/coresight/asm_pure_loop.sh
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="asm_pure_loop"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS=""
 DATV="out"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
index 08a44e52ce9b..c17e442ac741 100755
--- a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
+++ b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="memcpy_thread"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="16 10 1"
 DATV="16k_10"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
index c83a200dede4..e47c4e955d0e 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="thread_loop"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="10 1"
 DATV="check-tid-10th"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
index 6346fd5e87c8..8bf94a02e384 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="thread_loop"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="2 20"
 DATV="check-tid-2th"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
index 7304e3d3a6ff..0dc9ef424233 100755
--- a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="unroll_loop_thread"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="10"
 DATV="10"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/probe_vfs_getname.sh b/tools/perf/tests/shell/probe_vfs_getname.sh
index 871243d6d03a..554e12e83c55 100755
--- a/tools/perf/tests/shell/probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/probe_vfs_getname.sh
@@ -4,10 +4,12 @@
 # SPDX-License-Identifier: GPL-2.0
 # Arnaldo Carvalho de Melo , 2017
 
+# shellcheck source=lib/probe.sh
 . "$(dirname $0)"/lib/probe.sh
 
 skip_if_no_perf_probe || exit 2
 
+# shellcheck source=lib/probe_vfs_getname.sh
 . "$(dirname $0)"/lib/probe_vfs_getname.sh
 
 add_probe_vfs_getname || skip_if_no_debuginfo
diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
index 89214a6d995
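
To illustrate the pattern this patch applies: when the sourced path is
computed at run time, shellcheck cannot follow it and reports SC1090. A
"shellcheck source=" comment names the file to analyze instead; it is
only a hint for shellcheck and has no effect when the script runs. Below
is a minimal sketch. The function name load_lib and the file
lib/helpers.sh are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Sketch: source a helper file located under a base directory that is
# only known at run time (the patch uses "$(dirname $0)" for this).
load_lib() {
	# Tell shellcheck which file the dynamic path refers to.
	# shellcheck source=lib/helpers.sh
	. "$1"/lib/helpers.sh
}
```

Possible usage inside a test script: load_lib "$(dirname "$0")"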

[PATCH 2/3] tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh tetscase

2023-09-07 Thread Athira Rajeev
Running shellcheck on stat+shadow_stat.sh generates the
warning below:

In tests/shell/stat+csv_summary.sh line 26:
while read _num _event _run _pct
           ^--^ SC2034: _num appears unused. Verify use (or export if used externally).
                ^----^ SC2034: _event appears unused. Verify use (or export if used externally).
                       ^--^ SC2034: _run appears unused. Verify use (or export if used externally).
                            ^--^ SC2034: _pct appears unused. Verify use (or export if used externally).

These variables are intentionally unused, since they are
only needed to parse through the output. An earlier commit
used "_" as a prefix for such throwaway variables, but this
still shows a warning with shellcheck v0.6. Fix this
by using only "_" instead of a prefix and variable name.

Signed-off-by: Athira Rajeev 
---
 tools/perf/tests/shell/stat+csv_summary.sh | 4 ++--
 tools/perf/tests/shell/stat+shadow_stat.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/shell/stat+csv_summary.sh b/tools/perf/tests/shell/stat+csv_summary.sh
index 8bae9c8a835e..323123ff4d19 100755
--- a/tools/perf/tests/shell/stat+csv_summary.sh
+++ b/tools/perf/tests/shell/stat+csv_summary.sh
@@ -10,7 +10,7 @@ set -e
 #
 perf stat -e cycles  -x' ' -I1000 --interval-count 1 --summary 2>&1 | \
 grep -e summary | \
-while read summary _num _event _run _pct
+while read summary _ _ _ _
 do
if [ $summary != "summary" ]; then
exit 1
@@ -23,7 +23,7 @@ done
 #
 perf stat -e cycles  -x' ' -I1000 --interval-count 1 --summary --no-csv-summary 2>&1 | \
 grep -e summary | \
-while read _num _event _run _pct
+while read _ _ _ _
 do
exit 1
 done
diff --git a/tools/perf/tests/shell/stat+shadow_stat.sh b/tools/perf/tests/shell/stat+shadow_stat.sh
index a1918a15e36a..386821462f3c 100755
--- a/tools/perf/tests/shell/stat+shadow_stat.sh
+++ b/tools/perf/tests/shell/stat+shadow_stat.sh
@@ -14,7 +14,7 @@ test_global_aggr()
 {
perf stat -a --no-big-num -e cycles,instructions sleep 1  2>&1 | \
grep -e cycles -e instructions | \
-   while read num evt _hash ipc rest
+   while read num evt _ ipc rest
do
# skip not counted events
if [ "$num" = "&1 | \
grep ^CPU | \
-   while read cpu num evt _hash ipc rest
+   while read cpu num evt _ ipc rest
do
# skip not counted events
if [ "$num" = "
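
To illustrate the idea behind this patch: "read" can take "_" more than
once as a placeholder for fields that are parsed but never used, and the
last name on the line receives whatever remains of the input. Below is a
minimal sketch. The function name pick_fields and the choice of fields
are hypothetical.

```shell
#!/bin/sh
# Sketch: print the first and fourth whitespace-separated field of each
# input line. "_" absorbs the fields that are not needed; the trailing
# "_" takes the rest of the line.
pick_fields() {
	while read -r first _ _ fourth _
	do
		printf '%s %s\n' "$first" "$fourth"
	done
}
```

Possible usage: printf 'a b c d e f\n' | pick_fields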
