Re: [PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump
> On 7 May 2024, at 3:05 PM, Christophe Leroy wrote:
>
> On 06/05/2024 at 14:19, Athira Rajeev wrote:
>> Add support to capture and parse raw instruction in objdump.
>
> What's the purpose of using 'objdump' for reading raw instructions?
> Can't they be read directly without invoking 'objdump'? It looks odd to
> me to use objdump to provide readable text and then parse it back.

Hi Christophe,

Thanks for your review comments.

The current implementation of data type profiling on x86 uses the "objdump" tool to get the disassembled code. The objdump output lines are then parsed to get the instruction name and register fields. The initial patchset I posted to enable data type profiling on powerpc worked the same way, getting the disassembled code from objdump and parsing the disassembled lines. In V2, we are changing powerpc to use the "raw instruction" and fetch the opcode and register fields from it.

I tried to explain below that perf currently invokes objdump with "--no-show-raw-insn", which doesn't capture the raw instruction. So to capture the raw instruction, the V2 patchset uses the default option "--show-raw-insn" and gets the raw instruction [ for powerpc ] along with the human readable annotation [ which is used by other archs ]. Since the perf tool already has an objdump implementation in place, I went in the direction of enhancing it to use "--show-raw-insn" for powerpc.

But as you mentioned, we can read the raw instruction directly without using the "objdump" tool. perf has support to read object code. The dso open/read utilities and helper functions are already present in "util/dso.c", and the "dso__data_read_offset" function reads data from a dso file offset. We can use these functions, and I can make changes to read the binary instruction directly without using objdump.

Namhyung, Arnaldo, Christophe,
Looking for your valuable feedback on this approach. Please suggest if this approach looks fine.

Thanks
Athira

>
>> Currently, the perf tool infrastructure uses "--no-show-raw-insn" option
>> with "objdump" while disassemble. Example from powerpc with this option
>> for an instruction address is:
>
> Yes and that makes sense because the purpose of objdump is to provide
> human readable annotations, not to perform automated analysis. Am I
> missing something?
>
>>
>> Snippet from:
>> objdump --start-address= --stop-address= -d
>> --no-show-raw-insn -C
>>
>> c10224b4: lwz r10,0(r9)
>>
>> This line "lwz r10,0(r9)" is parsed to extract instruction name,
>> registers names and offset. Also to find whether there is a memory
>> reference in the operands, "memory_ref_char" field of objdump is used.
>> For x86, "(" is used as memory_ref_char to tackle instructions of the
>> form "mov (%rax), %rcx".
>>
>> In case of powerpc, not all instructions using "(" are the only memory
>> instructions. Example, above instruction can also be of extended form (X
>> form) "lwzx r10,0,r19". Inorder to easy identify the instruction category
>> and extract the source/target registers, patch adds support to use raw
>> instruction. With raw instruction, macros are added to extract opcode
>> and register fields.
>>
>> "struct ins_operands" and "struct ins" is updated to carry opcode and
>> raw instruction binary code (raw_insn). Function "disasm_line__parse"
>> is updated to fill the raw instruction hex value and opcode in newly
>> added fields. There is no changes in existing code paths, which parses
>> the disassembled code. The architecture using the instruction name and
>> present approach is not altered. Since this approach targets powerpc,
>> the macro implementation is added for powerpc as of now.
>>
>> Example:
>> representation using --show-raw-insn in objdump gives result:
>>
>> 38 01 81 e8 ld r4,312(r1)
>>
>> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
>> this translates to instruction form: "ld RT,DS(RA)" and binary code
>> as:
>>  ___________________________________
>> | 58 |  RT  |  RA |      DS       | |
>>  -----------------------------------
>> 0    6     11    16              30 31
>>
>> Function "disasm_line__parse" is updated to capture:
>>
>> line: 38 01 81 e8 ld r4,312(r1)
>> opcode and raw instruction "38 01 81 e8"
>> Raw instruction is used later to extract the reg/offset fields.
>>
>> Signed-off-by: Athira Rajeev
>> ---
[PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump
Add support to capture and parse raw instruction in objdump.

Currently, the perf tool infrastructure uses the "--no-show-raw-insn" option with "objdump" while disassembling. An example from powerpc with this option for an instruction address is:

Snippet from:
objdump --start-address= --stop-address= -d --no-show-raw-insn -C

c10224b4: lwz r10,0(r9)

The line "lwz r10,0(r9)" is parsed to extract the instruction name, register names and offset. To find whether there is a memory reference in the operands, the "memory_ref_char" field of objdump is used. For x86, "(" is used as memory_ref_char to handle instructions of the form "mov (%rax), %rcx".

On powerpc, instructions using "(" are not the only memory instructions. For example, the above instruction can also be of extended form (X form) "lwzx r10,0,r19". In order to easily identify the instruction category and extract the source/target registers, this patch adds support to use the raw instruction. With the raw instruction, macros are added to extract the opcode and register fields.

"struct ins_operands" and "struct ins" are updated to carry the opcode and the raw instruction binary code (raw_insn). Function "disasm_line__parse" is updated to fill the raw instruction hex value and opcode in the newly added fields. There are no changes in the existing code paths that parse the disassembled code. Architectures that use the instruction name and the present approach are not altered. Since this approach targets powerpc, the macro implementation is added for powerpc as of now.

Example:
representation using --show-raw-insn in objdump gives result:

38 01 81 e8 ld r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. In powerpc, this translates to instruction form "ld RT,DS(RA)" and binary code as:

 ___________________________________
| 58 |  RT  |  RA |      DS       | |
 -----------------------------------
0    6     11    16              30 31

Function "disasm_line__parse" is updated to capture:

line: 38 01 81 e8 ld r4,312(r1)
opcode and raw instruction "38 01 81 e8"

The raw instruction is used later to extract the reg/offset fields.
Signed-off-by: Athira Rajeev --- tools/include/linux/string.h | 2 + tools/lib/string.c| 13 +++ tools/perf/arch/powerpc/util/dwarf-regs.c | 19 ++ tools/perf/util/disasm.c | 46 +++ tools/perf/util/disasm.h | 6 +++ tools/perf/util/include/dwarf-regs.h | 9 + 6 files changed, 88 insertions(+), 7 deletions(-) diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h index db5c99318c79..0acb1fc14e19 100644 --- a/tools/include/linux/string.h +++ b/tools/include/linux/string.h @@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *); extern char *strim(char *); +extern void remove_spaces(char *s); + extern void *memchr_inv(const void *start, int c, size_t bytes); #endif /* _TOOLS_LINUX_STRING_H_ */ diff --git a/tools/lib/string.c b/tools/lib/string.c index 8b6892f959ab..21d273e69951 100644 --- a/tools/lib/string.c +++ b/tools/lib/string.c @@ -153,6 +153,19 @@ char *strim(char *s) return skip_spaces(s); } +/* + * remove_spaces - Removes whitespaces from @s + */ +void remove_spaces(char *s) +{ + char *d = s; + do { + while (*d == ' ') { + ++d; + } + } while ((*s++ = *d++)); +} + /** * strreplace - Replace all occurrences of character in string. * @s: The string to operate on. 
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c index 0c4f4caf53ac..e60a71fd846e 100644 --- a/tools/perf/arch/powerpc/util/dwarf-regs.c +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c @@ -98,3 +98,22 @@ int regs_query_register_offset(const char *name) return roff->ptregs_offset; return -EINVAL; } + +#definePPC_OP(op) (((op) >> 26) & 0x3F) +#define PPC_RA(a) (((a) >> 16) & 0x1f) +#define PPC_RT(t) (((t) >> 21) & 0x1f) + +int get_opcode_insn(unsigned int raw_insn) +{ + return PPC_OP(raw_insn); +} + +int get_source_reg(unsigned int raw_insn) +{ + return PPC_RA(raw_insn); +} + +int get_target_reg(unsigned int raw_insn) +{ + return PPC_RT(raw_insn); +} diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c index 2de66a092cab..85692f73e78f 100644 --- a/tools/perf/util/disasm.c +++ b/tools/perf/util/disasm.c @@ -43,7 +43,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops, int max_ins_name); static void ins__sort(struct arch *arch); -static int disasm_line__parse(char *line, const char **namep, char **rawp); +static int disasm_line_
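As a rough illustration of the decoding described in this patch, the sketch below takes the raw bytes "38 01 81 e8" from the example, strips the spaces, and extracts the opcode, RT and RA fields with the same shifts and masks as the PPC_OP/PPC_RA/PPC_RT macros. The helper names `strip_spaces` and `raw_bytes_to_insn` are made up for this example, and it assumes objdump printed the four bytes in little-endian memory order; it is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same shifts and masks as the macros added in dwarf-regs.c */
#define PPC_OP(op)	(((op) >> 26) & 0x3F)
#define PPC_RA(a)	(((a) >> 16) & 0x1f)
#define PPC_RT(t)	(((t) >> 21) & 0x1f)

/* Hypothetical helper: drop every ' ' character from s, in place */
static void strip_spaces(char *s)
{
	char *d = s;

	assert(s != NULL);
	do {
		while (*d == ' ')
			++d;
	} while ((*s++ = *d++));
}

/*
 * Hypothetical helper: turn the text "38 01 81 e8" into 0xe8810138.
 * Assumption: the bytes are listed in little-endian memory order.
 */
static uint32_t raw_bytes_to_insn(const char *text)
{
	char buf[16] = { 0 };
	size_t i;
	uint32_t v;

	/* bounded copy so the caller's string is left untouched */
	for (i = 0; i + 1 < sizeof(buf) && text[i]; i++)
		buf[i] = text[i];

	strip_spaces(buf);			/* "380181e8" */
	v = (uint32_t)strtoul(buf, NULL, 16);	/* 0x380181e8 */

	/* swap the byte order: 0x380181e8 becomes 0xe8810138 */
	return (v >> 24) | ((v >> 8) & 0xff00) |
	       ((v << 8) & 0xff0000) | (v << 24);
}
```

Calling `raw_bytes_to_insn("38 01 81 e8")` and passing the result to the three macros yields primary opcode 58, RT = 4 and RA = 1, which matches "ld r4,312(r1)".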
[PATCH V2 9/9] tools/perf: Add support for global_die to capture name of variable in case of register defined variable
In case of a register defined variable (found using find_data_type_global_reg), if the type of the variable happens to be a base type (for example, long unsigned int), perf report captures it as:

12.85%  long unsigned int  long unsigned int +0 (no field)

The above data type actually refers to samples captured while accessing "r1", which represents the current stack pointer in powerpc.

register void *__stack_pointer asm("r1");

The dwarf debug contains this as:

<<>>
<1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
  <18dd774> DW_AT_name: (indirect string, offset: 0x11ba): current_stack_pointer
  <18dd778> DW_AT_decl_file : 51
  <18dd779> DW_AT_decl_line : 1468
  <18dd77b> DW_AT_decl_column : 24
  <18dd77c> DW_AT_type: <0x18da5cd>
  <18dd780> DW_AT_external: 1
  <18dd780> DW_AT_location: 1 byte block: 51 (DW_OP_reg1 (r1))

where 18da5cd is:

<1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
  <18da5ce> DW_AT_byte_size : 8
  <18da5cf> DW_AT_encoding: 7 (unsigned)
  <18da5d0> DW_AT_name: (indirect string, offset: 0x55c7): long unsigned int
<<>>

To make this clearer to the user, capture the DW_AT_name of the variable and save it as part of Dwarf_Global. Dwarf_Global is used so that the name can be retrieved while presenting the result. Update the "dso__findnew_data_type" function to set "var_name" if the variable name is set as part of Dwarf_Global. Update "hist_entry__typeoff_snprintf" to print var_name if it is set.
With the changes, along with "long unsigned int" report also says the variable name as current_stack_pointer Snippet of result: 12.85% long unsigned int long unsigned int +0 (current_stack_pointer) 4.68% struct paca_struct struct paca_struct +2312 (__current) 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask) Signed-off-by: Athira Rajeev --- tools/perf/util/annotate-data.c | 30 -- tools/perf/util/dwarf-aux.c | 1 + tools/perf/util/dwarf-aux.h | 1 + tools/perf/util/sort.c | 7 +-- 4 files changed, 31 insertions(+), 8 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index ab2168c4ef41..9f72d4b6a5f4 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -267,23 +267,32 @@ static void delete_members(struct annotated_member *member) } static struct annotated_data_type *dso__findnew_data_type(struct dso *dso, - Dwarf_Die *type_die) + Dwarf_Die *type_die, Dwarf_Global *global_die) { struct annotated_data_type *result = NULL; struct annotated_data_type key; struct rb_node *node; struct strbuf sb; + struct strbuf sb_var_name; char *type_name; + char *var_name; Dwarf_Word size; strbuf_init(, 32); + strbuf_init(_var_name, 32); if (die_get_typename_from_type(type_die, ) < 0) strbuf_add(, "(unknown type)", 14); + if (global_die->name) { + strbuf_addstr(_var_name, global_die->name); + var_name = strbuf_detach(_var_name, NULL); + } type_name = strbuf_detach(, NULL); dwarf_aggregate_size(type_die, ); /* Check existing nodes in dso->data_types tree */ key.self.type_name = type_name; + if (global_die->name) + key.self.var_name = var_name; key.self.size = size; node = rb_find(, >data_types, data_type_cmp); if (node) { @@ -300,6 +309,8 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso, } result->self.type_name = type_name; + if (global_die->name) + result->self.var_name = var_name; result->self.size = size; INIT_LIST_HEAD(>self.children); @@ -1177,7 +1188,7 @@ static int 
find_data_type_block(struct data_loc_info *dloc, * cu_die and match with reg to identify data type die. */ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die, - Dwarf_Die *type_die) + Dwarf_Die *type_die, Dwarf_Global *global_die) { Dwarf_Die vr_die; int ret = -1; @@ -1189,8 +1200,11 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_ if (dwarf_offdie(dloc->di->dbg, var_types->die_off, _die)) { if (die_get_real_type(_die, type_die) == NULL) { dloc->type_offset = 0; + global_die->name = var_typ
[PATCH V2 8/9] tools/perf: Add support to find global register variables using find_data_type_global_reg
There are cases where a global register variable is defined and associated with a specified register. For example, in powerpc, two registers are defined to represent variables:

1. r13: represents local_paca
   register struct paca_struct *local_paca asm("r13");

2. r1: represents stack_pointer
   register void *__stack_pointer asm("r1");

These regs are present in dwarf debug as DW_OP_reg as part of variables in the cu_die (compile unit). They are not found by the die search done in the list of nested scopes, since they are global register variables.

Example for local_paca represented by r13:

<<>>
<1><18dc6b4>: Abbrev Number: 128 (DW_TAG_variable)
  <18dc6b6> DW_AT_name: (indirect string, offset: 0x3861): local_paca
  <18dc6ba> DW_AT_decl_file : 48
  <18dc6bb> DW_AT_decl_line : 36
  <18dc6bc> DW_AT_decl_column : 30
  <18dc6bd> DW_AT_type: <0x18dc6c3>
  <18dc6c1> DW_AT_external: 1
  <18dc6c1> DW_AT_location: 1 byte block: 5d (DW_OP_reg13 (r13))

<1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
  <18dc6c4> DW_AT_byte_size : 8
  <18dc6c4> DW_AT_type: <0x18dc353>

Where DW_AT_type : <0x18dc6c3> further points to:

<1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
  <18dc6c4> DW_AT_byte_size : 8
  <18dc6c4> DW_AT_type: <0x18dc353>

which belongs to:

<1><18dc353>: Abbrev Number: 67 (DW_TAG_structure_type)
  <18dc354> DW_AT_name: (indirect string, offset: 0x56cd): paca_struct
  <18dc358> DW_AT_byte_size : 2944
  <18dc35a> DW_AT_alignment : 128
  <18dc35b> DW_AT_decl_file : 48
  <18dc35c> DW_AT_decl_line : 61
  <18dc35d> DW_AT_decl_column : 8
  <18dc35d> DW_AT_sibling : <0x18dc6b4>
<<>>

The case is similar for "r1".
<<>> <1><18dd772>: Abbrev Number: 129 (DW_TAG_variable) <18dd774> DW_AT_name: (indirect string, offset: 0x11ba): current_stack_pointer <18dd778> DW_AT_decl_file : 51 <18dd779> DW_AT_decl_line : 1468 <18dd77b> DW_AT_decl_column : 24 <18dd77c> DW_AT_type: <0x18da5cd> <18dd780> DW_AT_external: 1 <18dd780> DW_AT_location: 1 byte block: 51(DW_OP_reg1 (r1)) where 18da5cd is: <1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type) <18da5ce> DW_AT_byte_size : 8 <18da5cf> DW_AT_encoding: 7 (unsigned) <18da5d0> DW_AT_name: (indirect string, offset: 0x55c7): long unsigned int <<>> To identify data type for these two special cases, iterate over variables in the CU die (Compile Unit) and match it with the register. If the variable is a base type, ie die_get_real_type will return NULL here, set offset to zero. With the changes, data type for "paca_struct" and "long unsigned int" for r1 is identified. Snippet from ./perf report -s type,type_off 12.85% long unsigned int long unsigned int +0 (no field) 4.68% struct paca_struct struct paca_struct +2312 (__current) 4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask) Signed-off-by: Athira Rajeev --- tools/perf/util/annotate-data.c | 40 tools/perf/util/annotate.c | 8 ++ tools/perf/util/annotate.h | 1 + tools/perf/util/include/dwarf-regs.h | 1 + 4 files changed, 50 insertions(+) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index e22ba35c93b2..ab2168c4ef41 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -1169,6 +1169,40 @@ static int find_data_type_block(struct data_loc_info *dloc, return ret; } +/* + * Handle cases where define a global register variable and + * associate it with a specified register. These regs are + * present in dwarf debug as DW_OP_reg as part of variables + * in the cu_die (compile unit). Iterate over variables in the + * cu_die and match with reg to identify data type die. 
+ */ +static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die, + Dwarf_Die *type_die) +{ + Dwarf_Die vr_die; + int ret = -1; + struct die_var_type *var_types = NULL; + + die_collect_vars(cu_die, _types); + while (var_types) { + if (var_types->reg == reg) { + if (dwarf_offdie(dloc->di->dbg, var_types->die_off, _die)) { + if (die_get_real_type(_die, type_die) == NULL) { + dloc->type_offset = 0; + dwarf_o
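The lookup described in this patch amounts to walking the variables collected from the CU and picking the one whose location is the register in question. The sketch below models that with a plain linked list; `struct var_entry` and `find_var_by_reg` are hypothetical stand-ins for perf's variable list and the libdw calls, neither of which is used here.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one collected CU variable */
struct var_entry {
	const struct var_entry *next;
	int reg;			/* DWARF register number, e.g. 13 for r13 */
	const char *name;		/* DW_AT_name of the variable */
	const char *type_name;		/* printable name of its type */
};

/* Return the first variable that lives in @reg, or NULL if none does */
static const struct var_entry *find_var_by_reg(const struct var_entry *vars,
					       int reg)
{
	for (; vars; vars = vars->next) {
		if (vars->reg == reg)
			return vars;
	}
	return NULL;
}
```

With a list holding local_paca (r13) and current_stack_pointer (r1), a lookup for register 1 returns the current_stack_pointer entry, and a lookup for any other register returns NULL so the caller can fall back to its normal search.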
[PATCH V2 7/9] tools/perf: Update instruction tracking with add instruction
Update instruction tracking with add instruction. Apart from the "mr" instruction, the register state is carried on by other insns, i.e. "add", "addi" and "addis". Since these are not memory instructions and do not fall in the primary opcode range (32 to 63), add them as part of the mnemonic table. For now, the add* instructions are added. There is a possibility of more getting added here. To extract regs, the binary code will still be used, so associate these with "load_store_ops" itself; no other changes are required.

Signed-off-by: Athira Rajeev
---
 .../perf/arch/powerpc/annotate/instructions.c | 21 +++
 tools/perf/util/disasm.c | 1 +
 2 files changed, 22 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index cce7023951fe..1f35d8a65bb4 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -1,6 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
 
+/*
+ * powerpc instruction mnemonic table to associate load/store instructions with
+ * move_ops. mov_ops is used to identify add/mr to do instruction tracking.
+ */
+static struct ins powerpc__instructions[] = {
+	{ .name = "mr",		.ops = &load_store_ops,  },
+	{ .name = "addi",	.ops = &load_store_ops,  },
+	{ .name = "addis",	.ops = &load_store_ops,  },
+	{ .name = "add",	.ops = &load_store_ops,  },
+};
+
 static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, const char *name)
 {
 	int i;
@@ -75,6 +86,9 @@ static void update_insn_state_powerpc(struct type_state *state,
 	if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
 		return;
 
+	if (!strncmp(dl->ins.name, "add", 3))
+		goto regs_check;
+
 	if (strncmp(dl->ins.name, "mr", 2))
 		return;
 
@@ -85,6 +99,7 @@ static void update_insn_state_powerpc(struct type_state *state,
 		dst->reg1 = src_reg;
 	}
 
+regs_check:
 	if (!has_reg_type(state, dst->reg1))
 		return;
 
@@ -115,6 +130,12 @@ static void update_insn_state_powerpc(struct type_state *state __maybe_unused, s
 static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
 {
 	if (!arch->initialized) {
+		arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
+		arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins));
+		if (!arch->instructions)
+			return -ENOMEM;
+		memcpy(arch->instructions, powerpc__instructions, sizeof(struct ins) * arch->nr_instructions);
+		arch->nr_instructions_allocated = arch->nr_instructions;
 		arch->initialized = true;
 		arch->associate_instruction_ops = powerpc__associate_instruction_ops;
 		arch->objdump.comment_char = '#';
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index ac6b8b8da38a..32cf506a9010 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -36,6 +36,7 @@ static struct ins_ops mov_ops;
 static struct ins_ops nop_ops;
 static struct ins_ops lock_ops;
 static struct ins_ops ret_ops;
+static struct ins_ops load_store_ops;
 
 static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
 			   struct ins_operands *ops, int max_ins_name);
-- 
2.43.0
[PATCH V2 6/9] tools/perf: Update instruction tracking for powerpc
Add instruction tracking function "update_insn_state_powerpc" for powerpc. Example sequence in powerpc:

ld r10,264(r3)
mr r31,r3
<
ld r9,312(r31)

Consider the sample is pointing to "ld r9,312(r31)". Here the memory reference is hit at "312(r31)", where 312 is the offset and r31 is the source register. The previous instruction sequence shows that the register state of r3 is moved to r31. So to identify the data type for the r31 access, the previous instruction ("mr") needs to be tracked and the state type entry has to be updated. The current instruction tracking support in the perf tools infrastructure is specific to x86. This patch adds it for powerpc and adds the "mr" instruction to be tracked.

Signed-off-by: Athira Rajeev
---
 .../perf/arch/powerpc/annotate/instructions.c | 63 +++
 tools/perf/util/annotate-data.c | 9 ++-
 tools/perf/util/disasm.c | 1 +
 3 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..cce7023951fe 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -49,6 +49,69 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
 	return ops;
 }
 
+/*
+ * Instruction tracking function to track register state moves.
+ * Example sequence:
+ *    ld r10,264(r3)
+ *    mr r31,r3
+ *    <
+ *    ld r9,312(r31)
+ *
+ * Previous instruction sequence shows that register state of r3
+ * is moved to r31. update_insn_state_powerpc tracks these state
+ * changes
+ */
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_powerpc(struct type_state *state,
+		struct data_loc_info *dloc, Dwarf_Die *cu_die __maybe_unused,
+		struct disasm_line *dl)
+{
+	struct annotated_insn_loc loc;
+	struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+	struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+	struct type_state_reg *tsr;
+	u32 insn_offset = dl->al.offset;
+
+	if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+		return;
+
+	if (strncmp(dl->ins.name, "mr", 2))
+		return;
+
+	if (!strncmp(dl->ins.name, "mr", 2)) {
+		int src_reg = src->reg1;
+
+		src->reg1 = dst->reg1;
+		dst->reg1 = src_reg;
+	}
+
+	if (!has_reg_type(state, dst->reg1))
+		return;
+
+	tsr = &state->regs[dst->reg1];
+
+	if (!has_reg_type(state, src->reg1) ||
+	    !state->regs[src->reg1].ok) {
+		tsr->ok = false;
+		return;
+	}
+
+	tsr->type = state->regs[src->reg1].type;
+	tsr->kind = state->regs[src->reg1].kind;
+	tsr->ok = true;
+
+	pr_debug("mov [%x] reg%d -> reg%d",
+		 insn_offset, src->reg1, dst->reg1);
+	pr_debug_type_name(&tsr->type, tsr->kind);
+}
+#else /* HAVE_DWARF_SUPPORT */
+static void update_insn_state_powerpc(struct type_state *state __maybe_unused, struct data_loc_info *dloc __maybe_unused,
+		Dwarf_Die *cu_die __maybe_unused, struct disasm_line *dl __maybe_unused)
+{
+	return;
+}
+#endif /* HAVE_DWARF_SUPPORT */
+
 static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
 {
 	if (!arch->initialized) {
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 9d6d4f472c85..e22ba35c93b2 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1079,6 +1079,13 @@ static int find_data_type_insn(struct data_loc_info *dloc,
 	return ret;
 }
 
+static int arch_supports_insn_tracking(struct data_loc_info *dloc)
+{
+	if ((arch__is(dloc->arch, "x86")) || (arch__is(dloc->arch, "powerpc")))
+		return 1;
+	return 0;
+}
+
 /*
  * Construct a list of basic blocks for each scope with variables and try to find
  * the data type by updating a type state table through instructions.
@@ -1093,7 +1100,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
 	int ret = -1;
 
 	/* TODO: other architecture support */
-	if (!arch__is(dloc->arch, "x86"))
+	if (!arch_supports_insn_tracking(dloc))
 		return -1;
 
 	prev_dst_ip = dst_ip = dloc->ip;
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index f41a0fadeab4..ac6b8b8da38a 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -151,6 +151,7 @@ static struct arch architectures[] = {
 	{
 		.name = "powerpc",
 		.init = powerpc__annotate_init,
+		.update_insn_state = update_insn_state_powerpc,
 	},
 	{
 		.name = "riscv64",
-- 
2.43.0
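To make the register state transfer concrete, here is a toy model of the tracking step for "mr". `struct reg_state`, `track_mr` and the string used as a type are invented for this sketch (the patch works on struct type_state with DWARF dies); only the idea of copying the source register's state to the destination, or invalidating the destination when the source is unknown, is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_REGS 32	/* powerpc has 32 general purpose registers */

/* Hypothetical per-register state; a string stands in for the type die */
struct reg_state {
	const char *type;
	bool ok;	/* true when @type is known to be valid */
};

/* Model "mr dst,src": dst inherits whatever is known about src */
static void track_mr(struct reg_state regs[NR_REGS], int dst, int src)
{
	if (dst < 0 || dst >= NR_REGS)
		return;

	/* Unknown source: the destination no longer has a known type */
	if (src < 0 || src >= NR_REGS || !regs[src].ok) {
		regs[dst].ok = false;
		return;
	}

	regs[dst] = regs[src];
}
```

For the sequence in the commit message, marking r3 as known and then calling `track_mr(regs, 31, 3)` leaves r31 with r3's type, so the later access "ld r9,312(r31)" can be resolved against it.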
[PATCH V2 5/9] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc
Use the raw instruction code and macros to identify memory instructions and to extract the register fields and the offset. The implementation addresses the D-form, X-form and DS-form instructions. Two main functions are added.

A new parse function, "load_store__parse", is added as the instruction ops parser for memory instructions. Unlike other parsers (like mov__parse), this parser fills in only the "raw" field for source/target and the newly added "mem_ref" field. This is because there is no need to parse the disassembled code here; arch specific macros take care of extracting the offset and regs, which is easier and more precise.

In powerpc, all instructions with a primary opcode from 32 to 63 are memory instructions. Update the "ins__find" function to take "opcode" as a parameter as well. Don't use "extract_reg_offset"; instead use the newly added function "get_arch_regs", which sets the fields reg1, reg2 and offset depending on whether it is a source or target op.

Signed-off-by: Athira Rajeev
---
 tools/perf/arch/powerpc/util/dwarf-regs.c | 33 +
 tools/perf/util/annotate.c | 22 -
 tools/perf/util/disasm.c | 59 +--
 tools/perf/util/disasm.h | 4 +-
 tools/perf/util/include/dwarf-regs.h | 4 +-
 5 files changed, 114 insertions(+), 8 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index e60a71fd846e..3121c70dc0d3 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -102,6 +102,9 @@ int regs_query_register_offset(const char *name)
 #define	PPC_OP(op)	(((op) >> 26) & 0x3F)
 #define PPC_RA(a)	(((a) >> 16) & 0x1f)
 #define PPC_RT(t)	(((t) >> 21) & 0x1f)
+#define PPC_RB(b)	(((b) >> 11) & 0x1f)
+#define PPC_D(D)	((D) & 0xfffe)
+#define PPC_DS(DS)	((DS) & 0xfffc)
 
 int get_opcode_insn(unsigned int raw_insn)
 {
@@ -117,3 +120,33 @@ int get_target_reg(unsigned int raw_insn)
 {
 	return PPC_RT(raw_insn);
 }
+
+int get_offset_opcode(int raw_insn __maybe_unused)
+{
+	int opcode = PPC_OP(raw_insn);
+
+	/* DS- form */
+	if ((opcode == 58) || (opcode == 62))
+		return PPC_DS(raw_insn);
+	else
+		return PPC_D(raw_insn);
+}
+
+/*
+ * Fills the required fields for op_loc depending on if it
+ * is a source or target.
+ * D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
+ * DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
+ * X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
+ */
+void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused, struct annotated_op_loc *op_loc __maybe_unused)
+{
+	if (is_source)
+		op_loc->reg1 = get_source_reg(raw_insn);
+	else
+		op_loc->reg1 = get_target_reg(raw_insn);
+	if (op_loc->multi_regs)
+		op_loc->reg2 = PPC_RB(raw_insn);
+	if (op_loc->mem_ref)
+		op_loc->offset = get_offset_opcode(raw_insn);
+}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0f5e10654d09..48739c7ffdc7 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2073,6 +2073,12 @@ static int extract_reg_offset(struct arch *arch, const char *str,
 	return 0;
 }
 
+__weak void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
+		struct annotated_op_loc *op_loc __maybe_unused)
+{
+	return;
+}
+
 /**
  * annotate_get_insn_location - Get location of instruction
  * @arch: the architecture info
@@ -2117,10 +2123,12 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 	for_each_insn_op_loc(loc, i, op_loc) {
 		const char *insn_str = ops->source.raw;
 		bool multi_regs = ops->source.multi_regs;
+		bool mem_ref = ops->source.mem_ref;
 
 		if (i == INSN_OP_TARGET) {
 			insn_str = ops->target.raw;
 			multi_regs = ops->target.multi_regs;
+			mem_ref = ops->target.mem_ref;
 		}
 
 		/* Invalidate the register by default */
@@ -2130,7 +2138,19 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 		if (insn_str == NULL)
 			continue;
 
-		if (strchr(insn_str, arch->objdump.memory_ref_char)) {
+		/*
+		 * For powerpc, call get_arch_regs function which extracts the
+		 * required fields for op_loc, ie reg1, reg2, offset from the
+		 * raw instruction.
+		 */
+		if (arch__is(arch, "powerpc")) {
+
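The field extraction in this patch can be sketched as a standalone fragment. The macros below copy the shifts and masks from the diff; `struct op_fields`, `is_memory_insn` and `decode_ds_form` are hypothetical names used only for this illustration. The sketch handles only DS-form loads/stores such as the "ld r4,312(r1)" example (raw value 0xe8810138, assuming the bytes "38 01 81 e8" are in little-endian order), so it is a simplification and not the patch's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shifts and masks as in the patch */
#define PPC_OP(op)	(((op) >> 26) & 0x3F)
#define PPC_RA(a)	(((a) >> 16) & 0x1f)
#define PPC_RT(t)	(((t) >> 21) & 0x1f)
#define PPC_DS(DS)	((DS) & 0xfffc)

/* Hypothetical result holder for this sketch */
struct op_fields {
	int src_reg;	/* RA: base register of the memory reference */
	int dst_reg;	/* RT: register being loaded or stored */
	int offset;	/* DS: displacement from RA */
};

/* Per the commit message: primary opcodes 32 to 63 are memory insns */
static bool is_memory_insn(uint32_t raw_insn)
{
	uint32_t opcode = PPC_OP(raw_insn);

	return opcode >= 32 && opcode <= 63;
}

/* Decode "ins RT,DS(RA)"; returns false for any other form */
static bool decode_ds_form(uint32_t raw_insn, struct op_fields *out)
{
	uint32_t opcode = PPC_OP(raw_insn);

	if (opcode != 58 && opcode != 62)
		return false;

	out->src_reg = (int)PPC_RA(raw_insn);
	out->dst_reg = (int)PPC_RT(raw_insn);
	out->offset = (int)PPC_DS(raw_insn);
	return true;
}
```

For 0xe8810138 this gives src_reg = 1, dst_reg = 4 and offset = 312. Negative displacements would need sign extension, which this sketch leaves out.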
[PATCH V2 2/9] tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking
Add "update_insn_state" callback to "struct arch" to handle instruction tracking. Currently updating instruction state is handled by static function "update_insn_state_x86" which is defined in "annotate-data.c". Make this as a callback for specific arch and move to archs specific file "arch/x86/annotate/instructions.c" . This will help to add helper function for other platforms in file: "arch//annotate/instructions.c and make changes/updates easier. Define callback "update_insn_state" as part of "struct arch", also make some of the debug functions non-static so that it can be referenced from other places. Signed-off-by: Athira Rajeev --- tools/perf/arch/x86/annotate/instructions.c | 383 +++ tools/perf/util/annotate-data.c | 391 +--- tools/perf/util/annotate-data.h | 23 ++ tools/perf/util/disasm.c| 2 + tools/perf/util/disasm.h| 7 + 5 files changed, 423 insertions(+), 383 deletions(-) diff --git a/tools/perf/arch/x86/annotate/instructions.c b/tools/perf/arch/x86/annotate/instructions.c index 5cdf457f5cbe..cd2fa59a8034 100644 --- a/tools/perf/arch/x86/annotate/instructions.c +++ b/tools/perf/arch/x86/annotate/instructions.c @@ -206,3 +206,386 @@ static int x86__annotate_init(struct arch *arch, char *cpuid) arch->initialized = true; return err; } + +#ifdef HAVE_DWARF_SUPPORT +static void update_insn_state_x86(struct type_state *state, + struct data_loc_info *dloc, Dwarf_Die *cu_die, + struct disasm_line *dl) +{ + struct annotated_insn_loc loc; + struct annotated_op_loc *src = [INSN_OP_SOURCE]; + struct annotated_op_loc *dst = [INSN_OP_TARGET]; + struct type_state_reg *tsr; + Dwarf_Die type_die; + u32 insn_offset = dl->al.offset; + int fbreg = dloc->fbreg; + int fboff = 0; + + if (annotate_get_insn_location(dloc->arch, dl, ) < 0) + return; + + if (ins__is_call(>ins)) { + struct symbol *func = dl->ops.target.sym; + + if (func == NULL) + return; + + /* __fentry__ will preserve all registers */ + if (!strcmp(func->name, "__fentry__")) + return; + + pr_debug_dtp("call [%x] 
%s\n", insn_offset, func->name); + + /* Otherwise invalidate caller-saved registers after call */ + for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) { + if (state->regs[i].caller_saved) + state->regs[i].ok = false; + } + + /* Update register with the return type (if any) */ + if (die_find_func_rettype(cu_die, func->name, _die)) { + tsr = >regs[state->ret_reg]; + tsr->type = type_die; + tsr->kind = TSR_KIND_TYPE; + tsr->ok = true; + + pr_debug_dtp("call [%x] return -> reg%d", +insn_offset, state->ret_reg); + pr_debug_type_name(_die, tsr->kind); + } + return; + } + + if (!strncmp(dl->ins.name, "add", 3)) { + u64 imm_value = -1ULL; + int offset; + const char *var_name = NULL; + struct map_symbol *ms = dloc->ms; + u64 ip = ms->sym->start + dl->al.offset; + + if (!has_reg_type(state, dst->reg1)) + return; + + tsr = >regs[dst->reg1]; + + if (src->imm) + imm_value = src->offset; + else if (has_reg_type(state, src->reg1) && +state->regs[src->reg1].kind == TSR_KIND_CONST) + imm_value = state->regs[src->reg1].imm_value; + else if (src->reg1 == DWARF_REG_PC) { + u64 var_addr = annotate_calc_pcrel(dloc->ms, ip, + src->offset, dl); + + if (get_global_var_info(dloc, var_addr, + _name, ) && + !strcmp(var_name, "this_cpu_off") && + tsr->kind == TSR_KIND_CONST) { + tsr->kind = TSR_KIND_PERCPU_BASE; + imm_value = tsr->imm_value; + } + } + else + return; + + if (tsr->kind != TSR_KIND_PERCPU_
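The callback arrangement introduced by this patch can be pictured with a cut-down model. `struct mini_arch`, `struct mini_state`, `track_insn` and the two update functions below are invented for this example; the real "struct arch" and the real "update_insn_state" signature carry more parameters, so this only shows the dispatch idea.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much reduced tracking state */
struct mini_state {
	int updates;	/* how many instructions changed the state */
};

/* Hypothetical, much reduced "struct arch" with the new callback */
struct mini_arch {
	const char *name;
	void (*update_insn_state)(struct mini_state *state, const char *insn);
};

static void update_insn_state_x86(struct mini_state *state, const char *insn)
{
	if (!strncmp(insn, "mov", 3))
		state->updates++;
}

static void update_insn_state_powerpc(struct mini_state *state, const char *insn)
{
	if (!strncmp(insn, "mr", 2))
		state->updates++;
}

static const struct mini_arch architectures[] = {
	{ .name = "x86",	.update_insn_state = update_insn_state_x86 },
	{ .name = "powerpc",	.update_insn_state = update_insn_state_powerpc },
	{ .name = "riscv64" },	/* no callback: tracking not supported */
};

/* Returns 0 if the arch has a tracking callback and it was run, else -1 */
static int track_insn(const char *arch_name, struct mini_state *state,
		      const char *insn)
{
	size_t i;

	for (i = 0; i < sizeof(architectures) / sizeof(architectures[0]); i++) {
		const struct mini_arch *arch = &architectures[i];

		if (strcmp(arch->name, arch_name))
			continue;
		if (!arch->update_insn_state)
			return -1;
		arch->update_insn_state(state, insn);
		return 0;
	}
	return -1;
}
```

Generic code then calls through the pointer without knowing which architecture it is handling, and an architecture that leaves the callback unset simply opts out of instruction tracking.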
[PATCH V2 3/9] tools/perf: Fix a comment about multi_regs in extract_reg_offset function
Fix the comment that explains how the multi_regs field gets set for an instruction. In the example "mov %rsi, 8(%rbx,%rcx,4)", the comment mistakenly referred to "dst_multi_regs = 0" on the source-operand line. Correct it to "src_multi_regs = 0".

Signed-off-by: Athira Rajeev
---
 tools/perf/util/annotate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f5b6b5e5e757..0f5e10654d09 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2093,7 +2093,7 @@ static int extract_reg_offset(struct arch *arch, const char *str,
  *	mov  0x18, %r8      # src_reg1 = -1, src_mem = 0
  *	                    # dst_reg1 = r8, dst_mem = 0
  *
- *	mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, dst_multi_regs = 0
+ *	mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, src_multi_regs = 0
  *	                           # dst_reg1 = rbx, dst_reg2 = rcx, dst_mem = 1
  *	                           # dst_multi_regs = 1, dst_offset = 8
  */
-- 
2.43.0
[PATCH V2 1/9] tools/perf: Move the data structures related to register type to header file
Data type profiling uses instruction tracking by checking each instruction and updating the register type state in some data structures. This is useful to find the data type in cases when the register state gets transferred from one reg to another. Example, in x86, "mov" instruction and in powerpc, "mr" instruction. Currently these structures are defined in annotate-data.c and instruction tracking is implemented only for x86. Move these data structures to "annotate-data.h" header file so that other arch implementations can use it in arch specific files as well. Signed-off-by: Athira Rajeev --- tools/perf/util/annotate-data.c | 53 +-- tools/perf/util/annotate-data.h | 55 + 2 files changed, 56 insertions(+), 52 deletions(-) diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c index 2c98813f95cd..e812dec09c99 100644 --- a/tools/perf/util/annotate-data.c +++ b/tools/perf/util/annotate-data.c @@ -30,15 +30,6 @@ static void delete_var_types(struct die_var_type *var_types); -enum type_state_kind { - TSR_KIND_INVALID = 0, - TSR_KIND_TYPE, - TSR_KIND_PERCPU_BASE, - TSR_KIND_CONST, - TSR_KIND_POINTER, - TSR_KIND_CANARY, -}; - #define pr_debug_dtp(fmt, ...) \ do { \ if (debug_type_profile) \ @@ -139,49 +130,7 @@ static void pr_debug_location(Dwarf_Die *die, u64 pc, int reg) } } -/* - * Type information in a register, valid when @ok is true. - * The @caller_saved registers are invalidated after a function call. - */ -struct type_state_reg { - Dwarf_Die type; - u32 imm_value; - bool ok; - bool caller_saved; - u8 kind; -}; - -/* Type information in a stack location, dynamically allocated */ -struct type_state_stack { - struct list_head list; - Dwarf_Die type; - int offset; - int size; - bool compound; - u8 kind; -}; - -/* FIXME: This should be arch-dependent */ -#define TYPE_STATE_MAX_REGS 16 - -/* - * State table to maintain type info in each register and stack location. 
- * It'll be updated when new variable is allocated or type info is moved - * to a new location (register or stack). As it'd be used with the - * shortest path of basic blocks, it only maintains a single table. - */ -struct type_state { - /* state of general purpose registers */ - struct type_state_reg regs[TYPE_STATE_MAX_REGS]; - /* state of stack location */ - struct list_head stack_vars; - /* return value register */ - int ret_reg; - /* stack pointer register */ - int stack_reg; -}; - -static bool has_reg_type(struct type_state *state, int reg) +bool has_reg_type(struct type_state *state, int reg) { return (unsigned)reg < ARRAY_SIZE(state->regs); } diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h index 0a57d9f5ee78..ef235b1b15e1 100644 --- a/tools/perf/util/annotate-data.h +++ b/tools/perf/util/annotate-data.h @@ -6,6 +6,9 @@ #include #include #include +#include "dwarf-aux.h" +#include "annotate.h" +#include "debuginfo.h" struct annotated_op_loc; struct debuginfo; @@ -15,6 +18,15 @@ struct hist_entry; struct map_symbol; struct thread; +enum type_state_kind { + TSR_KIND_INVALID = 0, + TSR_KIND_TYPE, + TSR_KIND_PERCPU_BASE, + TSR_KIND_CONST, + TSR_KIND_POINTER, + TSR_KIND_CANARY, +}; + /** * struct annotated_member - Type of member field * @node: List entry in the parent list @@ -142,6 +154,48 @@ struct annotated_data_stat { }; extern struct annotated_data_stat ann_data_stat; +/* + * Type information in a register, valid when @ok is true. + * The @caller_saved registers are invalidated after a function call. 
+ */ +struct type_state_reg { + Dwarf_Die type; + u32 imm_value; + bool ok; + bool caller_saved; + u8 kind; +}; + +/* Type information in a stack location, dynamically allocated */ +struct type_state_stack { + struct list_head list; + Dwarf_Die type; + int offset; + int size; + bool compound; + u8 kind; +}; + +/* FIXME: This should be arch-dependent */ +#define TYPE_STATE_MAX_REGS 32 + +/* + * State table to maintain type info in each register and stack location. + * It'll be updated when new variable is allocated or type info is moved + * to a new location (register or stack). As it'd be used with the + * shortest path of basic blocks, it only maintains a single table. + */ +struct type_state { + /* state of general purpose registers */ + struct type_state_reg regs[TYPE_STATE_MAX_REGS]; + /* state of stack location */ + struct list_head stack_vars; + /* return value register */ + int r
[PATCH V2 0/9] Add data type profiling support for powerpc
03%  struct pt_regs             struct pt_regs +264 (user_regs.msr)
 1.00%  struct menu_device         struct menu_device +4 (tick_wakeup)
 0.90%  struct security_hook_list  struct security_hook_list +0 (list.next)
 0.76%  struct irq_desc            struct irq_desc +304 (irq_data.chip)
 0.76%  struct rq                  struct rq +2856 (cpu)

Thanks
Athira Rajeev

Changelog:
From v1->v2:
- Addressed suggestion from Christophe Leroy and Segher Boessenkool to use the binary code (raw insn) to fetch opcode, register and offset fields.
- Added support for instruction tracking in powerpc
- Find the register defined variables (r13 and r1 which points to local_paca and current_stack_pointer in powerpc)

Athira Rajeev (9):
  tools/perf: Move the data structures related to register type to header file
  tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking
  tools/perf: Fix a comment about multi_regs in extract_reg_offset function
  tools/perf: Add support to capture and parse raw instruction in objdump
  tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc
  tools/perf: Update instruction tracking for powerpc
  tools/perf: Update instruction tracking with add instruction
  tools/perf: Add support to find global register variables using find_data_type_global_reg
  tools/perf: Add support for global_die to capture name of variable in case of register defined variable

 tools/include/linux/string.h                  |   2 +
 tools/lib/string.c                            |  13 +
 .../perf/arch/powerpc/annotate/instructions.c |  84 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c     |  52 ++
 tools/perf/arch/x86/annotate/instructions.c   | 383 +
 tools/perf/util/annotate-data.c               | 519 +++---
 tools/perf/util/annotate-data.h               |  78 +++
 tools/perf/util/annotate.c                    |  32 +-
 tools/perf/util/annotate.h                    |   1 +
 tools/perf/util/disasm.c                      | 109 +++-
 tools/perf/util/disasm.h                      |  17 +-
 tools/perf/util/dwarf-aux.c                   |   1 +
 tools/perf/util/dwarf-aux.h                   |   1 +
 tools/perf/util/include/dwarf-regs.h          |  12 +
 tools/perf/util/sort.c                        |   7 +-
 15 files changed, 854 insertions(+), 457 deletions(-)

-- 
2.43.0
Re: [PATCH 2/3] tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc
> On 09-Mar-2024, at 11:13 PM, Segher Boessenkool wrote:
>
> All instructions with a primary opcode from 32 to 63 are memory insns,
> and no others.  It's trivial to see whether it is a load or store, too
> (just bit 3 of the insn).  Trying to parse disassembled code is much
> harder, and you easily make some mistakes here.

Hi Segher,

Thanks for checking the patch and sharing review comments. OK, I am checking on this part.

>
> On Sat, Mar 09, 2024 at 12:55:12PM +0530, Athira Rajeev wrote:
>> To identify if the instruction has any memory reference,
>> "memory_ref_char" field needs to be set for specific architecture.
>> Example memory instruction:
>> lwz r10,0(r9)
>>
>> Here "(" is the memory_ref_char. Set this as part of arch->objdump
>
> What about "lwzx r10,0,r19", semantically exactly the same instruction?
> There is no paren in there.  Not all instructions using parens are
> memory insns, either, not in assembler code at least.

Yes, right, Segher. For the basic foundational patches, I targeted instructions of this form (D-form). There are still samples that come up as unknown, and for those, X-form instructions also need to be checked. The aim was to first get these basic foundational patches in to add support for powerpc, and to address the remaining "unknowns" in a follow-up. But yes, X-form instructions will also be covered as part of the changes needed for powerpc.

>
>> To get register number and access offset from the given instruction,
>> arch->objdump.register_char is used. In case of powerpc, the register
>> char is "r" (since reg names are r0 to r31). Hence set register_char
>> as "r".
>
> cr0..cr7
> r0..r31
> f0..f31
> v0..v31
> vs0..vs63
> and many other spellings.  Plain "0..63" is also fine.

Ok

>
> The "0" in my lwzx example is a register field as well (and it stands
> for "no register", not for "r0").  Called "(RA|0)" usually (incidentally,
> see the parens there as well, oh joy).
>
> Don't you have the binary code here as well, not just a disassembled
> representation of it?  It is way easier to just use that, and you'll get
> much better results that way :-)
>

Thanks, Segher, for the suggestion on this. I will check on this as well.

Thanks
Athira Rajeev

>
> Segher
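To make the "use the binary code" suggestion concrete, here is a minimal sketch of pulling fields out of a 32-bit powerpc instruction word. The helper names (ppc_primary_opcode, ppc_rt, ppc_ra, ppc_opcode_in_mem_range, ppc_store_bit, ppc_d_form_disp) are hypothetical and are not the macros added by the later patch version. The range check and the "bit 3" test simply follow the rule of thumb quoted in the review (IBM bit numbering, bit 0 is the most significant bit). X-form instructions such as lwzx carry primary opcode 31, so they would additionally need the extended opcode field inspected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 0-5 (IBM numbering): the primary opcode. */
static inline uint32_t ppc_primary_opcode(uint32_t insn)
{
	return insn >> 26;
}

/* Bits 6-10: RT (load target) or RS (store source). */
static inline unsigned int ppc_rt(uint32_t insn)
{
	return (insn >> 21) & 0x1f;
}

/* Bits 11-15: RA, the base register of a D-form memory access. */
static inline unsigned int ppc_ra(uint32_t insn)
{
	return (insn >> 16) & 0x1f;
}

/* Rule of thumb from the review: primary opcodes 32..63 access memory. */
static inline bool ppc_opcode_in_mem_range(uint32_t insn)
{
	uint32_t op = ppc_primary_opcode(insn);

	return op >= 32 && op <= 63;
}

/* IBM bit 3 of the word, suggested in the review as the load/store hint. */
static inline bool ppc_store_bit(uint32_t insn)
{
	return (insn >> 28) & 1;
}

/* Low 16 bits as a signed displacement (for DS-form the low 2 bits are not part of it). */
static inline int32_t ppc_d_form_disp(uint32_t insn, bool ds_form)
{
	int32_t d = (int32_t)(insn & (ds_form ? 0xfffcu : 0xffffu));

	if (d & 0x8000)
		d -= 0x10000;
	return d;
}
```

For example, the word 0x81490000 encodes "lwz r10,0(r9)": the helpers yield primary opcode 32, RT 10, RA 9 and displacement 0, with no text parsing involved.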
Re: [PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling
> On 09-Mar-2024, at 3:18 PM, Christophe Leroy > wrote: > > > > Le 09/03/2024 à 08:25, Athira Rajeev a écrit : >> Add powerpc instruction nmemonic table to associate load/store >> instructions with move_ops. mov_ops is used to identify mem_type >> to associate instruction with data type and offset. Also initialize >> and allocate arch specific fields for nr_instructions, instructions and >> nr_instructions_allocate. >> >> Signed-off-by: Athira Rajeev >> --- >> .../perf/arch/powerpc/annotate/instructions.c | 66 +++ >> 1 file changed, 66 insertions(+) >> >> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c >> b/tools/perf/arch/powerpc/annotate/instructions.c >> index a3f423c27cae..07af4442be38 100644 >> --- a/tools/perf/arch/powerpc/annotate/instructions.c >> +++ b/tools/perf/arch/powerpc/annotate/instructions.c >> @@ -1,6 +1,65 @@ >> // SPDX-License-Identifier: GPL-2.0 >> #include >> >> +/* >> + * powerpc instruction nmemonic table to associate load/store instructions >> with >> + * move_ops. mov_ops is used to identify mem_type to associate instruction >> with >> + * data type and offset. 
>> + */ >> +static struct ins powerpc__instructions[] = { >> + { .name = "lbz", .ops = _ops, }, >> + { .name = "lbzx", .ops = _ops, }, >> + { .name = "lbzu", .ops = _ops, }, >> + { .name = "lbzux", .ops = _ops, }, >> + { .name = "lhz", .ops = _ops, }, >> + { .name = "lhzx", .ops = _ops, }, >> + { .name = "lhzu", .ops = _ops, }, >> + { .name = "lhzux", .ops = _ops, }, >> + { .name = "lha", .ops = _ops, }, >> + { .name = "lhax", .ops = _ops, }, >> + { .name = "lhau", .ops = _ops, }, >> + { .name = "lhaux", .ops = _ops, }, >> + { .name = "lwz", .ops = _ops, }, >> + { .name = "lwzx", .ops = _ops, }, >> + { .name = "lwzu", .ops = _ops, }, >> + { .name = "lwzux", .ops = _ops, }, >> + { .name = "lwa", .ops = _ops, }, >> + { .name = "lwax", .ops = _ops, }, >> + { .name = "lwaux", .ops = _ops, }, >> + { .name = "ld", .ops = _ops, }, >> + { .name = "ldx", .ops = _ops, }, >> + { .name = "ldu", .ops = _ops, }, >> + { .name = "ldux", .ops = _ops, }, >> + { .name = "stb", .ops = _ops, }, >> + { .name = "stbx", .ops = _ops, }, >> + { .name = "stbu", .ops = _ops, }, >> + { .name = "stbux", .ops = _ops, }, >> + { .name = "sth", .ops = _ops, }, >> + { .name = "sthx", .ops = _ops, }, >> + { .name = "sthu", .ops = _ops, }, >> + { .name = "sthux", .ops = _ops, }, >> + { .name = "stw", .ops = _ops, }, >> + { .name = "stwx", .ops = _ops, }, >> + { .name = "stwu", .ops = _ops, }, >> + { .name = "stwux", .ops = _ops, }, >> + { .name = "std", .ops = _ops, }, >> + { .name = "stdx", .ops = _ops, }, >> + { .name = "stdu", .ops = _ops, }, >> + { .name = "stdux", .ops = _ops, }, >> + { .name = "lhbrx", .ops = _ops, }, >> + { .name = "sthbrx", .ops = _ops, }, >> + { .name = "lwbrx", .ops = _ops, }, >> + { .name = "stwbrx", .ops = _ops, }, >> + { .name = "ldbrx", .ops = _ops, }, >> + { .name = "stdbrx", .ops = _ops, }, >> + { .name = "lmw", .ops = _ops, }, >> + { .name = "stmw", .ops = _ops, }, >> + { .name = "lswi", .ops = _ops, }, >> + { .name = "lswx", .ops = _ops, }, >> + { .name = "stswi", 
.ops = _ops, }, >> + { .name = "stswx", .ops = _ops, }, >> +}; > > What about lwarx and stwcx ? Yes, Will add those in next version > >> + >> static struct ins_ops *powerpc__associate_instruction_ops(struct arch >> *arch, const char *name) >> { >> int i; >> @@ -52,6 +111,13 @@ static struct ins_ops >> *powerpc__associate_instruction_ops(struct arch *arch, con >> static int powerpc__annotate_init(struct arch *arch, char *cpuid >> __maybe_unused) >> { >> if (!arch->initialized) { >> + arch->nr_instructions = ARRAY_SIZE(powerpc__instructions); >> + arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins)); >> + if (arch->instructions == NULL) > > Prefered form is > > if (!arch->instructions) Ok , will make this change > >> + return -ENOMEM; >> + >> + memcpy(arch->instructions, (struct ins *)powerpc__instructions, >> sizeof(struct ins) * arch->nr_instructions); > > No need to cast powerpc__instructions, it is already a pointer. Yes, I will correct it Thanks Athira Rajeev > > >> + arch->nr_instructions_allocated = arch->nr_instructions; >> arch->initialized = true; >> arch->associate_instruction_ops = powerpc__associate_instruction_ops; >> arch->objdump.comment_char = '#';
Re: [PATCH 3/3] tools/perf/arch/powerc: Add get_arch_regnum for powerpc
> On 09-Mar-2024, at 3:24 PM, Christophe Leroy > wrote: > > > > Le 09/03/2024 à 08:25, Athira Rajeev a écrit : >> The function get_dwarf_regnum() returns a DWARF register number >> from a register name string. This calls arch specific function >> get_arch_regnum to return register number for corresponding arch. >> Add mappings for register name to register number in powerpc code: >> arch/powerpc/util/dwarf-regs.c >> >> Signed-off-by: Athira Rajeev >> --- >> tools/perf/arch/powerpc/util/dwarf-regs.c | 29 +++ >> 1 file changed, 29 insertions(+) >> >> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c >> b/tools/perf/arch/powerpc/util/dwarf-regs.c >> index 0c4f4caf53ac..d955e3e577ea 100644 >> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c >> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c >> @@ -98,3 +98,32 @@ int regs_query_register_offset(const char *name) >> return roff->ptregs_offset; >> return -EINVAL; >> } >> + >> +struct dwarf_regs_idx { >> + const char *name; >> + int idx; >> +}; >> + >> +static const struct dwarf_regs_idx powerpc_regidx_table[] = { >> + { "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }, { "r4", 4 }, >> + { "r5", 5 }, { "r6", 6 }, { "r7", 7 }, { "r8", 8 }, { "r9", 9 }, >> + { "r10", 10 }, { "r11", 11 }, { "r12", 12 }, { "r13", 13 }, { "r14", 14 }, >> + { "r15", 15 }, { "r16", 16 }, { "r17", 17 }, { "r18", 18 }, { "r19", 19 }, >> + { "r20", 20 }, { "r21", 21 }, { "r22", 22 }, { "r23", 23 }, { "r24", 24 }, >> + { "r25", 25 }, { "r26", 26 }, { "r27", 27 }, { "r27", 27 }, { "r28", 28 }, >> + { "r29", 29 }, { "r30", 30 }, { "r31", 31 }, >> +}; >> + >> +int get_arch_regnum(const char *name) >> +{ >> + unsigned int i; >> + >> + if (*name != 'r') >> + return -EINVAL; >> + >> + for (i = 0; i < ARRAY_SIZE(powerpc_regidx_table); i++) >> + if (!strcmp(powerpc_regidx_table[i].name, name)) >> + return powerpc_regidx_table[i].idx; > > Can you do more simple ? 
>
> Something like:
>
> 	int n;
>
> 	if (*name != 'r')
> 		return -EINVAL;
> 	n = atoi(name + 1);
> 	return n >= 0 && n < 32 ? n : -ENOENT;

Hi Christophe,

Thanks for reviewing the patch and for the suggestions. Sure, I will check this approach and address it in V2.

Thanks
Athira

>
>> +
>> +	return -ENOENT;
>> +}
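A slightly stricter variant of the suggestion above is sketched here, under the assumption that trailing garbage should be rejected: atoi() would parse "r1x" as 1 and "r" as 0, whereas strtol() with an end pointer makes both detectable. The function name toy_get_arch_regnum is hypothetical; this is not the code that went into V2.

```c
#include <errno.h>
#include <stdlib.h>

/* Map "r0".."r31" to 0..31; anything else yields a negative errno value. */
static int toy_get_arch_regnum(const char *name)
{
	char *end;
	long n;

	if (*name != 'r')
		return -EINVAL;

	/* Require a digit right after 'r' so "r", "r-1" and "r 5" are rejected. */
	if (name[1] < '0' || name[1] > '9')
		return -ENOENT;

	n = strtol(name + 1, &end, 10);
	if (*end != '\0' || n > 31)
		return -ENOENT;

	return (int)n;
}
```

Either form replaces the 32-entry lookup table with arithmetic on the name, which also sidesteps table typos such as a repeated entry.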
[PATCH 3/3] tools/perf/arch/powerc: Add get_arch_regnum for powerpc
The function get_dwarf_regnum() returns a DWARF register number from a register name string. It calls the arch-specific function get_arch_regnum() to return the register number for the corresponding arch. Add the register-name-to-register-number mappings in the powerpc code: arch/powerpc/util/dwarf-regs.c

Signed-off-by: Athira Rajeev
---
 tools/perf/arch/powerpc/util/dwarf-regs.c | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 0c4f4caf53ac..d955e3e577ea 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -98,3 +98,32 @@ int regs_query_register_offset(const char *name)
 			return roff->ptregs_offset;
 	return -EINVAL;
 }
+
+struct dwarf_regs_idx {
+	const char *name;
+	int idx;
+};
+
+static const struct dwarf_regs_idx powerpc_regidx_table[] = {
+	{ "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }, { "r4", 4 },
+	{ "r5", 5 }, { "r6", 6 }, { "r7", 7 }, { "r8", 8 }, { "r9", 9 },
+	{ "r10", 10 }, { "r11", 11 }, { "r12", 12 }, { "r13", 13 }, { "r14", 14 },
+	{ "r15", 15 }, { "r16", 16 }, { "r17", 17 }, { "r18", 18 }, { "r19", 19 },
+	{ "r20", 20 }, { "r21", 21 }, { "r22", 22 }, { "r23", 23 }, { "r24", 24 },
+	{ "r25", 25 }, { "r26", 26 }, { "r27", 27 }, { "r28", 28 },
+	{ "r29", 29 }, { "r30", 30 }, { "r31", 31 },
+};
+
+int get_arch_regnum(const char *name)
+{
+	unsigned int i;
+
+	if (*name != 'r')
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(powerpc_regidx_table); i++)
+		if (!strcmp(powerpc_regidx_table[i].name, name))
+			return powerpc_regidx_table[i].idx;
+
+	return -ENOENT;
+}
-- 
2.43.0
[PATCH 2/3] tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc
To identify whether an instruction has a memory reference, the "memory_ref_char" field needs to be set for the specific architecture. Example memory instruction:

	lwz r10,0(r9)

Here "(" is the memory_ref_char. Set this as part of arch->objdump.

To get the register number and access offset from a given instruction, arch->objdump.register_char is used. In the case of powerpc, the register char is "r" (since reg names are r0 to r31). Hence set register_char to "r".

Signed-off-by: Athira Rajeev
---
 tools/perf/util/annotate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ac002d907d81..d69bd6edafcb 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -216,6 +216,11 @@ static struct arch architectures[] = {
 	{
 		.name = "powerpc",
 		.init = powerpc__annotate_init,
+		.objdump = {
+			.comment_char = '#',
+			.register_char = 'r',
+			.memory_ref_char = '(',
+		},
 	},
 	{
 		.name = "riscv64",
-- 
2.43.0
[PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling
Add a powerpc instruction mnemonic table to associate load/store instructions with mov_ops. mov_ops is used to identify mem_type, to associate an instruction with its data type and offset. Also initialize and allocate the arch-specific fields nr_instructions, instructions and nr_instructions_allocated.

Signed-off-by: Athira Rajeev
---
 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 1 file changed, 66 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..07af4442be38 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -1,6 +1,65 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/compiler.h>
 
+/*
+ * powerpc instruction mnemonic table to associate load/store instructions with
+ * mov_ops. mov_ops is used to identify mem_type to associate instruction with
+ * data type and offset.
+ */
+static struct ins powerpc__instructions[] = {
+	{ .name = "lbz",	.ops = &mov_ops, },
+	{ .name = "lbzx",	.ops = &mov_ops, },
+	{ .name = "lbzu",	.ops = &mov_ops, },
+	{ .name = "lbzux",	.ops = &mov_ops, },
+	{ .name = "lhz",	.ops = &mov_ops, },
+	{ .name = "lhzx",	.ops = &mov_ops, },
+	{ .name = "lhzu",	.ops = &mov_ops, },
+	{ .name = "lhzux",	.ops = &mov_ops, },
+	{ .name = "lha",	.ops = &mov_ops, },
+	{ .name = "lhax",	.ops = &mov_ops, },
+	{ .name = "lhau",	.ops = &mov_ops, },
+	{ .name = "lhaux",	.ops = &mov_ops, },
+	{ .name = "lwz",	.ops = &mov_ops, },
+	{ .name = "lwzx",	.ops = &mov_ops, },
+	{ .name = "lwzu",	.ops = &mov_ops, },
+	{ .name = "lwzux",	.ops = &mov_ops, },
+	{ .name = "lwa",	.ops = &mov_ops, },
+	{ .name = "lwax",	.ops = &mov_ops, },
+	{ .name = "lwaux",	.ops = &mov_ops, },
+	{ .name = "ld",		.ops = &mov_ops, },
+	{ .name = "ldx",	.ops = &mov_ops, },
+	{ .name = "ldu",	.ops = &mov_ops, },
+	{ .name = "ldux",	.ops = &mov_ops, },
+	{ .name = "stb",	.ops = &mov_ops, },
+	{ .name = "stbx",	.ops = &mov_ops, },
+	{ .name = "stbu",	.ops = &mov_ops, },
+	{ .name = "stbux",	.ops = &mov_ops, },
+	{ .name = "sth",	.ops = &mov_ops, },
+	{ .name = "sthx",	.ops = &mov_ops, },
+	{ .name = "sthu",	.ops = &mov_ops, },
+	{ .name = "sthux",	.ops = &mov_ops, },
+	{ .name = "stw",	.ops = &mov_ops, },
+	{ .name = "stwx",	.ops = &mov_ops, },
+	{ .name = "stwu",	.ops = &mov_ops, },
+	{ .name = "stwux",	.ops = &mov_ops, },
+	{ .name = "std",	.ops = &mov_ops, },
+	{ .name = "stdx",	.ops = &mov_ops, },
+	{ .name = "stdu",	.ops = &mov_ops, },
+	{ .name = "stdux",	.ops = &mov_ops, },
+	{ .name = "lhbrx",	.ops = &mov_ops, },
+	{ .name = "sthbrx",	.ops = &mov_ops, },
+	{ .name = "lwbrx",	.ops = &mov_ops, },
+	{ .name = "stwbrx",	.ops = &mov_ops, },
+	{ .name = "ldbrx",	.ops = &mov_ops, },
+	{ .name = "stdbrx",	.ops = &mov_ops, },
+	{ .name = "lmw",	.ops = &mov_ops, },
+	{ .name = "stmw",	.ops = &mov_ops, },
+	{ .name = "lswi",	.ops = &mov_ops, },
+	{ .name = "lswx",	.ops = &mov_ops, },
+	{ .name = "stswi",	.ops = &mov_ops, },
+	{ .name = "stswx",	.ops = &mov_ops, },
+};
+
 static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, const char *name)
 {
 	int i;
@@ -52,6 +111,13 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
 static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
 {
 	if (!arch->initialized) {
+		arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
+		arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins));
+		if (arch->instructions == NULL)
+			return -ENOMEM;
+
+		memcpy(arch->instructions, (struct ins *)powerpc__instructions, sizeof(struct ins) * arch->nr_instructions);
+		arch->nr_instructions_allocated = arch->nr_instructions;
 		arch->initialized = true;
 		arch->associate_instruction_ops = powerpc__associate_instruction_ops;
 		arch->objdump.comment_char = '#';
-- 
2.43.0
[PATCH 0/3] Add data type profiling support for powerpc
The patchset from Namhyung added support for data type profiling in the perf tool. This enabled support to associate PMU samples with the data types they refer to, using DWARF debug information. With upstream perf, it is currently possible to run perf report or perf annotate to view the data type information on x86.

This patchset includes the changes needed to enable data type profiling support for powerpc. The main changes are:

1. A powerpc instruction mnemonic table to associate load/store instructions with mov_ops, which is used to identify whether an instruction is a memory access.
2. To get the register number and access offset from a given instruction, the code uses fields from "struct arch" -> objdump. Add an entry for powerpc here.
3. A get_arch_regnum() to return the register number from the register name string.

These three patches are the basic foundational patches. With these changes, data types are identified for kernel and user-space samples. There are still samples that come up as "unknown" and need to be checked. We plan to get those addressed in a follow-up.

With the current patchset:

 ./perf record -a -e mem-loads sleep 1
 ./perf report -s type,typeoff --hierarchy --group --stdio

Snippet of logs:

 Total Lost Samples: 0

 Samples: 277 of events 'mem-loads, dummy:u'
 Event count (approx.): 149813

       Overhead       Data Type / Data Type Offset
 ...................  ............................

  65.93%   0.00%      (unknown)
     65.93%   0.00%      (unknown) +0 (no field)
   8.19%   0.00%      struct vm_area_struct
      8.19%   0.00%      struct vm_area_struct +136 (vm_file)
   4.53%   0.00%      struct rq
      3.14%   0.00%      struct rq +0 (__lock.raw_lock.val)
      0.83%   0.00%      struct rq +3216 (avg_irq.runnable_sum)
      0.24%   0.00%      struct rq +4 (nr_running)
      0.14%   0.00%      struct rq +12 (nr_preferred_running)
      0.12%   0.00%      struct rq +2760 (sd)
      0.06%   0.00%      struct rq +3368 (prev_steal_time_rq)
      0.01%   0.00%      struct rq +2592 (curr)
   3.53%   0.00%      struct rb_node
      3.53%   0.00%      struct rb_node +0 (__rb_parent_color)
   3.43%   0.00%      struct slab
      3.43%   0.00%      struct slab +32 (freelist)
   3.30%   0.00%      unsigned int
      3.30%   0.00%      unsigned int +0 (no field)
   3.22%   0.00%      struct vm_fault
      3.22%   0.00%      struct vm_fault +48 (pmd)
   2.55%   0.00%      unsigned char
      2.55%   0.00%      unsigned char +0 (no field)
   1.06%   0.00%      struct task_struct
      1.06%   0.00%      struct task_struct +4 (thread_info.cpu)
   0.92%   0.00%      void*
      0.92%   0.00%      void* +0 (no field)
   0.74%   0.00%      __int128 unsigned
      0.74%   0.00%      __int128 unsigned +8 (no field)
   0.59%   0.00%      struct perf_event
      0.54%   0.00%      struct perf_event +552 (ctx)
      0.04%   0.00%      struct perf_event +152 (pmu)
   0.20%   0.00%      struct sched_entity
      0.20%   0.00%      struct sched_entity +0 (load.weight)
   0.18%   0.00%      struct cfs_rq
      0.18%   0.00%      struct cfs_rq +96 (curr)

Thanks
Athira

Athira Rajeev (3):
  tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling
  tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc
  tools/perf/arch/powerc: Add get_arch_regnum for powerpc

 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c     | 29
 tools/perf/util/annotate.c                    |  5 ++
 3 files changed, 100 insertions(+)

-- 
2.43.0
Re: [PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling
Hi All,

Please ignore this version. I made a mistake in the cover letter. I am re-posting the correct version now. Sorry for the confusion.

Thanks
Athira

> On 09-Mar-2024, at 11:21 AM, Athira Rajeev wrote:
>
> Add powerpc instruction nmemonic table to associate load/store
> instructions with move_ops. mov_ops is used to identify mem_type
> to associate instruction with data type and offset. Also initialize
> and allocate arch specific fields for nr_instructions, instructions and
> nr_instructions_allocate.
>
> Signed-off-by: Athira Rajeev
> ---
> .../perf/arch/powerpc/annotate/instructions.c | 66 +++
> 1 file changed, 66 insertions(+)
>
> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
> index a3f423c27cae..07af4442be38 100644
> --- a/tools/perf/arch/powerpc/annotate/instructions.c
> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
> @@ -1,6 +1,65 @@
> // SPDX-License-Identifier: GPL-2.0
> #include <linux/compiler.h>
>
> +/*
> + * powerpc instruction nmemonic table to associate load/store instructions with
> + * move_ops. mov_ops is used to identify mem_type to associate instruction with
> + * data type and offset.
> + */ > +static struct ins powerpc__instructions[] = { > + { .name = "lbz", .ops = _ops, }, > + { .name = "lbzx", .ops = _ops, }, > + { .name = "lbzu", .ops = _ops, }, > + { .name = "lbzux", .ops = _ops, }, > + { .name = "lhz", .ops = _ops, }, > + { .name = "lhzx", .ops = _ops, }, > + { .name = "lhzu", .ops = _ops, }, > + { .name = "lhzux", .ops = _ops, }, > + { .name = "lha", .ops = _ops, }, > + { .name = "lhax", .ops = _ops, }, > + { .name = "lhau", .ops = _ops, }, > + { .name = "lhaux", .ops = _ops, }, > + { .name = "lwz", .ops = _ops, }, > + { .name = "lwzx", .ops = _ops, }, > + { .name = "lwzu", .ops = _ops, }, > + { .name = "lwzux", .ops = _ops, }, > + { .name = "lwa", .ops = _ops, }, > + { .name = "lwax", .ops = _ops, }, > + { .name = "lwaux", .ops = _ops, }, > + { .name = "ld", .ops = _ops, }, > + { .name = "ldx", .ops = _ops, }, > + { .name = "ldu", .ops = _ops, }, > + { .name = "ldux", .ops = _ops, }, > + { .name = "stb", .ops = _ops, }, > + { .name = "stbx", .ops = _ops, }, > + { .name = "stbu", .ops = _ops, }, > + { .name = "stbux", .ops = _ops, }, > + { .name = "sth", .ops = _ops, }, > + { .name = "sthx", .ops = _ops, }, > + { .name = "sthu", .ops = _ops, }, > + { .name = "sthux", .ops = _ops, }, > + { .name = "stw", .ops = _ops, }, > + { .name = "stwx", .ops = _ops, }, > + { .name = "stwu", .ops = _ops, }, > + { .name = "stwux", .ops = _ops, }, > + { .name = "std", .ops = _ops, }, > + { .name = "stdx", .ops = _ops, }, > + { .name = "stdu", .ops = _ops, }, > + { .name = "stdux", .ops = _ops, }, > + { .name = "lhbrx", .ops = _ops, }, > + { .name = "sthbrx", .ops = _ops, }, > + { .name = "lwbrx", .ops = _ops, }, > + { .name = "stwbrx", .ops = _ops, }, > + { .name = "ldbrx", .ops = _ops, }, > + { .name = "stdbrx", .ops = _ops, }, > + { .name = "lmw", .ops = _ops, }, > + { .name = "stmw", .ops = _ops, }, > + { .name = "lswi", .ops = _ops, }, > + { .name = "lswx", .ops = _ops, }, > + { .name = "stswi", .ops = _ops, }, > + { .name = "stswx", .ops = _ops, }, 
> +}; > + > static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, > const char *name) > { > int i; > @@ -52,6 +111,13 @@ static struct ins_ops > *powerpc__associate_instruction_ops(struct arch *arch, con > static int powerpc__annotate_init(struct arch *arch, char *cpuid > __maybe_unused) > { > if (!arch->initialized) { > + arch->nr_instructions = ARRAY_SIZE(powerpc__instructions); > + arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins)); > + if (arch->instructions == NULL) > + return -ENOMEM; > + > + memcpy(arch->instructions, (struct ins *)powerpc__instructions, > sizeof(struct ins) * arch->nr_instructions); > + arch->nr_instructions_allocated = arch->nr_instructions; > arch->initialized = true; > arch->associate_instruction_ops = powerpc__associate_instruction_ops; > arch->objdump.comment_char = '#'; > -- > 2.43.0 >
[PATCH 3/3] tools/perf/arch/powerc: Add get_arch_regnum for powerpc
The function get_dwarf_regnum() returns a DWARF register number from a register name string. This calls arch specific function get_arch_regnum to return register number for corresponding arch. Add mappings for register name to register number in powerpc code: arch/powerpc/util/dwarf-regs.c Signed-off-by: Athira Rajeev --- tools/perf/arch/powerpc/util/dwarf-regs.c | 29 +++ 1 file changed, 29 insertions(+) diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c index 0c4f4caf53ac..d955e3e577ea 100644 --- a/tools/perf/arch/powerpc/util/dwarf-regs.c +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c @@ -98,3 +98,32 @@ int regs_query_register_offset(const char *name) return roff->ptregs_offset; return -EINVAL; } + +struct dwarf_regs_idx { + const char *name; + int idx; +}; + +static const struct dwarf_regs_idx powerpc_regidx_table[] = { + { "r0", 0 }, { "r1", 1 }, { "r2", 2 }, { "r3", 3 }, { "r4", 4 }, + { "r5", 5 }, { "r6", 6 }, { "r7", 7 }, { "r8", 8 }, { "r9", 9 }, + { "r10", 10 }, { "r11", 11 }, { "r12", 12 }, { "r13", 13 }, { "r14", 14 }, + { "r15", 15 }, { "r16", 16 }, { "r17", 17 }, { "r18", 18 }, { "r19", 19 }, + { "r20", 20 }, { "r21", 21 }, { "r22", 22 }, { "r23", 23 }, { "r24", 24 }, + { "r25", 25 }, { "r26", 26 }, { "r27", 27 }, { "r27", 27 }, { "r28", 28 }, + { "r29", 29 }, { "r30", 30 }, { "r31", 31 }, +}; + +int get_arch_regnum(const char *name) +{ + unsigned int i; + + if (*name != 'r') + return -EINVAL; + + for (i = 0; i < ARRAY_SIZE(powerpc_regidx_table); i++) + if (!strcmp(powerpc_regidx_table[i].name, name)) + return powerpc_regidx_table[i].idx; + + return -ENOENT; +} -- 2.43.0
[PATCH 2/3] tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc
To identify if the instruction has any memory reference,
"memory_ref_char" field needs to be set for specific architecture.
Example memory instruction:

	lwz r10,0(r9)

Here "(" is the memory_ref_char. Set this as part of arch->objdump.

To get register number and access offset from the given instruction,
arch->objdump.register_char is used. In case of powerpc, the register
char is "r" (since reg names are r0 to r31). Hence set register_char
as "r".

Signed-off-by: Athira Rajeev
---
 tools/perf/util/annotate.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ac002d907d81..d69bd6edafcb 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -216,6 +216,11 @@ static struct arch architectures[] = {
 	{
 		.name = "powerpc",
 		.init = powerpc__annotate_init,
+		.objdump = {
+			.comment_char = '#',
+			.register_char = 'r',
+			.memory_ref_char = '(',
+		},
 	},
 	{
 		.name = "riscv64",
--
2.43.0
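To make the role of the two characters concrete, here is a rough sketch of how an operand like "0(r9)" could be split into offset and register number using them. This is not perf's actual parser; struct objdump_chars and parse_mem_operand() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder mirroring the three fields set in the patch. */
struct objdump_chars {
	char comment_char;
	char register_char;
	char memory_ref_char;
};

/*
 * Parse a D-form style memory operand such as "0(r9)".
 * Returns 0 and fills *offset and *regnum on success, or -1 if the
 * operand does not look like "offset(rN)".
 */
static int parse_mem_operand(const struct objdump_chars *oc, const char *op,
			     long *offset, int *regnum)
{
	const char *paren;
	char *end;

	assert(oc && op && offset && regnum);

	/* No memory_ref_char, or no register right after it: not a match. */
	paren = strchr(op, oc->memory_ref_char);
	if (paren == NULL || paren[1] != oc->register_char)
		return -1;

	/* Everything before the memory_ref_char must be the offset. */
	*offset = strtol(op, &end, 10);
	if (end != paren)
		return -1;

	/* Digits after the register_char, terminated by ')'. */
	*regnum = (int)strtol(paren + 2, &end, 10);
	if (end == paren + 2 || *end != ')')
		return -1;

	return 0;
}
```

With the powerpc settings ('r' and '('), "0(r9)" gives offset 0 and register 9, while a plain register operand such as "r10" is rejected because it contains no memory_ref_char.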
[PATCH 1/3] tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling
Add powerpc instruction mnemonic table to associate load/store
instructions with mov_ops. mov_ops is used to identify mem_type
to associate instruction with data type and offset. Also initialize
and allocate arch specific fields for nr_instructions, instructions and
nr_instructions_allocated.

Signed-off-by: Athira Rajeev
---
 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 1 file changed, 66 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..07af4442be38 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -1,6 +1,65 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/compiler.h>
 
+/*
+ * powerpc instruction mnemonic table to associate load/store instructions with
+ * mov_ops. mov_ops is used to identify mem_type to associate instruction with
+ * data type and offset.
+ */
+static struct ins powerpc__instructions[] = {
+	{ .name = "lbz",	.ops = &mov_ops, },
+	{ .name = "lbzx",	.ops = &mov_ops, },
+	{ .name = "lbzu",	.ops = &mov_ops, },
+	{ .name = "lbzux",	.ops = &mov_ops, },
+	{ .name = "lhz",	.ops = &mov_ops, },
+	{ .name = "lhzx",	.ops = &mov_ops, },
+	{ .name = "lhzu",	.ops = &mov_ops, },
+	{ .name = "lhzux",	.ops = &mov_ops, },
+	{ .name = "lha",	.ops = &mov_ops, },
+	{ .name = "lhax",	.ops = &mov_ops, },
+	{ .name = "lhau",	.ops = &mov_ops, },
+	{ .name = "lhaux",	.ops = &mov_ops, },
+	{ .name = "lwz",	.ops = &mov_ops, },
+	{ .name = "lwzx",	.ops = &mov_ops, },
+	{ .name = "lwzu",	.ops = &mov_ops, },
+	{ .name = "lwzux",	.ops = &mov_ops, },
+	{ .name = "lwa",	.ops = &mov_ops, },
+	{ .name = "lwax",	.ops = &mov_ops, },
+	{ .name = "lwaux",	.ops = &mov_ops, },
+	{ .name = "ld",	.ops = &mov_ops, },
+	{ .name = "ldx",	.ops = &mov_ops, },
+	{ .name = "ldu",	.ops = &mov_ops, },
+	{ .name = "ldux",	.ops = &mov_ops, },
+	{ .name = "stb",	.ops = &mov_ops, },
+	{ .name = "stbx",	.ops = &mov_ops, },
+	{ .name = "stbu",	.ops = &mov_ops, },
+	{ .name = "stbux",	.ops = &mov_ops, },
+	{ .name = "sth",	.ops = &mov_ops, },
+	{ .name = "sthx",	.ops = &mov_ops, },
+	{ .name = "sthu",	.ops = &mov_ops, },
+	{ .name = "sthux",	.ops = &mov_ops, },
+	{ .name = "stw",	.ops = &mov_ops, },
+	{ .name = "stwx",	.ops = &mov_ops, },
+	{ .name = "stwu",	.ops = &mov_ops, },
+	{ .name = "stwux",	.ops = &mov_ops, },
+	{ .name = "std",	.ops = &mov_ops, },
+	{ .name = "stdx",	.ops = &mov_ops, },
+	{ .name = "stdu",	.ops = &mov_ops, },
+	{ .name = "stdux",	.ops = &mov_ops, },
+	{ .name = "lhbrx",	.ops = &mov_ops, },
+	{ .name = "sthbrx",	.ops = &mov_ops, },
+	{ .name = "lwbrx",	.ops = &mov_ops, },
+	{ .name = "stwbrx",	.ops = &mov_ops, },
+	{ .name = "ldbrx",	.ops = &mov_ops, },
+	{ .name = "stdbrx",	.ops = &mov_ops, },
+	{ .name = "lmw",	.ops = &mov_ops, },
+	{ .name = "stmw",	.ops = &mov_ops, },
+	{ .name = "lswi",	.ops = &mov_ops, },
+	{ .name = "lswx",	.ops = &mov_ops, },
+	{ .name = "stswi",	.ops = &mov_ops, },
+	{ .name = "stswx",	.ops = &mov_ops, },
+};
+
 static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, const char *name)
 {
 	int i;
@@ -52,6 +111,13 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
 static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
 {
 	if (!arch->initialized) {
+		arch->nr_instructions = ARRAY_SIZE(powerpc__instructions);
+		arch->instructions = calloc(arch->nr_instructions, sizeof(struct ins));
+		if (arch->instructions == NULL)
+			return -ENOMEM;
+
+		memcpy(arch->instructions, (struct ins *)powerpc__instructions, sizeof(struct ins) * arch->nr_instructions);
+		arch->nr_instructions_allocated = arch->nr_instructions;
 		arch->initialized = true;
 		arch->associate_instruction_ops = powerpc__associate_instruction_ops;
 		arch->objdump.comment_char = '#';
--
2.43.0
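For readers unfamiliar with how such a mnemonic table is consumed: as far as I understand, the generic annotate code sorts the per-arch instruction array by name and then looks names up with a binary search. The sketch below shows that pattern in isolation, under that assumption; struct demo_ins, demo_table and demo_find() are hypothetical stand-ins and not the perf structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct ins": a mnemonic plus a flag. */
struct demo_ins {
	const char *name;
	int is_mem_access;
};

/* Deliberately unsorted, like a hand-written mnemonic table. */
static struct demo_ins demo_table[] = {
	{ "lwz", 1 }, { "ld", 1 }, { "stw", 1 }, { "std", 1 }, { "lbz", 1 },
};

static int demo_ins_cmp(const void *a, const void *b)
{
	const struct demo_ins *ia = a, *ib = b;

	return strcmp(ia->name, ib->name);
}

/* Sort the table once, then binary-search it by mnemonic. */
static const struct demo_ins *demo_find(const char *name)
{
	static int sorted;
	const size_t nr = sizeof(demo_table) / sizeof(demo_table[0]);
	struct demo_ins key = { name, 0 };

	assert(name != NULL);

	if (!sorted) {
		qsort(demo_table, nr, sizeof(demo_table[0]), demo_ins_cmp);
		sorted = 1;
	}

	return bsearch(&key, demo_table, nr, sizeof(demo_table[0]), demo_ins_cmp);
}
```

Because the table is sorted in place, it has to be writable, which is consistent with the patch copying the static table into allocated memory before handing it to the arch structure.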
[PATCH 0/3] Add data type profiling support for powerpc
From: Akanksha J N

The patchset from Namhyung added support for data type profiling
in the perf tool. This enabled support to associate PMU samples with
the data types they refer to, using DWARF debug information. With
upstream perf, it is currently possible to run perf report or perf
annotate to view the data type information on x86.

This patchset includes the changes needed to enable data type profiling
support for powerpc. The main changes are:

1. A powerpc instruction mnemonic table to associate load/store
   instructions with mov_ops, which is used to identify whether an
   instruction is a memory access.
2. To get the register number and access offset from a given
   instruction, the code uses fields from "struct arch" -> objdump.
   Add an entry for powerpc here.
3. A get_arch_regnum function to return the register number from the
   register name string.

These three patches are the basic foundational patches.
With these changes, data types are identified for kernel
and user-space samples. There are still samples that come up as
"unknown" and need to be checked. We plan to address those
in a follow-up.

With the current patchset:

 # ./perf record -a -e mem-loads sleep 1
 # ./perf report -s type,typeoff --hierarchy --group --stdio

Snippet of logs:

 #
 # Total Lost Samples: 0
 #
 # Samples: 277 of events 'mem-loads, dummy:u'
 # Event count (approx.): 149813
 #
 #            Overhead  Data Type / Data Type Offset
 # ...
 #
    65.93%   0.00%     (unknown)
       65.93%   0.00%     (unknown) +0 (no field)
     8.19%   0.00%     struct vm_area_struct
        8.19%   0.00%     struct vm_area_struct +136 (vm_file)
     4.53%   0.00%     struct rq
        3.14%   0.00%     struct rq +0 (__lock.raw_lock.val)
        0.83%   0.00%     struct rq +3216 (avg_irq.runnable_sum)
        0.24%   0.00%     struct rq +4 (nr_running)
        0.14%   0.00%     struct rq +12 (nr_preferred_running)
        0.12%   0.00%     struct rq +2760 (sd)
        0.06%   0.00%     struct rq +3368 (prev_steal_time_rq)
        0.01%   0.00%     struct rq +2592 (curr)
     3.53%   0.00%     struct rb_node
        3.53%   0.00%     struct rb_node +0 (__rb_parent_color)
     3.43%   0.00%     struct slab
        3.43%   0.00%     struct slab +32 (freelist)
     3.30%   0.00%     unsigned int
        3.30%   0.00%     unsigned int +0 (no field)
     3.22%   0.00%     struct vm_fault
        3.22%   0.00%     struct vm_fault +48 (pmd)
     2.55%   0.00%     unsigned char
        2.55%   0.00%     unsigned char +0 (no field)
     1.06%   0.00%     struct task_struct
        1.06%   0.00%     struct task_struct +4 (thread_info.cpu)
     0.92%   0.00%     void*
        0.92%   0.00%     void* +0 (no field)
     0.74%   0.00%     __int128 unsigned
        0.74%   0.00%     __int128 unsigned +8 (no field)
     0.59%   0.00%     struct perf_event
        0.54%   0.00%     struct perf_event +552 (ctx)
        0.04%   0.00%     struct perf_event +152 (pmu)
     0.20%   0.00%     struct sched_entity
        0.20%   0.00%     struct sched_entity +0 (load.weight)
     0.18%   0.00%     struct cfs_rq
        0.18%   0.00%     struct cfs_rq +96 (curr)

Thanks
Athira

Athira Rajeev (3):
  tools/perf/arch/powerpc: Add load/store in powerpc annotate instructions for data type profling
  tools/erf/util/annotate: Set register_char and memory_ref_char for powerpc
  tools/perf/arch/powerc: Add get_arch_regnum for powerpc

 .../perf/arch/powerpc/annotate/instructions.c | 66 +++
 tools/perf/arch/powerpc/util/dwarf-regs.c     | 29
 tools/perf/util/annotate.c                    |  5 ++
 3 files changed, 100 insertions(+)

--
2.43.0
Re: [PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 06-Dec-2023, at 3:20 AM, Arnaldo Carvalho de Melo wrote:
>
> Em Mon, Nov 27, 2023 at 11:12:57AM +, James Clark escreveu:
>> On 23/11/2023 16:02, Athira Rajeev wrote:
>>> --- a/tools/perf/Makefile.perf
>>> @@ -1134,6 +1152,7 @@ bpf-skel-clean:
>>> $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
>>>
>>> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean
>>> $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean
>>> bpf-skel-clean tests-coresight-targets-clean
>>> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
>>> $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive
>>> $(OUTPUT)perf-iostat $(LANG_BINDINGS)
>>> $(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete
>>> -o -name '\.*.d' -delete
>>> $(Q)$(RM) $(OUTPUT).config-detected
>
> While merging perf-tools-next with torvalds/master I noticed that maybe
> we better have the above added line as:
>
> + $(call QUIET_CLEAN, tests) $(Q)$(MAKE) -f
> $(srctree)/tools/perf/tests/Makefile.tests clean
>
> No?
>
> Anyway I'm merging as-is, but it just hit my eye while merging,
>
> - Arnaldo

Hi Arnaldo,

As Ian pointed out, we removed Makefile.tests as part of:
https://lore.kernel.org/lkml/20231129213428.2227448-1-irog...@google.com/

Thanks
Athira
Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events
> On 05-Dec-2023, at 2:43 AM, Ian Rogers wrote: > > On Mon, Dec 4, 2023 at 12:22 PM Arnaldo Carvalho de Melo > wrote: >> >> Em Mon, Dec 04, 2023 at 05:20:46PM -0300, Arnaldo Carvalho de Melo escreveu: >>> Em Mon, Dec 04, 2023 at 12:12:54PM -0800, Ian Rogers escreveu: >>>> On Thu, Nov 23, 2023 at 8:01 AM Athira Rajeev >>>> wrote: >>>>> >>>>> Running "perf list" on powerpc fails with segfault >>>>> as below: >>>>> >>>>> ./perf list >>>>> Segmentation fault (core dumped) >>>>> >>>>> This happens because of duplicate events in the json list. >>>>> The powerpc Json event list contains some event with same >>>>> event name, but different event code. They are: >>>>> - PM_INST_FROM_L3MISS (Present in datasource and frontend) >>>>> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked) >>>>> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked) >>>>> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked) >>>>> >>>>> pmu_events_table__num_events uses the value from >>>>> table_pmu->num_entries which includes duplicate events as >>>>> well. This causes issue during "perf list" and results in >>>>> segmentation fault. >>>>> >>>>> Since both event codes are valid, append _DSRC to the Data >>>>> Source events (datasource.json), so that they would have a >>>>> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and >>>>> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list >>>>> works as expected. >>>>> >>>>> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events") >>>>> Signed-off-by: Athira Rajeev >>>> >>>> Given duplicate events creates broken pmu-events.c we should capture >>>> that as an exception in jevents.py. That way a JEVENTS_ARCH=all build >>>> will fail if any vendor/architecture would break in this way. We >>>> should also add JEVENTS_ARCH=all to tools/perf/tests/make. Athira, do >>>> you want to look at doing this? >>> >>> Should I go ahead and remove this patch till this is sorted out? 
>> >> I'll keep it, its already in tmp.perf-tools-next, we can go from there >> and improve this with follow up patches, Thanks Arnaldo for pulling the fix patch. > > Agreed. I could look to do the follow up but likely won't have a > chance for a while. If others could help out it would be great. I'd > like to have the jevents and json be robust enough that we don't trip > over problems like this and the somewhat similar AmpereOne issue. Yes Ian. I will look at adding this with follow up patches and including this as part of tools/perf/tests/make Thanks Athira > > Thanks, > Ian > >> - Arnaldo
Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events
> On 05-Dec-2023, at 1:42 AM, Ian Rogers wrote: > > On Thu, Nov 23, 2023 at 8:01 AM Athira Rajeev > wrote: >> >> Running "perf list" on powerpc fails with segfault >> as below: >> >> ./perf list >> Segmentation fault (core dumped) >> >> This happens because of duplicate events in the json list. >> The powerpc Json event list contains some event with same >> event name, but different event code. They are: >> - PM_INST_FROM_L3MISS (Present in datasource and frontend) >> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked) >> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked) >> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked) >> >> pmu_events_table__num_events uses the value from >> table_pmu->num_entries which includes duplicate events as >> well. This causes issue during "perf list" and results in >> segmentation fault. >> >> Since both event codes are valid, append _DSRC to the Data >> Source events (datasource.json), so that they would have a >> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and >> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list >> works as expected. >> >> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events") >> Signed-off-by: Athira Rajeev > > Given duplicate events creates broken pmu-events.c we should capture > that as an exception in jevents.py. That way a JEVENTS_ARCH=all build > will fail if any vendor/architecture would break in this way. We > should also add JEVENTS_ARCH=all to tools/perf/tests/make. Athira, do > you want to look at doing this? > > Thanks, > Ian Hi Ian, That’s a great suggestion. This will definitely help to capture the issues ahead. 
I am interested and will work on adding this as part of tools/perf/tests/make Thanks Athira > >> --- >> .../arch/powerpc/power10/datasource.json | 18 ++ >> 1 file changed, 14 insertions(+), 4 deletions(-) >> >> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> index 6b0356f2d301..0eeaaf1a95b8 100644 >> --- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> +++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> @@ -99,6 +99,11 @@ >> "EventName": "PM_INST_FROM_L2MISS", >> "BriefDescription": "The processor's instruction cache was reloaded from >> a source beyond the local core's L2 due to a demand miss." >> }, >> + { >> +"EventCode": "0x0003C000C040", >> +"EventName": "PM_DATA_FROM_L2MISS_DSRC", >> +"BriefDescription": "The processor's L1 data cache was reloaded from a >> source beyond the local core's L2 due to a demand miss." >> + }, >> { >> "EventCode": "0x00038010C040", >> "EventName": "PM_INST_FROM_L2MISS_ALL", >> @@ -161,9 +166,14 @@ >> }, >> { >> "EventCode": "0x00078000C040", >> -"EventName": "PM_INST_FROM_L3MISS", >> +"EventName": "PM_INST_FROM_L3MISS_DSRC", >> "BriefDescription": "The processor's instruction cache was reloaded from >> beyond the local core's L3 due to a demand miss." >> }, >> + { >> +"EventCode": "0x0007C000C040", >> +"EventName": "PM_DATA_FROM_L3MISS_DSRC", >> +"BriefDescription": "The processor's L1 data cache was reloaded from >> beyond the local core's L3 due to a demand miss." >> + }, >> { >> "EventCode": "0x00078010C040", >> "EventName": "PM_INST_FROM_L3MISS_ALL", >> @@ -981,7 +991,7 @@ >> }, >> { >> "EventCode": "0x0003C000C142", >> -"EventName": "PM_MRK_DATA_FROM_L2MISS", >> +"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC", >> "BriefDescription": "The processor's L1 data cache was reloaded from a >> source beyond the local core's L2 due to a demand miss for a marked >> instruction." 
>> }, >> { >> @@ -1046,12 +1056,12 @@ >> }, >> { >> "EventCode": "0x00078000C142", >> -"EventName": "PM_MRK_INST_FROM_L3MISS", >> +"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC", >> "BriefDescription": "The processor's instruction cache was reloaded from >> beyond the local core's L3 due to a demand miss for a marked instruction." >> }, >> { >> "EventCode": "0x0007C000C142", >> -"EventName": "PM_MRK_DATA_FROM_L3MISS", >> +"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC", >> "BriefDescription": "The processor's L1 data cache was reloaded from >> beyond the local core's L3 due to a demand miss for a marked instruction." >> }, >> { >> -- >> 2.39.3
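The check Ian suggests boils down to rejecting an event name that appears more than once across the JSON files for one CPU. The real check would live in jevents.py (Python); the following is only a language-neutral illustration of the idea written in C, and first_duplicate() is a hypothetical helper, not part of perf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int name_cmp(const void *a, const void *b)
{
	const char *const *sa = a;
	const char *const *sb = b;

	return strcmp(*sa, *sb);
}

/*
 * Hypothetical helper: sort the array of event names in place and
 * return the first name that occurs twice, or NULL if all are unique.
 */
static const char *first_duplicate(const char **names, size_t nr)
{
	size_t i;

	if (nr < 2)
		return NULL;

	assert(names != NULL);
	qsort(names, nr, sizeof(names[0]), name_cmp);

	/* After sorting, duplicates are adjacent. */
	for (i = 1; i < nr; i++)
		if (!strcmp(names[i - 1], names[i]))
			return names[i];

	return NULL;
}
```

A build-time check of this kind would have flagged PM_INST_FROM_L3MISS being present in both datasource.json and the frontend list before it could produce a broken pmu-events.c.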
Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events
> On 29-Nov-2023, at 10:51 AM, Athira Rajeev > wrote: > > > >> On 27-Nov-2023, at 5:32 PM, Disha Goel wrote: >> >> On 23/11/23 9:31 pm, Athira Rajeev wrote: >> >>> Running "perf list" on powerpc fails with segfault >>> as below: >>> >>> ./perf list >>> Segmentation fault (core dumped) >>> >>> This happens because of duplicate events in the json list. >>> The powerpc Json event list contains some event with same >>> event name, but different event code. They are: >>> - PM_INST_FROM_L3MISS (Present in datasource and frontend) >>> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked) >>> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked) >>> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked) >>> >>> pmu_events_table__num_events uses the value from >>> table_pmu->num_entries which includes duplicate events as >>> well. This causes issue during "perf list" and results in >>> segmentation fault. >>> >>> Since both event codes are valid, append _DSRC to the Data >>> Source events (datasource.json), so that they would have a >>> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and >>> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list >>> works as expected. >>> >>> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events") >>> Signed-off-by: Athira Rajeev >> >> I have tested the patch on Power10 machine. Perf list works correctly >> without any segfault now. >> >> # ./perf list >> >> List of pre-defined events (to be used in -e or -M): >> >> branch-instructions OR branches[Hardware event] >> branch-misses [Hardware event] >> >> Tested-by: Disha Goel >> > > Thanks Disha for testing > > Athira Hi Arnaldo, Can we get this pulled in if the patch looks good ? 
Thanks Athira >>> --- >>> .../arch/powerpc/power10/datasource.json | 18 ++ >>> 1 file changed, 14 insertions(+), 4 deletions(-) >>> >>> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >>> b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >>> index 6b0356f2d301..0eeaaf1a95b8 100644 >>> --- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >>> +++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >>> @@ -99,6 +99,11 @@ >>> "EventName": "PM_INST_FROM_L2MISS", >>> "BriefDescription": "The processor's instruction cache was reloaded >>> from a source beyond the local core's L2 due to a demand miss." >>> }, >>> + { >>> +"EventCode": "0x0003C000C040", >>> +"EventName": "PM_DATA_FROM_L2MISS_DSRC", >>> +"BriefDescription": "The processor's L1 data cache was reloaded from a >>> source beyond the local core's L2 due to a demand miss." >>> + }, >>> { >>> "EventCode": "0x00038010C040", >>> "EventName": "PM_INST_FROM_L2MISS_ALL", >>> @@ -161,9 +166,14 @@ >>> }, >>> { >>> "EventCode": "0x00078000C040", >>> -"EventName": "PM_INST_FROM_L3MISS", >>> +"EventName": "PM_INST_FROM_L3MISS_DSRC", >>> "BriefDescription": "The processor's instruction cache was reloaded >>> from beyond the local core's L3 due to a demand miss." >>> }, >>> + { >>> +"EventCode": "0x0007C000C040", >>> +"EventName": "PM_DATA_FROM_L3MISS_DSRC", >>> +"BriefDescription": "The processor's L1 data cache was reloaded from >>> beyond the local core's L3 due to a demand miss." >>> + }, >>> { >>> "EventCode": "0x00078010C040", >>> "EventName": "PM_INST_FROM_L3MISS_ALL", >>> @@ -981,7 +991,7 @@ >>> }, >>> { >>> "EventCode": "0x0003C000C142", >>> -"EventName": "PM_MRK_DATA_FROM_L2MISS", >>> +"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC", >>> "BriefDescription": "The processor's L1 data cache was reloaded from a >>> source beyond the local core's L2 due to a demand miss for a marked >>> instruction." 
>>> }, >>> { >>> @@ -1046,12 +1056,12 @@ >>> }, >>> { >>> "EventCode": "0x00078000C142", >>> -"EventName": "PM_MRK_INST_FROM_L3MISS", >>> +"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC", >>> "BriefDescription": "The processor's instruction cache was reloaded >>> from beyond the local core's L3 due to a demand miss for a marked >>> instruction." >>> }, >>> { >>> "EventCode": "0x0007C000C142", >>> -"EventName": "PM_MRK_DATA_FROM_L3MISS", >>> +"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC", >>> "BriefDescription": "The processor's L1 data cache was reloaded from >>> beyond the local core's L3 due to a demand miss for a marked instruction." >>> }, >>> {
Re: [PATCH] perf vendor events: Update datasource event name to fix duplicate events
> On 27-Nov-2023, at 5:32 PM, Disha Goel wrote: > > On 23/11/23 9:31 pm, Athira Rajeev wrote: > >> Running "perf list" on powerpc fails with segfault >> as below: >> >>./perf list >>Segmentation fault (core dumped) >> >> This happens because of duplicate events in the json list. >> The powerpc Json event list contains some event with same >> event name, but different event code. They are: >> - PM_INST_FROM_L3MISS (Present in datasource and frontend) >> - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked) >> - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked) >> - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked) >> >> pmu_events_table__num_events uses the value from >> table_pmu->num_entries which includes duplicate events as >> well. This causes issue during "perf list" and results in >> segmentation fault. >> >> Since both event codes are valid, append _DSRC to the Data >> Source events (datasource.json), so that they would have a >> unique name. Also add PM_DATA_FROM_L2MISS_DSRC and >> PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list >> works as expected. >> >> Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events") >> Signed-off-by: Athira Rajeev > > I have tested the patch on Power10 machine. Perf list works correctly without > any segfault now. 
> > # ./perf list > > List of pre-defined events (to be used in -e or -M): > > branch-instructions OR branches[Hardware event] > branch-misses [Hardware event] > > Tested-by: Disha Goel > Thanks Disha for testing Athira >> --- >> .../arch/powerpc/power10/datasource.json | 18 ++ >> 1 file changed, 14 insertions(+), 4 deletions(-) >> >> diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> index 6b0356f2d301..0eeaaf1a95b8 100644 >> --- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> +++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json >> @@ -99,6 +99,11 @@ >> "EventName": "PM_INST_FROM_L2MISS", >> "BriefDescription": "The processor's instruction cache was reloaded >> from a source beyond the local core's L2 due to a demand miss." >>}, >> + { >> +"EventCode": "0x0003C000C040", >> +"EventName": "PM_DATA_FROM_L2MISS_DSRC", >> +"BriefDescription": "The processor's L1 data cache was reloaded from a >> source beyond the local core's L2 due to a demand miss." >> + }, >>{ >> "EventCode": "0x00038010C040", >> "EventName": "PM_INST_FROM_L2MISS_ALL", >> @@ -161,9 +166,14 @@ >>}, >>{ >> "EventCode": "0x00078000C040", >> -"EventName": "PM_INST_FROM_L3MISS", >> +"EventName": "PM_INST_FROM_L3MISS_DSRC", >> "BriefDescription": "The processor's instruction cache was reloaded >> from beyond the local core's L3 due to a demand miss." >>}, >> + { >> +"EventCode": "0x0007C000C040", >> +"EventName": "PM_DATA_FROM_L3MISS_DSRC", >> +"BriefDescription": "The processor's L1 data cache was reloaded from >> beyond the local core's L3 due to a demand miss." 
>> + }, >>{ >> "EventCode": "0x00078010C040", >> "EventName": "PM_INST_FROM_L3MISS_ALL", >> @@ -981,7 +991,7 @@ >>}, >>{ >> "EventCode": "0x0003C000C142", >> -"EventName": "PM_MRK_DATA_FROM_L2MISS", >> +"EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC", >> "BriefDescription": "The processor's L1 data cache was reloaded from a >> source beyond the local core's L2 due to a demand miss for a marked >> instruction." >>}, >>{ >> @@ -1046,12 +1056,12 @@ >>}, >>{ >> "EventCode": "0x00078000C142", >> -"EventName": "PM_MRK_INST_FROM_L3MISS", >> +"EventName": "PM_MRK_INST_FROM_L3MISS_DSRC", >> "BriefDescription": "The processor's instruction cache was reloaded >> from beyond the local core's L3 due to a demand miss for a marked >> instruction." >>}, >>{ >> "EventCode": "0x0007C000C142", >> -"EventName": "PM_MRK_DATA_FROM_L3MISS", >> +"EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC", >> "BriefDescription": "The processor's L1 data cache was reloaded from >> beyond the local core's L3 due to a demand miss for a marked instruction." >>}, >>{
Re: [PATCH] perf test record+probe_libc_inet_pton: Fix call chain match on powerpc
> On 26-Nov-2023, at 12:39 PM, Likhitha Korrapati > wrote: > > The perf test "probe libc's inet_pton & backtrace it with ping" fails on > powerpc as below: > > root@xxx perf]# perf test -v "probe libc's inet_pton & backtrace it with > ping" > 85: probe libc's inet_pton & backtrace it with ping : > --- start --- > test child forked, pid 96028 > ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60) > 7fffa1779a60 __GI___inet_pton+0x0 > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > 7fffa172a73c getaddrinfo+0x121c > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > FAIL: expected backtrace entry > "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$" > got "7fffa172a73c getaddrinfo+0x121c > (/usr/lib64/glibc-hwcaps/power10/libc.so.6)" > test child finished with -1 > end > probe libc's inet_pton & backtrace it with ping: FAILED! Reviewed-by: Athira Rajeev Thanks Athira > > This test installs a probe on libc's inet_pton function, which will use > uprobes and then uses perf trace on a ping to localhost. It gets 3 > levels deep backtrace and checks whether it is what we expected or not. > > The test started failing from RHEL 9.4 where as it works in previous > distro version (RHEL 9.2). Test expects gaih_inet function to be part of > backtrace. But in the glibc version (2.34-86) which is part of distro > where it fails, this function is missing and hence the test is failing. 
> > From nm and ping command output we can confirm that gaih_inet function > is not present in the expected backtrace for glibc version glibc-2.34-86 > > [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet > 001273e0 t gaih_inet_serv > 001cd8d8 r gaih_inet_typeproto > > [root@xxx perf]# perf script -i /tmp/perf.data.6E8 > ping 104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60) >7fff83779a60 __GI___inet_pton+0x0 > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) >7fff8372a73c getaddrinfo+0x121c > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > 11dc73534 [unknown] (/usr/bin/ping) >7fff8362a8c4 __libc_start_call_main+0x84 > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > > FAIL: expected backtrace entry > "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$" > got "7fff9d52a73c getaddrinfo+0x121c > (/usr/lib64/glibc-hwcaps/power10/libc.so.6)" > > With version glibc-2.34-60 gaih_inet function is present as part of the > expected backtrace. So we cannot just remove the gaih_inet function from > the backtrace. > > [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet > 00130490 t gaih_inet.constprop.0 > 0012e830 t gaih_inet_serv > 001d45e4 r gaih_inet_typeproto > > [root@xxx perf]# ./perf script -i /tmp/perf.data.b6S > ping 67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd80820) >7fffbdd80820 __GI___inet_pton+0x0 > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) >7fffbdd31160 gaih_inet.constprop.0+0xcd0 > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) >7fffbdd31c7c getaddrinfo+0x14c > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > 1140d3558 [unknown] (/usr/bin/ping) > > This patch solves this issue by doing a conditional skip. If there is a > gaih_inet function present in the libc then it will be added to the > expected backtrace else the function will be skipped from being added > to the expected backtrace. 
> > Output with the patch > > [root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it > with ping" > 83: probe libc's inet_pton & backtrace it with ping : > --- start --- > test child forked, pid 102662 > ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60) > 7fff93379a60 __GI___inet_pton+0x0 > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > 7fff9332a73c getaddrinfo+0x121c > (/usr/lib64/glibc-hwcaps/power10/libc.so.6) > 11ef03534 [unknown] (/usr/bin/ping) > test child finished with 0 > end > probe libc's inet_pton & backtrace it with ping: Ok > > Signed-off-by: Likhitha Korrapati > Reported-by: Disha Goel > --- > tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh > b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh > index eebeea6bdc76..72c65570db37 100755 > --- a/tools/perf/tests/shell/record+probe_libc_inet_pton
Re: [PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 27-Nov-2023, at 8:21 PM, Arnaldo Carvalho de Melo wrote:
>
> Em Mon, Nov 27, 2023 at 11:12:57AM +, James Clark escreveu:
>> On 23/11/2023 16:02, Athira Rajeev wrote:
>>> Add rule in new Makefile "tests/Makefile.tests" for running
>>> shellcheck on shell test scripts. This automates below shellcheck
>>> into the build.
>
>> Seems to work really well. I also tested it on Ubuntu, and checked
>> NO_SHELLCHECK, cleaning and with and without shellcheck installed etc.
>
>> Reviewed-by: James Clark
>
> Tested on Fedora 38, works as advertised, applied.
>
> - Arnaldo

Hi James, Arnaldo

Thanks for testing the patch and comments.

Athira Rajeev
[PATCH V4] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
Add a rule in a new Makefile, "tests/Makefile.tests", for running
shellcheck on shell test scripts. This automates the shellcheck
invocation below into the build.

  $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done

A condition for shellcheck is added in Makefile.perf to avoid build
breakage in the absence of the shellcheck binary. Update Makefile.perf
to contain a new rule, "SHELLCHECK_TEST", which makes the shellcheck
test a dependency of the perf binary. Added "tests/Makefile.tests" to
run shellcheck on shell scripts in tests/shell. The make rule
"SHELLCHECK_RUN" ensures that, on subsequent make invocations,
shellcheck is run only on modified files. This way, if any newly added
shell script or fix to an existing script breaks coding/formatting
style, it is caught during the perf build.

Example build failure by modifying probe_vfs_getname.sh in tests/shell:

  In tests/shell/probe_vfs_getname.sh line 8:
  . $(dirname $0)/lib/probe.sh
    ^---^ SC2046 (warning): Quote this to prevent word splitting.

  For more information:
    https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
  make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
  make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
  make[2]: *** Waiting for unfinished jobs
  make[1]: *** [Makefile.perf:244: sub-make] Error 2
  make: *** [Makefile:70: all] Error 2

Here, like other files which get created during compilation
(e.g. .builtin-bench.o.cmd or .perf.o.cmd), .shellcheck_log is also
created as a hidden file.
Example: tests/shell/.probe_vfs_getname.sh.shellcheck_log
shellcheck is re-run for a script when it is modified, because the log
file depends on the script.

After this, for testing, changed "tests/shell/trace+probe_vfs_getname.sh"
to break shellcheck format. In the next make run, it is also captured:

  In tests/shell/probe_vfs_getname.sh line 8:
  . $(dirname $0)/lib/probe.sh
    ^---^ SC2046 (warning): Quote this to prevent word splitting.

  For more information:
    https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
  make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
  make[3]: *** Waiting for unfinished jobs

  In tests/shell/trace+probe_vfs_getname.sh line 14:
  . $(dirname $0)/lib/probe.sh
    ^---^ SC2046 (warning): Quote this to prevent word splitting.

  For more information:
    https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
  make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.trace+probe_vfs_getname.sh.shellcheck_log] Error 1
  make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
  make[2]: *** Waiting for unfinished jobs
  make[1]: *** [Makefile.perf:244: sub-make] Error 2
  make: *** [Makefile:70: all] Error 2

The failure log appears in the stdout of make itself.

This is reported at build time. To be able to go ahead with the build
or disable shellcheck even though it is known that some test is
broken, add a "NO_SHELLCHECK" option. Example:

  make NO_SHELLCHECK=1

  INSTALL libsubcmd_headers
  INSTALL libsymbol_headers
  INSTALL libapi_headers
  INSTALL libperf_headers
  INSTALL libbpf_headers
  LINK    perf

Note: This is tested on RHEL and also SLES. Use the check below:
"$(shell which shellcheck 2> /dev/null)" to look for the presence of
the shellcheck binary. The approach "shell command -v" is not used
here. In some distros (RHEL), command is available as an executable
file (/usr/bin/command). But in some distros (SLES), it is a shell
builtin and not available as an executable file.

Signed-off-by: Athira Rajeev
---
changelog:
v3 -> v4:
Addressed review comments from James Clark.
- Made the shellcheck errors be reported in the make output itself
  during make, like any other build error.
- Removed creating .dep files. Instead use the log file to determine
  whether shellcheck has to be re-run when there is a change in the
  source file.
- Changed the log file to have the suffix shellcheck_log so as to
  differentiate it from the test execution log.
- Also, like other files which get created during compilation, for
  example .builtin-bench.o.cmd or .perf.o.cmd, create .shellcheck_log
  as a hidden file. Example: tests/shell/.buildid.sh.shellcheck_log
- Initial version used "command -v shellcheck" to check presence of
  shellcheck. But while testing SLES, hit an issue with using
  "command". In RHEL, /usr/bin/command is available as pa
[PATCH] perf vendor events: Update datasource event name to fix duplicate events
Running "perf list" on powerpc fails with a segfault as below:

  ./perf list
  Segmentation fault (core dumped)

This happens because of duplicate events in the JSON list.
The powerpc JSON event list contains some events with the same
event name but different event codes. They are:
- PM_INST_FROM_L3MISS (Present in datasource and frontend)
- PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
- PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
- PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)

pmu_events_table__num_events uses the value from
table_pmu->num_entries, which includes duplicate events as
well. This causes an issue during "perf list" and results in
a segmentation fault.

Since both event codes are valid, append _DSRC to the Data
Source events (datasource.json), so that they have
a unique name. Also add PM_DATA_FROM_L2MISS_DSRC and
PM_DATA_FROM_L3MISS_DSRC events. With the fix, perf list
works as expected.

Fixes: fc1435807533 ("perf vendor events power10: Update JSON/events")
Signed-off-by: Athira Rajeev
---
 .../arch/powerpc/power10/datasource.json | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
index 6b0356f2d301..0eeaaf1a95b8 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/datasource.json
@@ -99,6 +99,11 @@
     "EventName": "PM_INST_FROM_L2MISS",
     "BriefDescription": "The processor's instruction cache was reloaded from a source beyond the local core's L2 due to a demand miss."
   },
+  {
+    "EventCode": "0x0003C000C040",
+    "EventName": "PM_DATA_FROM_L2MISS_DSRC",
+    "BriefDescription": "The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss."
+  },
   {
     "EventCode": "0x00038010C040",
     "EventName": "PM_INST_FROM_L2MISS_ALL",
@@ -161,9 +166,14 @@
   },
   {
     "EventCode": "0x00078000C040",
-    "EventName": "PM_INST_FROM_L3MISS",
+    "EventName": "PM_INST_FROM_L3MISS_DSRC",
     "BriefDescription": "The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss."
   },
+  {
+    "EventCode": "0x0007C000C040",
+    "EventName": "PM_DATA_FROM_L3MISS_DSRC",
+    "BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss."
+  },
   {
     "EventCode": "0x00078010C040",
     "EventName": "PM_INST_FROM_L3MISS_ALL",
@@ -981,7 +991,7 @@
   },
   {
     "EventCode": "0x0003C000C142",
-    "EventName": "PM_MRK_DATA_FROM_L2MISS",
+    "EventName": "PM_MRK_DATA_FROM_L2MISS_DSRC",
     "BriefDescription": "The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction."
   },
   {
@@ -1046,12 +1056,12 @@
   },
   {
     "EventCode": "0x00078000C142",
-    "EventName": "PM_MRK_INST_FROM_L3MISS",
+    "EventName": "PM_MRK_INST_FROM_L3MISS_DSRC",
     "BriefDescription": "The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction."
   },
   {
     "EventCode": "0x0007C000C142",
-    "EventName": "PM_MRK_DATA_FROM_L3MISS",
+    "EventName": "PM_MRK_DATA_FROM_L3MISS_DSRC",
     "BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction."
   },
   {
--
2.39.3
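The duplicate-name condition described in this patch can be checked mechanically before it reaches "perf list". The following is only a minimal sketch, assuming each "EventName" entry sits on its own line as in the JSON shown in the diff; the helper name find_duplicate_event_names is hypothetical and not part of perf.

```shell
#!/bin/sh
# Sketch: print every event name that appears more than once across the
# given pmu-events JSON files. Hypothetical helper, not part of perf.
find_duplicate_event_names() {
	# Extract the value of each "EventName" line, then let
	# sort | uniq -d keep only the names that repeat.
	sed -n 's/.*"EventName": *"\([^"]*\)".*/\1/p' "$@" | sort | uniq -d
}
```

Run from the top of a kernel tree, it could be invoked as: find_duplicate_event_names tools/perf/pmu-events/arch/powerpc/power10/*.json. An empty result would mean no name is shared between the files.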
Re: [PATCH V3] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 23-Oct-2023, at 4:14 PM, James Clark wrote: > > > > On 13/10/2023 08:36, Athira Rajeev wrote: >> Add rule in new Makefile "tests/Makefile.tests" for running >> shellcheck on shell test scripts. This automates below shellcheck >> into the build. >> >> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S >> warning $F; done >> >> Condition for shellcheck is added in Makefile.perf to avoid build >> breakage in the absence of shellcheck binary. Update Makefile.perf >> to contain new rule for "SHELLCHECK_TEST" which is for making >> shellcheck test as a dependency on perf binary. Added >> "tests/Makefile.tests" to run shellcheck on shellscripts in >> tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every >> time during make, shellcheck will be run only on modified files >> during subsequent invocations. By this, if any newly added shell >> scripts or fixes in existing scripts breaks coding/formatting >> style, it will get captured during the perf build. >> >> Example build failure with present scripts in tests/shell: >> >> INSTALL libsubcmd_headers >> INSTALL libperf_headers >> INSTALL libapi_headers >> INSTALL libsymbol_headers >> INSTALL libbpf_headers >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/record_sideband.dep] Error 1 >> make[3]: *** Waiting for unfinished jobs >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/test_arm_coresight.dep] Error 1 >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/lock_contention.dep] Error 1 >> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 >> make[1]: *** [Makefile.perf:238: sub-make] Error 2 >> make: *** [Makefile:70: all] Error 2 >> >> After this, for testing, changed "tests/shell/record.sh" to >> break shellcheck format. 
In the next make run, it is >> also captured: >> >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/record_sideband.dep] Error 1 >> make[3]: *** Waiting for unfinished jobs >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/record.dep] Error 1 >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/test_arm_coresight.dep] Error 1 >> make[3]: *** >> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >> output/tests/shell/lock_contention.dep] Error 1 >> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 >> make[1]: *** [Makefile.perf:238: sub-make] Error 2 >> make: *** [Makefile:70: all] Error 2 >> >> The exact failure log can be found in: >> output/tests/shell/record.dep.log file >> > > Hi Athira, > > Having the reason for a hard failure put into a log file rather than the > console output is very non standard. I'm not sure what the reason for > this is. > > The log filename isn't even listed in the output so how would anyone > know what went wrong? > > Can we just have it so that the failure is printed in the make output > like any other build error. Sure James, Thanks for looking into and sharing the review comment. I will address the change in V4 > > [...] > >> +output/%.dep: %.sh | $(DIRS) >> + $(call rule_mkdir) >> + $(eval input_file := $(subst output/,./,$(patsubst %.dep, %.sh, $@))) >> + $(Q)$(call frecho-cmd,test)@shellcheck -S warning ${input_file} 1> $@.log >> && ([[ ! 
-s $@.log ]]) > > [[ ]] is a bash extension, but the build system seems to use /bin/sh so > you get this error depending on your distro: > > tools/perf/tests/Makefile.tests:17: output/tests/shell > /record+probe_libc_inet_pton.dep] Error 127 > /bin/sh: 1: [[: not found > > Changing it to [ ] fixes it Ok, will make the change in next version > >> + $(Q)$(call frecho-cmd,test)@touch $@ > > Touching the source file in the build system doesn't feel right, surely > this could be open to all kinds of parallel build race conditions or > version controll issues. > > Isn't the output of the rule the .log file, so just a normal make rule > based on those two files work? Then if the .log file is older than the > source file, the shellcheck is re-run, otherwise not? It feels like the > .dep file would then also be unecessary. Ok, I will fix this. > > The .dep lines in the make output are a bit confusing because they're > not in the source tree so it's not clear to an outsider what that make > output is for. > > Other than that, it does seem to work ok for me. Thanks for the review. I will post V4 with all the changes Athira > >> + $(Q)$(call frecho-cmd,test)@rm -rf $@.log >> +$(DIRS): >> + @mkdir -p $@ >> + >> +clean: >> + @rm -rf output
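The "[[: not found" failure James reports comes from using a bash-only construct under /bin/sh. As a minimal sketch of the portable form he suggests, the check "log file is empty" can be written with the POSIX "[" test; the helper name log_is_clean is hypothetical and only illustrates the idea.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the given log file is empty or
# missing. Uses the POSIX "[" test, so it also works when /bin/sh is dash.
log_is_clean() {
	[ ! -s "$1" ]
}
```

In a make recipe this would correspond to replacing "[[ ! -s $@.log ]]" with "[ ! -s $@.log ]", as stated in the review.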
Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
> On 07-Nov-2023, at 3:14 AM, Arnaldo Carvalho de Melo wrote: > > Em Thu, Oct 05, 2023 at 02:24:15PM +0530, Athira Rajeev escreveu: >>> On 05-Oct-2023, at 1:50 PM, James Clark wrote: >>> On 29/09/2023 05:11, Athira Rajeev wrote: >>>> Running shellcheck on tests/shell/test_arm_coresight.sh >>>> throws below warnings: >>>> >>>> In tests/shell/test_arm_coresight.sh line 15: >>>> cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* >>>> -print -quit) >>>> ^--^ SC2061: Quote the parameter to -name so the shell >>>> won't interpret it. >>>> >>>> In tests/shell/test_arm_coresight.sh line 20: >>>> if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then >>>> ^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q >>>> ] is not well defined >>>> >>>> This warning is observed after commit: >>>> "commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")" >>>> >>>> Fixed this issue by using quoting 'cpu*' for SC2061 and >>>> using "&&" in line number 20 for SC2166 warning >>>> >>>> Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE") >>>> Signed-off-by: Athira Rajeev >>>> --- >>>> tools/perf/tests/shell/test_arm_coresight.sh | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/tools/perf/tests/shell/test_arm_coresight.sh >>>> b/tools/perf/tests/shell/test_arm_coresight.sh >>>> index fe78c4626e45..f2115dfa24a5 100755 >>>> --- a/tools/perf/tests/shell/test_arm_coresight.sh >>>> +++ b/tools/perf/tests/shell/test_arm_coresight.sh >>>> @@ -12,12 +12,12 @@ >>>> glb_err=0 >>>> >>>> cs_etm_dev_name() { >>>> - cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* >>>> -print -quit) >>>> + cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' >>>> -print -quit) >>>> trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch) >>>> archhver=$((($trcdevarch >> 12) & 0xf)) >>>> archpart=$(($trcdevarch & 0xfff)) >>>> >>>> - if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; 
then >>>> + if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; >>>> then >>>> echo "ete" >>>> else >>>> echo "etm" >>> >>> >>> Reviewed-by: James Clark > > Some are not applying, even after b4 picking up v2 > > Total patches: 3 > --- > Cover: > ./v2_20231013_atrajeev_fix_for_shellcheck_issues_with_latest_scripts_in_tests_shell.cover > Link: > https://lore.kernel.org/r/20231013073021.99794-1-atraj...@linux.vnet.ibm.com > Base: not specified > git am > ./v2_20231013_atrajeev_fix_for_shellcheck_issues_with_latest_scripts_in_tests_shell.mbx > ⬢[acme@toolbox perf-tools-next]$git am > ./v2_20231013_atrajeev_fix_for_shellcheck_issues_with_latest_scripts_in_tests_shell.mbx > Applying: tools/perf/tests Ignore the shellcheck SC2046 warning in > lock_contention > error: patch failed: tools/perf/tests/shell/lock_contention.sh:33 > error: tools/perf/tests/shell/lock_contention.sh: patch does not apply > Patch failed at 0001 tools/perf/tests Ignore the shellcheck SC2046 warning in > lock_contention > hint: Use 'git am --show-current-patch=diff' to see the failed patch > When you have resolved this problem, run "git am --continue". > If you prefer to skip this patch, run "git am --skip" instead. > To restore the original branch and stop patching, run "git am --abort". > ⬢[acme@toolbox perf-tools-next]$ git am --abort > ⬢[acme@toolbox perf-tools-next]$ Hi Arnaldo The patch is picked up : https://lore.kernel.org/all/169757198796.167943.10552920255799914362.b4...@kernel.org/ . Thanks for looking into. Athira
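For reference, the SC2166 change quoted in this thread replaces "-a" inside a single test with two "[ ]" tests joined by "&&". The sketch below isolates that logic so it can be run without sysfs; the function name cs_etm_kind is hypothetical, and the input values in the usage note are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper mirroring the quoted logic: classify a TRCDEVARCH
# value as "ete" or "etm". The value is passed in instead of read from sysfs.
cs_etm_kind() {
	trcdevarch=$1
	# Bits [15:12] hold the architecture version, bits [11:0] the part.
	archhver=$(( ($trcdevarch >> 12) & 0xf ))
	archpart=$(( $trcdevarch & 0xfff ))

	# Two separate tests joined by && instead of "[ p -a q ]" (SC2166).
	if [ "$archhver" -eq 5 ] && [ "$(printf '0x%X' "$archpart")" = "0xA13" ]; then
		echo "ete"
	else
		echo "etm"
	fi
}
```

With this sketch, "cs_etm_kind 0x47705A13" takes the "ete" branch, while an input whose version field is 4, such as 0x47704A13, takes the "etm" branch.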
Re: [PATCH V3] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 13-Oct-2023, at 1:06 PM, Athira Rajeev wrote: > > Add rule in new Makefile "tests/Makefile.tests" for running > shellcheck on shell test scripts. This automates below shellcheck > into the build. > > $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S > warning $F; done > > Condition for shellcheck is added in Makefile.perf to avoid build > breakage in the absence of shellcheck binary. Update Makefile.perf > to contain new rule for "SHELLCHECK_TEST" which is for making > shellcheck test as a dependency on perf binary. Added > "tests/Makefile.tests" to run shellcheck on shellscripts in > tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every > time during make, shellcheck will be run only on modified files > during subsequent invocations. By this, if any newly added shell > scripts or fixes in existing scripts breaks coding/formatting > style, it will get captured during the perf build. > > Example build failure with present scripts in tests/shell: > > INSTALL libsubcmd_headers > INSTALL libperf_headers > INSTALL libapi_headers > INSTALL libsymbol_headers > INSTALL libbpf_headers > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/record_sideband.dep] Error 1 > make[3]: *** Waiting for unfinished jobs > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/test_arm_coresight.dep] Error 1 > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/lock_contention.dep] Error 1 > make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 > make[1]: *** [Makefile.perf:238: sub-make] Error 2 > make: *** [Makefile:70: all] Error 2 > > After this, for testing, changed "tests/shell/record.sh" to > break shellcheck format. 
In the next make run, it is > also captured: > > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/record_sideband.dep] Error 1 > make[3]: *** Waiting for unfinished jobs > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/record.dep] Error 1 > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/test_arm_coresight.dep] Error 1 > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/lock_contention.dep] Error 1 > make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 > make[1]: *** [Makefile.perf:238: sub-make] Error 2 > make: *** [Makefile:70: all] Error 2 > > The exact failure log can be found in: > output/tests/shell/record.dep.log file > > This is reported at build time. To be able to go ahead with > the build or disable shellcheck even though it is known that > some test is broken, add a "NO_SHELLCHECK" option. Example: > > make NO_LIBTRACEEVENT=1 NO_SHELLCHECK=1 > > INSTALL libsubcmd_headers > INSTALL libsymbol_headers > INSTALL libperf_headers > INSTALL libapi_headers > INSTALL libbpf_headers > PERF_VERSION = 6.6.rc1.g7108a40e02ae > GEN perf-iostat > GEN perf-archive > CC util/header.o > LD util/perf-in.o > LD perf-in.o > LINKperf > > Signed-off-by: Athira Rajeev > --- > changelog: > v2 -> v3: > Add option "NO_SHELLCHECK". This will allow to go ahead > with the build or disable shellcheck even though it is > known that some test is broken > > v1 -> v2: > Version1 had shellcheck in feature check which is not > required since shellcheck is already a binary. Presence > of binary can be checked using: > $(shell command -v shellcheck) > Addressed these changes as suggested by Namhyung in V2 > Feature test logic is removed in V2. 
Also added example > for build breakage when shellcheck fails in commit message Hi All, Looking for review comments on this patch Thanks Athira > > tools/perf/Makefile.perf| 20 +++- > tools/perf/tests/Makefile.tests | 24 > 2 files changed, 43 insertions(+), 1 deletion(-) > create mode 100644 tools/perf/tests/Makefile.tests > > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf > index 456872ac410d..bb49eb8b0d43 100644 > --- a/tools/perf/Makefile.perf > +++ b/tools/perf/Makefile.perf > @@ -134,6 +134,8 @@ include ../scripts/utilities.mak > # x86 instruction decoder - new instructions test > # > # Define GEN_VMLINUX_H to generate vmlinux.h from the BTF. > +# > +# Define
[PATCH V3] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
Add rule in new Makefile "tests/Makefile.tests" for running shellcheck on shell test scripts. This automates below shellcheck into the build. $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done Condition for shellcheck is added in Makefile.perf to avoid build breakage in the absence of shellcheck binary. Update Makefile.perf to contain new rule for "SHELLCHECK_TEST" which is for making shellcheck test as a dependency on perf binary. Added "tests/Makefile.tests" to run shellcheck on shellscripts in tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every time during make, shellcheck will be run only on modified files during subsequent invocations. By this, if any newly added shell scripts or fixes in existing scripts breaks coding/formatting style, it will get captured during the perf build. Example build failure with present scripts in tests/shell: INSTALL libsubcmd_headers INSTALL libperf_headers INSTALL libapi_headers INSTALL libsymbol_headers INSTALL libbpf_headers make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1 make[3]: *** Waiting for unfinished jobs make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/test_arm_coresight.dep] Error 1 make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/lock_contention.dep] Error 1 make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 make[1]: *** [Makefile.perf:238: sub-make] Error 2 make: *** [Makefile:70: all] Error 2 After this, for testing, changed "tests/shell/record.sh" to break shellcheck format. 
In the next make run, it is also captured: make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1 make[3]: *** Waiting for unfinished jobs make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record.dep] Error 1 make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/test_arm_coresight.dep] Error 1 make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/lock_contention.dep] Error 1 make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 make[1]: *** [Makefile.perf:238: sub-make] Error 2 make: *** [Makefile:70: all] Error 2 The exact failure log can be found in: output/tests/shell/record.dep.log file This is reported at build time. To be able to go ahead with the build or disable shellcheck even though it is known that some test is broken, add a "NO_SHELLCHECK" option. Example: make NO_LIBTRACEEVENT=1 NO_SHELLCHECK=1 INSTALL libsubcmd_headers INSTALL libsymbol_headers INSTALL libperf_headers INSTALL libapi_headers INSTALL libbpf_headers PERF_VERSION = 6.6.rc1.g7108a40e02ae GEN perf-iostat GEN perf-archive CC util/header.o LD util/perf-in.o LD perf-in.o LINKperf Signed-off-by: Athira Rajeev --- changelog: v2 -> v3: Add option "NO_SHELLCHECK". This will allow to go ahead with the build or disable shellcheck even though it is known that some test is broken v1 -> v2: Version1 had shellcheck in feature check which is not required since shellcheck is already a binary. Presence of binary can be checked using: $(shell command -v shellcheck) Addressed these changes as suggested by Namhyung in V2 Feature test logic is removed in V2. 
Also added example for build breakage when shellcheck fails in commit message tools/perf/Makefile.perf| 20 +++- tools/perf/tests/Makefile.tests | 24 2 files changed, 43 insertions(+), 1 deletion(-) create mode 100644 tools/perf/tests/Makefile.tests diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 456872ac410d..bb49eb8b0d43 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -134,6 +134,8 @@ include ../scripts/utilities.mak # x86 instruction decoder - new instructions test # # Define GEN_VMLINUX_H to generate vmlinux.h from the BTF. +# +# Define NO_SHELLCHECK if you do not want to run shellcheck during build # As per kernel Makefile, avoid funny character set dependencies unexport LC_ALL @@ -671,7 +673,22 @@ $(PERF_IN): prepare FORCE $(PMU_EVENTS_IN): FORCE prepare $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events obj=pmu-events -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) +# Runs shellcheck on perf test shell scripts + +SHELLCHECK := $(shell command -v shellcheck) +ifeq ($(NO_SHELLCHECK
[PATCH V2 0/3] Fix for shellcheck issues with latest scripts in tests/shell
shellcheck was run on the perf tool shell scripts as a prerequisite
to including a build option for shellcheck, discussed here:
https://www.spinics.net/lists/linux-perf-users/msg25553.html

Fixes for the coding/formatting issues were added in
two patchsets:
https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/

Three additional issues were observed, and their fixes are part of:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

With recent commits in perf, three more issues are observed.
shellcheck version: 0.6.0

The changes come from recent commits (mentioned in each patch)
to the lock_contention, record_sideband and stat_all_metricgroups
tests. This patch series fixes these test cases, and the patches are
on top of:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

The version 1 patchset had a fix for test_arm_coresight.sh.
It is dropped in V2 based on the discussion here:
https://lore.kernel.org/linux-perf-users/f265857d-0d37-4878-908c-d20732f21...@linux.vnet.ibm.com/T/#u

Athira Rajeev (3):
  tools/perf/tests Ignore the shellcheck SC2046 warning in lock_contention
  tools/perf/tests: Fix shellcheck warning in record_sideband.sh test
  tools/perf/tests/shell: Fix shellcheck warning SC2112 with stat_all_metricgroups

 tools/perf/tests/shell/lock_contention.sh       | 1 +
 tools/perf/tests/shell/record_sideband.sh       | 2 +-
 tools/perf/tests/shell/stat_all_metricgroups.sh | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

--
2.31.1
[PATCH V2 2/3] tools/perf/tests: Fix shellcheck warning in record_sideband.sh test
Running shellcheck on record_sideband.sh throws below warning:

In tests/shell/record_sideband.sh line 25:
    if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 >/dev/null
      ^--^ SC2069: To redirect stdout+stderr, 2>&1 must be last (or use '{ cmd > file; } 2>&1' to clarify).

This shows shellcheck warning SC2069 where the redirection
order needs to be fixed. Use "cmd > /dev/null 2>&1" to fix
the redirection of perf record output

Fixes: 23b97c7ee963 ("perf test: Add test case for record sideband events")
Signed-off-by: Athira Rajeev
Reviewed-by: Kajol Jain
---
changelog:
v1 -> v2:
Add Reviewed-by from Kajol
Used "cmd > /dev/null 2>&1" to fix the redirection warning
from shellcheck

 tools/perf/tests/shell/record_sideband.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/record_sideband.sh b/tools/perf/tests/shell/record_sideband.sh
index 5024a7ce0c51..ac70ac27d590 100755
--- a/tools/perf/tests/shell/record_sideband.sh
+++ b/tools/perf/tests/shell/record_sideband.sh
@@ -22,7 +22,7 @@ trap trap_cleanup EXIT TERM INT
 
 can_cpu_wide()
 {
-    if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 >/dev/null
+    if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true > /dev/null 2>&1
     then
         echo "record sideband test [Skipped cannot record cpu$1]"
         err=2
--
2.31.1
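The effect of the redirection order behind SC2069 can be seen with a small stand-alone script. This is a sketch only; the function noisy is a hypothetical stand-in for "perf record" that writes one line to each stream.

```shell
#!/bin/sh
# Hypothetical command that writes one line to stdout and one to stderr.
noisy() {
	echo "to stdout"
	echo "to stderr" >&2
}

# 2>&1 first: stderr is pointed at the current stdout (the capture pipe),
# and only afterwards is stdout sent to /dev/null, so stderr still leaks.
leaky=$(noisy 2>&1 >/dev/null)

# >/dev/null first: stdout goes to /dev/null, then stderr follows it there.
quiet=$(noisy >/dev/null 2>&1)

echo "2>&1 >/dev/null captured: '$leaky'"
echo ">/dev/null 2>&1 captured: '$quiet'"
```

Redirections are applied left to right, which is why only the second form silences both streams.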
[PATCH V2 1/3] tools/perf/tests Ignore the shellcheck SC2046 warning in lock_contention
Running shellcheck on lock_contention.sh generates below warning

In tests/shell/lock_contention.sh line 36:
    if [ `nproc` -lt 4 ]; then
         ^-^ SC2046: Quote this to prevent word splitting.

Here since nproc will generate a single word output
and there is no possibility of word splitting, this warning
can be ignored. Use exception for this with "disable" option
in shellcheck. This warning is observed after commit:
"commit 29441ab3a30a ("perf test lock_contention.sh: Skip test if not enough CPUs")"

Fixes: 29441ab3a30a ("perf test lock_contention.sh: Skip test if not enough CPUs")
Signed-off-by: Athira Rajeev
Reviewed-by: Kajol Jain
---
changelog:
v1 -> v2:
Add Reviewed-by from Kajol Jain

 tools/perf/tests/shell/lock_contention.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/tests/shell/lock_contention.sh b/tools/perf/tests/shell/lock_contention.sh
index d5a191d3d090..c1ec5762215b 100755
--- a/tools/perf/tests/shell/lock_contention.sh
+++ b/tools/perf/tests/shell/lock_contention.sh
@@ -33,6 +33,7 @@ check() {
         exit
     fi
 
+    # shellcheck disable=SC2046
     if [ `nproc` -lt 4 ]; then
         echo "[Skip] Low number of CPUs (`nproc`), lock event cannot be triggered certainly"
         err=2
--
2.31.1
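To make the SC2046 reasoning concrete, the sketch below shows what word splitting does to an unquoted command substitution whose output contains a space. Both helper names are hypothetical. With a command that prints a single word, as nproc does, the quoted and unquoted forms behave the same, which is the argument made in this patch for suppressing the warning.

```shell
#!/bin/sh
# Hypothetical helpers: one reports how many arguments it received,
# the other prints output that contains a space.
count_args() { echo "$#"; }
two_words() { echo "4 cpus"; }

# shellcheck disable=SC2046  # intentional, to show the splitting
unquoted=$(count_args $(two_words))
quoted=$(count_args "$(two_words)")

echo "unquoted substitution produced $unquoted argument(s)"
echo "quoted substitution produced $quoted argument(s)"
```

An alternative to the disable directive would be quoting the substitution, for example [ "$(nproc)" -lt 4 ], which is not what this patch does.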
[PATCH V2 3/3] tools/perf/tests/shell: Fix shellcheck warning SC2112 with stat_all_metricgroups
Running shellcheck on stat_all_metricgroups.sh reports below
warning:

In ./tests/shell/stat_all_metricgroups.sh line 7:
function ParanoidAndNotRoot()
^-- SC2112: 'function' keyword is non-standard. Delete it.

"function" is a non-standard (non-POSIX) keyword for
declaring functions. Fix this by removing the
"function" keyword from the ParanoidAndNotRoot function.

Signed-off-by: Athira Rajeev
---
Changelog:
This is a new patch added in V2

 tools/perf/tests/shell/stat_all_metricgroups.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/stat_all_metricgroups.sh b/tools/perf/tests/shell/stat_all_metricgroups.sh
index f3e305649e2c..55ef9c9ded2d 100755
--- a/tools/perf/tests/shell/stat_all_metricgroups.sh
+++ b/tools/perf/tests/shell/stat_all_metricgroups.sh
@@ -4,7 +4,7 @@
 
 set -e
 
-function ParanoidAndNotRoot()
+ParanoidAndNotRoot()
 {
   [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ]
 }
--
2.31.1
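The POSIX form of a function definition is the name followed by "()", with no "function" keyword. The sketch below is a hypothetical variant of ParanoidAndNotRoot that takes the uid and the paranoid level as arguments instead of reading them from the system, so it can be run anywhere.

```shell
#!/bin/sh
# POSIX-style definition: name followed by (), no "function" keyword.
# Hypothetical stand-in for ParanoidAndNotRoot; uid, paranoid level and
# limit are passed in rather than read from id(1) and /proc.
paranoid_and_not_root() {
	uid=$1
	paranoid=$2
	limit=$3
	[ "$uid" != 0 ] && [ "$paranoid" -gt "$limit" ]
}
```

For example, "paranoid_and_not_root 1000 2 0" succeeds, while passing uid 0 makes it fail regardless of the other values.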
Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
> On 12-Oct-2023, at 9:37 PM, Suzuki K Poulose wrote: > > Hi, > > On 12/10/2023 16:56, Athira Rajeev wrote: >>> On 05-Oct-2023, at 3:06 PM, Suzuki K Poulose wrote: >>> >>> On 05/10/2023 06:02, Namhyung Kim wrote: >>>> On Thu, Sep 28, 2023 at 9:11 PM Athira Rajeev >>>> wrote: >>>>> > > ... > >>> Thanks for the fix. >>> >>> Nothing to do with this patch, but I am wondering if the original patch >>> is over engineered and may not be future proof. >>> >>> e.g., >>> >>> cs_etm_dev_name() { >>> + cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* >>> -print -quit) >>> >>> Right there you got the device name and we can easily deduce the name of >>> the "ETM" node. >>> >>> e.g,: >>> etm=$(basename $(readlink cs_etm_path) | sed "s/[0-9]\+$//") >>> >>> And practically, nobody prevents an ETE mixed with an ETM on a "hybrid" >>> system (hopefully, no one builds it ;-)) >>> >>> Also, instead of hardcoding "ete" and "etm" prefixes from the arch part, >>> we should simply use the cpu nodes from : >>> >>> /sys/bus/event_source/devices/cs_etm/ >>> >>> e.g., >>> >>> arm_cs_etm_traverse_path_test() { >>> # Iterate for every ETM device >>> for c in /sys/bus/event_source/devices/cs_etm/cpu*; do >>> # Read the link to be on the safer side >>> dev=`readlink $c` >>> >>> # Find the ETM device belonging to which CPU >>> cpu=`cat $dev/cpu` >>> >>> # Use depth-first search (DFS) to iterate outputs >>> arm_cs_iterate_devices $dev $cpu >>> done; >>> } >>> >>> >>> >>>> You'd better add Coresight folks on this. >>>> Maybe this file was missing in the MAINTAINERS file. >>> >>> And the original author of the commit, that introduced the issue too. >>> >>> Suzuki >> Hi All, >> Thanks for the discussion and feedbacks. >> This patch fixes the shellcheck warning introduced in function >> "cs_etm_dev_name". But with the changes that Suzuki suggested, we won't need >> the function "cs_etm_dev_name" since the code will use >> "/sys/bus/event_source/devices/cs_etm/" . 
In that case, can I drop this >> patch for now from this series ? > > Yes, please. James will send out the proposed patch Hi Suzuki, Sure. Thanks! Athira > > Suzuki > >
Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
> On 05-Oct-2023, at 3:06 PM, Suzuki K Poulose wrote: > > On 05/10/2023 06:02, Namhyung Kim wrote: >> On Thu, Sep 28, 2023 at 9:11 PM Athira Rajeev >> wrote: >>> >>> Running shellcheck on tests/shell/test_arm_coresight.sh >>> throws below warnings: >>> >>> In tests/shell/test_arm_coresight.sh line 15: >>> cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name >>> cpu* -print -quit) >>> ^--^ SC2061: Quote the parameter to -name so the shell >>> won't interpret it. >>> >>> In tests/shell/test_arm_coresight.sh line 20: >>> if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = >>> "0xA13" ] ; then >>> ^-- SC2166: Prefer [ p ] && [ q ] as [ >>> p -a q ] is not well defined >>> >>> This warning is observed after commit: >>> "commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")" >>> >>> Fixed this issue by using quoting 'cpu*' for SC2061 and >>> using "&&" in line number 20 for SC2166 warning >>> >>> Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE") >>> Signed-off-by: Athira Rajeev > > Thanks for the fix. > > Nothing to do with this patch, but I am wondering if the original patch > is over engineered and may not be future proof. > > e.g., > > cs_etm_dev_name() { > + cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* -print > -quit) > > Right there you got the device name and we can easily deduce the name of > the "ETM" node. 
> > e.g,: > etm=$(basename $(readlink cs_etm_path) | sed "s/[0-9]\+$//") > > And practically, nobody prevents an ETE mixed with an ETM on a "hybrid" > system (hopefully, no one builds it ;-)) > > Also, instead of hardcoding "ete" and "etm" prefixes from the arch part, > we should simply use the cpu nodes from : > > /sys/bus/event_source/devices/cs_etm/ > > e.g., > > arm_cs_etm_traverse_path_test() { > # Iterate for every ETM device > for c in /sys/bus/event_source/devices/cs_etm/cpu*; do > # Read the link to be on the safer side > dev=`readlink $c` > > # Find the ETM device belonging to which CPU > cpu=`cat $dev/cpu` > > # Use depth-first search (DFS) to iterate outputs > arm_cs_iterate_devices $dev $cpu > done; > } > > > >> You'd better add Coresight folks on this. >> Maybe this file was missing in the MAINTAINERS file. > > And the original author of the commit, that introduced the issue too. > > Suzuki Hi All, Thanks for the discussion and feedbacks. This patch fixes the shellcheck warning introduced in function "cs_etm_dev_name". But with the changes that Suzuki suggested, we won't need the function "cs_etm_dev_name" since the code will use "/sys/bus/event_source/devices/cs_etm/" . In that case, can I drop this patch for now from this series ? 
Thanks Athira > >> Thanks, >> Namhyung >>> --- >>> tools/perf/tests/shell/test_arm_coresight.sh | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/tools/perf/tests/shell/test_arm_coresight.sh >>> b/tools/perf/tests/shell/test_arm_coresight.sh >>> index fe78c4626e45..f2115dfa24a5 100755 >>> --- a/tools/perf/tests/shell/test_arm_coresight.sh >>> +++ b/tools/perf/tests/shell/test_arm_coresight.sh >>> @@ -12,12 +12,12 @@ >>> glb_err=0 >>> >>> cs_etm_dev_name() { >>> - cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name >>> cpu* -print -quit) >>> + cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name >>> 'cpu*' -print -quit) >>> trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch) >>> archhver=$((($trcdevarch >> 12) & 0xf)) >>> archpart=$(($trcdevarch & 0xfff)) >>> >>> - if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] >>> ; then >>> + if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = >>> "0xA13" ] ; then >>> echo "ete" >>> else >>> echo "etm" >>> -- >>> 2.31.1
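For readers following the SC2166 part of the fix, the decode-and-compare step in cs_etm_dev_name() can be sketched on its own. This is only a minimal sketch, not the test script itself: the helper name cs_etm_kind is hypothetical, it takes the TRCDEVARCH value as an argument instead of reading it from sysfs, and the register values mentioned in the note below are illustrative assumptions rather than values taken from this thread.

```sh
#!/bin/sh
# Hypothetical helper (not part of the patch): classify a TRCDEVARCH value
# as "ete" or "etm" using the same bit fields as cs_etm_dev_name().
cs_etm_kind() {
    trcdevarch=$1

    # Bits [15:12] are compared against 5, bits [11:0] against 0xA13,
    # exactly as in the quoted diff.
    archhver=$(( ($trcdevarch >> 12) & 0xf ))
    archpart=$(( $trcdevarch & 0xfff ))

    # Two separate tests joined by && replace "[ p -a q ]" (SC2166).
    if [ "$archhver" -eq 5 ] && [ "$(printf '0x%X' "$archpart")" = "0xA13" ]; then
        echo "ete"
    else
        echo "etm"
    fi
}
```

Calling `cs_etm_kind 0x47705A13` takes the "ete" branch, while `cs_etm_kind 0x47704A13` falls through to "etm"; both inputs are made-up samples chosen to exercise the two branches.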
Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
> On 05-Oct-2023, at 9:15 PM, David Laight wrote: > > From: David Laight >> Sent: 05 October 2023 11:16 > ... >>> - cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* >>> -print -quit) >>> + cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' >>> -print -quit) >> >> Isn't the intention to get the shell to expand "cpu*"? >> So quoting it completely breaks the script. > > Complete brain-fade :-( Hi David, Yeah, the pattern will still be expanded when it is quoted. Thanks Athira
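The quoting question above (SC2061) can be reproduced without any CoreSight hardware. The following is a minimal sketch under stated assumptions: the function name demo_find_quoting and the file names cpu0 and cpu_local are hypothetical, and a temporary directory stands in for /sys/bus/event_source/devices/cs_etm/. It relies only on standard find behaviour: an unquoted pattern is expanded by the shell against the current directory before find runs, while a quoted pattern is passed through and matched by find itself.

```sh
#!/bin/sh
# Hypothetical demo: compare find -name with an unquoted and a quoted glob.
demo_find_quoting() {
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/tree" "$tmp/cwd"
    : > "$tmp/tree/cpu0"        # the file we actually want find to report
    : > "$tmp/cwd/cpu_local"    # a stray match in the current directory

    (
        cd "$tmp/cwd" || exit 1
        # Unquoted: the shell rewrites cpu* to cpu_local before find starts,
        # so find looks for the literal name "cpu_local" under tree/.
        # shellcheck disable=SC2061
        unquoted=$(find "$tmp/tree" -name cpu* -print | wc -l)
        # Quoted: find receives the pattern and does the matching itself.
        quoted=$(find "$tmp/tree" -name 'cpu*' -print | wc -l)
        echo "unquoted=$((unquoted)) quoted=$((quoted))"
    )
    rm -rf "$tmp"
}
```

In a directory with no file matching cpu*, both forms behave the same, which is presumably why the unquoted version worked in practice; the quoted form simply removes the dependency on the current directory's contents.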
Re: [PATCH V2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 09-Oct-2023, at 10:38 AM, Namhyung Kim wrote: > > Hello, > > Sorry for the late reply. > > On Thu, Oct 5, 2023 at 8:27 AM Athira Rajeev > wrote: >> >> >> >>> On 29-Sep-2023, at 12:19 PM, Athira Rajeev >>> wrote: >>> >>> Add rule in new Makefile "tests/Makefile.tests" for running >>> shellcheck on shell test scripts. This automates below shellcheck >>> into the build. >>> >>> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S >>> warning $F; done >>> >>> Condition for shellcheck is added in Makefile.perf to avoid build >>> breakage in the absence of shellcheck binary. Update Makefile.perf >>> to contain new rule for "SHELLCHECK_TEST" which is for making >>> shellcheck test as a dependency on perf binary. Added >>> "tests/Makefile.tests" to run shellcheck on shellscripts in >>> tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every >>> time during make, shellcheck will be run only on modified files >>> during subsequent invocations. By this, if any newly added shell >>> scripts or fixes in existing scripts breaks coding/formatting >>> style, it will get captured during the perf build. 
>>> >>> Example build failure with present scripts in tests/shell: >>> >>> INSTALL libsubcmd_headers >>> INSTALL libperf_headers >>> INSTALL libapi_headers >>> INSTALL libsymbol_headers >>> INSTALL libbpf_headers >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/record_sideband.dep] Error 1 >>> make[3]: *** Waiting for unfinished jobs >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/test_arm_coresight.dep] Error 1 >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/lock_contention.dep] Error 1 >>> make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 >>> make[1]: *** [Makefile.perf:238: sub-make] Error 2 >>> make: *** [Makefile:70: all] Error 2 >>> >>> After this, for testing, changed "tests/shell/record.sh" to >>> break shellcheck format. In the next make run, it is >>> also captured: > > Where can I see the actual failure messages? Hi Namhyung, Thanks for the review comments. 
Example, with current tree, we have these format issues: GENSKEL /root/athira/linux/tools/perf/util/bpf_skel/kwork_trace.skel.h CC bench/uprobe.o CC util/header.o LD bench/perf-in.o make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:17: output/tests/shell/stat_all_metricgroups.dep] Error 1 make[3]: *** Waiting for unfinished jobs make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1 CC util/bpf_counter.o CC util/bpf_counter_cgroup.o CC util/bpf_ftrace.o CC util/bpf_off_cpu.o CC util/bpf-filter.o make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:15: output/tests/shell/test_arm_coresight.dep] Error 1 make[3]: *** [/root/athira/linux/tools/perf/tests/Makefile.tests:15: output/tests/shell/lock_contention.dep] Error 1 make[2]: *** [Makefile.perf:679: SHELLCHECK_TEST] Error 2 make[2]: *** Waiting for unfinished jobs LD util/perf-in.o LD perf-in.o make[1]: *** [Makefile.perf:242: sub-make] Error 2 make: *** [Makefile:70: all] Error 2 The actual fail can be seen here: # cat output/tests/shell/stat_all_metricgroups.dep.log In ./tests/shell/stat_all_metricgroups.sh line 7: function ParanoidAndNotRoot() ^-- SC2112: 'function' keyword is non-standard. Delete it. For more information: https://www.shellcheck.net/wiki/SC2112 -- 'function' keyword is non-standar... 
> >>> >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/record_sideband.dep] Error 1 >>> make[3]: *** Waiting for unfinished jobs >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/record.dep] Error 1 >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/test_arm_coresight.dep] Error 1 >>> make[3]: *** >>> [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: >>> output/tests/shell/lock_contention.dep] Error 1 >>> make[2]: *** [Makefile.perf:675: SHELLCHECK_TE
[PATCH] tools/perf/arch/powerpc: Fix the CPU ID const char* value by adding 0x prefix
Simple expression parser test fails in powerpc as below: 4: Simple expression parser test child forked, pid 170385 Using CPUID 004e2102 division by zero syntax error syntax error FAILED tests/expr.c:65 parse test failed test child finished with -1 Simple expression parser: FAILED! This is observed after commit: 'commit 9d5da30e4ae9 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()")' With this commit, a new expression builtin strcmp_cpuid_str got added. This function takes an 'ID' type value, which is a string. So expression parse for strcmp_cpuid_str expects const char * as cpuid value type. In case of powerpc, CPU IDs are numbers. Hence it doesn't get interpreted correctly by bison parser. Example in case of power9, cpuid string returns as: 004e2102 cpuid of string type is expected in two cases: 1. char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused); Testcase "tests/expr.c" uses "perf_pmu__getcpuid" which calls get_cpuid_str to get the cpuid string. 2. cpuid field in :struct pmu_events_map struct pmu_events_map { const char *arch; const char *cpuid; Here cpuid field is used in "perf_pmu__find_events_table" function as "strcmp_cpuid_str(map->cpuid, cpuid)". The value for cpuid field is picked from mapfile.csv. 
Fix the mapfile.csv and get_cpuid_str function to prefix cpuid with 0x so that it gets correctly interpreted by the bison parser. Signed-off-by: Athira Rajeev --- tools/perf/arch/powerpc/util/header.c | 2 +- tools/perf/pmu-events/arch/powerpc/mapfile.csv | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c index c8d0dc775e5d..6b00efd53638 100644 --- a/tools/perf/arch/powerpc/util/header.c +++ b/tools/perf/arch/powerpc/util/header.c @@ -34,7 +34,7 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused) { char *bufp; - if (asprintf(&bufp, "%.8lx", mfspr(SPRN_PVR)) < 0) + if (asprintf(&bufp, "0x%.8lx", mfspr(SPRN_PVR)) < 0) bufp = NULL; return bufp; diff --git a/tools/perf/pmu-events/arch/powerpc/mapfile.csv b/tools/perf/pmu-events/arch/powerpc/mapfile.csv index a534ff6db14b..f4908af7ad66 100644 --- a/tools/perf/pmu-events/arch/powerpc/mapfile.csv +++ b/tools/perf/pmu-events/arch/powerpc/mapfile.csv @@ -13,6 +13,6 @@ # # Power8 entries -004[bcd][[:xdigit:]]{4},1,power8,core -004e[[:xdigit:]]{4},1,power9,core -0080[[:xdigit:]]{4},1,power10,core +0x004[bcd][[:xdigit:]]{4},1,power8,core +0x004e[[:xdigit:]]{4},1,power9,core +0x0080[[:xdigit:]]{4},1,power10,core -- 2.39.3
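The effect of the two halves of this patch (the new format string and the new mapfile.csv patterns) can be checked together in a few lines of shell. This is a rough sketch, not perf code: the helper names format_cpuid and cpuid_matches are hypothetical, and the anchored `grep -E` match is only assumed to approximate how perf compares the mapfile pattern with the cpuid string.

```sh
#!/bin/sh
# Hypothetical helper: format a PVR value the way the patched
# get_cpuid_str() does, i.e. "0x" followed by 8 zero-padded hex digits.
format_cpuid() {
    printf '0x%.8x\n' "$1"
}

# Hypothetical helper: succeed if cpuid string $1 fully matches the
# extended regular expression $2 (a mapfile.csv first column).
cpuid_matches() {
    printf '%s\n' "$1" | grep -Eq "^($2)\$"
}
```

With the power9 value quoted above, `format_cpuid 0x004e2102` yields a string that `cpuid_matches` accepts for the new `0x004e[[:xdigit:]]{4}` entry, whereas the old unprefixed string `004e2102` no longer matches it; this is why the header.c and mapfile.csv changes have to land together.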
Re: [PATCH V2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 29-Sep-2023, at 12:19 PM, Athira Rajeev > wrote: > > Add rule in new Makefile "tests/Makefile.tests" for running > shellcheck on shell test scripts. This automates below shellcheck > into the build. > > $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S > warning $F; done > > Condition for shellcheck is added in Makefile.perf to avoid build > breakage in the absence of shellcheck binary. Update Makefile.perf > to contain new rule for "SHELLCHECK_TEST" which is for making > shellcheck test as a dependency on perf binary. Added > "tests/Makefile.tests" to run shellcheck on shellscripts in > tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every > time during make, shellcheck will be run only on modified files > during subsequent invocations. By this, if any newly added shell > scripts or fixes in existing scripts breaks coding/formatting > style, it will get captured during the perf build. > > Example build failure with present scripts in tests/shell: > > INSTALL libsubcmd_headers > INSTALL libperf_headers > INSTALL libapi_headers > INSTALL libsymbol_headers > INSTALL libbpf_headers > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/record_sideband.dep] Error 1 > make[3]: *** Waiting for unfinished jobs > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/test_arm_coresight.dep] Error 1 > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/lock_contention.dep] Error 1 > make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 > make[1]: *** [Makefile.perf:238: sub-make] Error 2 > make: *** [Makefile:70: all] Error 2 > > After this, for testing, changed "tests/shell/record.sh" to > break shellcheck format. 
In the next make run, it is > also captured: > > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/record_sideband.dep] Error 1 > make[3]: *** Waiting for unfinished jobs > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/record.dep] Error 1 > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/test_arm_coresight.dep] Error 1 > make[3]: *** > [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: > output/tests/shell/lock_contention.dep] Error 1 > make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2 > make[1]: *** [Makefile.perf:238: sub-make] Error 2 > make: *** [Makefile:70: all] Error 2 > > Signed-off-by: Athira Rajeev > --- > Changelog: > v1 -> v2: > Version1 had shellcheck in feature check which is not > required since shellcheck is already a binary. Presence > of binary can be checked using: > $(shell command -v shellcheck) > Addressed these changes as suggested by Namhyung in V2 > Feature test logic is removed in V2. 
Also added example > for build breakage when shellcheck fails in commit message Hi All, Looking for feedback on this patch Thanks Athira > > tools/perf/Makefile.perf| 14 +- > tools/perf/tests/Makefile.tests | 24 > 2 files changed, 37 insertions(+), 1 deletion(-) > create mode 100644 tools/perf/tests/Makefile.tests > > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf > index 98604e396ac3..56a66ca253ab 100644 > --- a/tools/perf/Makefile.perf > +++ b/tools/perf/Makefile.perf > @@ -667,7 +667,18 @@ $(PERF_IN): prepare FORCE > $(PMU_EVENTS_IN): FORCE prepare > $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events > obj=pmu-events > > -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) > +# Runs shellcheck on perf test shell scripts > + > +SHELLCHECK := $(shell command -v shellcheck) > +ifneq ($(SHELLCHECK),) > +SHELLCHECK_TEST: FORCE prepare > + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests > +else > +SHELLCHECK_TEST: > + @: > +endif > + > +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST > $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \ > $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@ > > @@ -1130,6 +1141,7 @@ bpf-skel-clean: > $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) > > clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean > $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean > tests-coresight-targets-clean > + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean > $(
Re: [PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
> On 05-Oct-2023, at 1:50 PM, James Clark wrote: > > > > On 29/09/2023 05:11, Athira Rajeev wrote: >> Running shellcheck on tests/shell/test_arm_coresight.sh >> throws below warnings: >> >> In tests/shell/test_arm_coresight.sh line 15: >> cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* -print >> -quit) >> ^--^ SC2061: Quote the parameter to -name so the shell >> won't interpret it. >> >> In tests/shell/test_arm_coresight.sh line 20: >> if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then >> ^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q ] >> is not well defined >> >> This warning is observed after commit: >> "commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")" >> >> Fixed this issue by using quoting 'cpu*' for SC2061 and >> using "&&" in line number 20 for SC2166 warning >> >> Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE") >> Signed-off-by: Athira Rajeev >> --- >> tools/perf/tests/shell/test_arm_coresight.sh | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/tools/perf/tests/shell/test_arm_coresight.sh >> b/tools/perf/tests/shell/test_arm_coresight.sh >> index fe78c4626e45..f2115dfa24a5 100755 >> --- a/tools/perf/tests/shell/test_arm_coresight.sh >> +++ b/tools/perf/tests/shell/test_arm_coresight.sh >> @@ -12,12 +12,12 @@ >> glb_err=0 >> >> cs_etm_dev_name() { >> - cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* >> -print -quit) >> + cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' >> -print -quit) >> trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch) >> archhver=$((($trcdevarch >> 12) & 0xf)) >> archpart=$(($trcdevarch & 0xfff)) >> >> - if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then >> + if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; >> then >> echo "ete" >> else >> echo "etm" > > > Reviewed-by: James Clark Thanks James for checking Athira
Re: [PATCH 3/3] tools/perf/tests: Fix shellcheck warning in record_sideband.sh test
> On 05-Oct-2023, at 10:34 AM, Namhyung Kim wrote: > > On Thu, Sep 28, 2023 at 9:11 PM Athira Rajeev > wrote: >> >> Running shellcheck on record_sideband.sh throws below >> warning: >> >>In tests/shell/record_sideband.sh line 25: >> if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 >> >/dev/null >>^--^ SC2069: To redirect stdout+stderr, 2>&1 must be last (or use >> '{ cmd > file; } 2>&1' to clarify). >> >> This shows shellcheck warning SC2069 where the redirection >> order needs to be fixed. Use { cmd > file; } 2>&1 to fix the >> redirection of perf record output >> >> Fixes: 23b97c7ee963 ("perf test: Add test case for record sideband events") >> Signed-off-by: Athira Rajeev >> --- >> tools/perf/tests/shell/record_sideband.sh | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tools/perf/tests/shell/record_sideband.sh >> b/tools/perf/tests/shell/record_sideband.sh >> index 5024a7ce0c51..7e036763a43c 100755 >> --- a/tools/perf/tests/shell/record_sideband.sh >> +++ b/tools/perf/tests/shell/record_sideband.sh >> @@ -22,7 +22,7 @@ trap trap_cleanup EXIT TERM INT >> >> can_cpu_wide() >> { >> -if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 >> >/dev/null >> +if ! { perf record -o ${perfdata} -BN --no-bpf-event -C $1 true > >> /dev/null; } 2>&1 > > I think we usually go without braces. Hi Namhyung, Thanks for reviewing. I will fix this in V2. Thanks Athira > > Thanks, > Namhyung > > >> then >> echo "record sideband test [Skipped cannot record cpu$1]" >> err=2 >> -- >> 2.31.1
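Since the discussion above turns on redirection order, here is a small sketch of what each spelling does. It is an illustration only: noisy is a hypothetical stand-in for the perf record command, and the sketch does not try to decide which behaviour record_sideband.sh actually wants.

```sh
#!/bin/sh
# Hypothetical stand-in for a command that writes to both streams.
noisy() {
    echo "out"
    echo "err" >&2
}

# Original form: stderr is first pointed at the current stdout, then
# stdout alone is sent to /dev/null, so stderr is still visible.
stderr_first=$(noisy 2>&1 >/dev/null)

# Braced form suggested by SC2069: same behaviour as above, with the
# intent spelled out.
braced=$( { noisy >/dev/null; } 2>&1 )

# Form without braces: stdout goes to /dev/null first, then stderr
# follows it, so both streams are discarded.
stdout_first=$(noisy >/dev/null 2>&1)
```

After running this, `stderr_first` and `braced` both hold "err" while `stdout_first` is empty, so the braced rewrite keeps the original behaviour and the form without braces silences the command completely.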
Re: [PATCH V5 1/3] tools/perf: Add text_end to "struct dso" to save .text section size
> On 03-Oct-2023, at 9:58 AM, Namhyung Kim wrote: > > Hello, > > On Thu, Sep 28, 2023 at 12:52 AM Athira Rajeev > wrote: >> >> Update "struct dso" to include new member "text_end". >> This new field will represent the offset for end of text >> section for a dso. For elf, this value is derived as: >> sh_size (Size of section in bytes) + sh_offset (Section file >> offset) of the elf header for text. >> >> For bfd, this value is derived as: >> 1. For PE file, >> section->size + ( section->vma - dso->text_offset) >> 2. Other cases: >> section->filepos (file position) + section->size (size of >> section) >> >> To resolve the address from a sample, perf looks at the >> DSO maps. In case of address from a kernel module, there >> were some addresses found to be not resolved. This was >> observed while running perf test for "Object code reading". >> Though the ip falls between the start address of the loaded >> module (perf map->start ) and end address ( perf map->end), >> it was unresolved. >> >> Example: >> >>Reading object code for memory address: 0xc00807f0142c >>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>On file address is: 0x1114cc >>Objdump command is: objdump -z -d --start-address=0x11142c >> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>objdump read too few bytes: 128 >>test child finished with -1 >> >> Here, module is loaded at: >># cat /proc/modules | grep xfs >>xfs 2228224 3 - Live 0xc00807d0 >> >> From objdump for xfs module, text section is: >>text 0010f7bc 00a0 2**4 >> >> Here the offset for 0xc00807f0142c ie 0x112074 falls outside the >> .text section which is up to 0x10f7bc. >> >> In this case for module, the address 0xc00807e11fd4 is pointing >> to stub instructions. This address range represents the module stubs >> which are allocated on module load and hence is not part of DSO offset. >> >> To identify such address, which falls out of text >> section and within module end, added the new field "text_end" to >> "struct dso".
>> >> Reported-by: Disha Goel >> Signed-off-by: Athira Rajeev >> Reviewed-by: Adrian Hunter >> Reviewed-by: Kajol Jain > > For the series, > Acked-by: Namhyung Kim > > Thanks, > Namhyung Thanks for checking Namhyung, Athira > > >> --- >> Changelog: >> v2 -> v3: >> Added Reviewed-by from Adrian >> >> v1 -> v2: >> Added text_end for bfd also by updating dso__load_bfd_symbols >> as suggested by Adrian. >> >> tools/perf/util/dso.h| 1 + >> tools/perf/util/symbol-elf.c | 4 +++- >> tools/perf/util/symbol.c | 2 ++ >> 3 files changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h >> index b41c9782c754..70fe0fe69bef 100644 >> --- a/tools/perf/util/dso.h >> +++ b/tools/perf/util/dso.h >> @@ -181,6 +181,7 @@ struct dso { >>u8 rel; >>struct build_id bid; >>u64 text_offset; >> + u64 text_end; >>const char *short_name; >>const char *long_name; >>u16 long_name_len; >> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c >> index 95e99c332d7e..9e7eeaf616b8 100644 >> --- a/tools/perf/util/symbol-elf.c >> +++ b/tools/perf/util/symbol-elf.c >> @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map >> *map, struct symsrc *syms_ss, >>} >> >>if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr, >> - ".text", NULL)) >> + ".text", NULL)) { >>dso->text_offset = tshdr.sh_addr - tshdr.sh_offset; >> + dso->text_end = tshdr.sh_offset + tshdr.sh_size; >> + } >> >>if (runtime_ss->opdsec) >>opddata = elf_rawdata(runtime_ss->opdsec, NULL); >> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c >> index 3f36675b7c8f..f25e4e62cf25 100644 >> --- a/tools/perf/util/symbol.c >> +++ b/tools/perf/util/symbol.c >> @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char >> *debugfile) >>/* PE symbols can only have 4 bytes, so use .text >> high bits */ >>dso->text_offset = section->vma - (u32)section->vma; >>dso->text_offset += >> (u32)bfd_asymbol_value(symbols[i]); >> + 
dso->text_end =
(section->vma - dso->text_offset) + >> section->size; >>} else { >>dso->text_offset = section->vma - section->filepos; >> + dso->text_end = section->filepos + section->size; >>} >>} >> >> -- >> 2.31.1
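The ELF case of the text_end computation is plain arithmetic and can be sketched in shell. This is a minimal sketch under stated assumptions: the helper name offset_vs_text_end is hypothetical, the sample numbers in the note below are illustrative readings of the garbled objdump line rather than verified values, and how perf consumes text_end is not shown in this message, so only the upper bound is checked.

```sh
#!/bin/sh
# Hypothetical helper: report whether file offset $1 lies before the end
# of .text, where text_end = sh_offset ($2) + sh_size ($3), as in the
# symbol-elf.c hunk above.
offset_vs_text_end() {
    off=$(( $1 ))
    text_end=$(( $2 + $3 ))

    if [ "$off" -lt "$text_end" ]; then
        echo "inside"
    else
        echo "outside"
    fi
}
```

Assuming a section file offset of 0xa0 and a size of 0x10f7bc, `offset_vs_text_end 0x112074 0xa0 0x10f7bc` reports "outside", which corresponds to the stub address case described in the commit message, while an offset such as 0x1000 reports "inside".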
Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"
> On 28-Sep-2023, at 9:24 AM, Namhyung Kim wrote: > > On Tue, Sep 26, 2023 at 9:29 PM Athira Rajeev > wrote: >> >> >> >>> On 25-Sep-2023, at 1:34 PM, kajoljain wrote: >>> >>> >>> >>> On 9/7/23 22:45, Athira Rajeev wrote: >>>> From: root >>>> >>>> shellcheck was run on perf tool shell scripts as a pre-requisite >>>> to include a build option for shellcheck discussed here: >>>> https://www.spinics.net/lists/linux-perf-users/msg25553.html >>>> >>>> And fixes were added for the coding/formatting issues in >>>> two patchsets: >>>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/ >>>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/ >>>> >>>> Three additional issues are observed with shellcheck "0.6" and >>>> this patchset covers those. With this patchset, >>>> >>>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S >>>> warning $F; done >>>> # echo $? >>>> 0 >>>> >>> >>> Patchset looks good to me. >>> >>> Reviewed-by: Kajol Jain >>> >>> Thanks, >>> Kajol Jain >>> >> >> Hi Namhyung, >> >> Can you please check this patchset also? > > Sure, it's applied to perf-tools-next, thanks! Thanks Namhyung Athira
Re: [PATCH V3] perf test: Fix parse-events tests to skip parametrized events
> On 30-Sep-2023, at 11:23 AM, Namhyung Kim wrote: > > On Wed, Sep 27, 2023 at 11:17 AM Athira Rajeev > wrote: >> >> Testcase "Parsing of all PMU events from sysfs" parses events for >> all PMUs, and not just cpu. In case of powerpc, the PowerVM >> environment supports events from hv_24x7 and hv_gpci PMU which >> are of a format like the examples below: >> >> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/ >> - hv_gpci/event,partition_id=?/ >> >> The value for "?" needs to be filled in depending on system >> configuration. It is better to skip these parametrized events >> in this test as it is done in: >> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip >> parametrized events")' which handled a similar instance with >> "all PMU test". >> >> Fix parse-events test to skip parametrized events since >> it needs proper setup of the parameters. >> >> Signed-off-by: Athira Rajeev >> Tested-by: Ian Rogers >> Tested-by: Sachin Sant >> Reviewed-by: Kajol Jain > > Applied to perf-tools-next, thanks! Thanks Namhyung, Athira
[PATCH 1/2] powerpc/platforms/pseries: Fix STK_PARAM access in the hcall tracing code
In powerpc pseries system, below behaviour is observed while enabling tracing on hcall: # cd /sys/kernel/debug/tracing/ # cat events/powerpc/hcall_exit/enable 0 # echo 1 > events/powerpc/hcall_exit/enable # ls -bash: fork: Bad address Above is from power9 lpar with latest kernel. Past this, softlockup is observed. Initially while attempting via perf_event_open to use "PERF_TYPE_TRACEPOINT", kernel panic was observed. perf config used: memset(&pe[1],0,sizeof(struct perf_event_attr)); pe[1].type=PERF_TYPE_TRACEPOINT; pe[1].size=96; pe[1].config=0x26ULL; /* 38 raw_syscalls/sys_exit */ pe[1].sample_type=0; /* 0 */ pe[1].read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP|0x10ULL; /* 1f */ pe[1].inherit=1; pe[1].precise_ip=0; /* arbitrary skid */ pe[1].wakeup_events=0; pe[1].bp_type=HW_BREAKPOINT_EMPTY; pe[1].config1=0x1ULL; Kernel panic logs: == Kernel attempted to read user page (8) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x0008 Faulting instruction address: 0xc04c2814 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: nfnetlink bonding tls rfkill sunrpc dm_service_time dm_multipath pseries_rng xts vmx_crypto xfs libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg ibmvfc scsi_transport_fc ibmveth dm_mirror dm_region_hash dm_log dm_mod fuse CPU: 0 PID: 1431 Comm: login Not tainted 6.4.0+ #1 Hardware name: IBM,8375-42A POWER9 (raw) 0x4e0202 0xf05 of:IBM,FW950.30 (VL950_892) hv:phyp pSeries NIP [c04c2814] page_remove_rmap+0x44/0x320 LR [c049c2a4] wp_page_copy+0x384/0xec0 Call Trace: [c98c7ad0] [c0001416e400] 0xc0001416e400 (unreliable) [c98c7b20] [c049c2a4] wp_page_copy+0x384/0xec0 [c98c7bf0] [c04a4f64] __handle_mm_fault+0x9d4/0xfb0 [c98c7cf0] [c04a5630] handle_mm_fault+0xf0/0x350 [c98c7d40] [c0094e8c] ___do_page_fault+0x48c/0xc90 [c98c7df0] [c00958a0] hash__do_page_fault+0x30/0x70 [c98c7e20] [c009e244]
do_hash_fault+0x1a4/0x330 [c98c7e50] [c0008918] data_access_common_virt+0x198/0x1f0 --- interrupt: 300 at 0x7fffae971abc git bisect tracked this down to below commit: 'commit baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead")' This commit changed STACK_FRAME_OVERHEAD (112 ) to STACK_FRAME_MIN_SIZE (32 ) since 32 bytes is the minimum size for ELFv2 stack. With the latest kernel, when running on ELFv2, STACK_FRAME_MIN_SIZE is used to allocate stack size. During plpar_hcall_trace, first call is made to HCALL_INST_PRECALL which saves the registers and allocates new stack frame. In the plpar_hcall_trace code, STK_PARAM is accessed at two places. 1. To save r4: std r4,STK_PARAM(R4)(r1) 2. To access r4 back: ld r12,STK_PARAM(R4)(r1) HCALL_INST_PRECALL precall allocates a new stack frame. So all the stack parameter access after the precall, needs to be accessed with +STACK_FRAME_MIN_SIZE. So the store instruction should be: std r4,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1) If the "std" is not updated with STACK_FRAME_MIN_SIZE, we will end up with overwriting stack contents and cause corruption. But instead of updating 'std', we can instead remove it since HCALL_INST_PRECALL already saves it to the correct location. similarly load instruction should be: ld r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1) Fix the load instruction to correctly access the stack parameter with +STACK_FRAME_MIN_SIZE and remove the store of r4 since the precall saves it correctly. 
Cc: sta...@vger.kernel.org Fixes: baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead") Co-developed-by: Naveen N Rao Signed-off-by: Naveen N Rao Signed-off-by: Athira Rajeev --- arch/powerpc/platforms/pseries/hvCall.S | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S index bae45b358a09..2addf2ea03f0 100644 --- a/arch/powerpc/platforms/pseries/hvCall.S +++ b/arch/powerpc/platforms/pseries/hvCall.S @@ -184,7 +184,6 @@ _GLOBAL_TOC(plpar_hcall) plpar_hcall_trace: HCALL_INST_PRECALL(R5) - std r4,STK_PARAM(R4)(r1) mr r0,r4 mr r4,r5 @@ -196,7 +195,7 @@ plpar_hcall_trace: HVSC - ld r12,STK_PARAM(R4)(r1) + ld r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1) std r4,0(r12) std r5,8(r12) std r6,16(r12) @
[PATCH 2/2] powerpc/platforms/pseries: Remove unused r0 in the hcall tracing code
In the plpar_hcall trace code, currently we use r0 to store the ORed result of r4. But this value is not used subsequently in the code. Hence remove this unused save to r0 in plpar_hcall and plpar_hcall9 Suggested-by: Naveen N Rao Signed-off-by: Athira Rajeev --- arch/powerpc/platforms/pseries/hvCall.S | 4 1 file changed, 4 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S index 2addf2ea03f0..2b0cac6fb61f 100644 --- a/arch/powerpc/platforms/pseries/hvCall.S +++ b/arch/powerpc/platforms/pseries/hvCall.S @@ -184,8 +184,6 @@ _GLOBAL_TOC(plpar_hcall) plpar_hcall_trace: HCALL_INST_PRECALL(R5) - mr r0,r4 - mr r4,r5 mr r5,r6 mr r6,r7 @@ -295,8 +293,6 @@ _GLOBAL_TOC(plpar_hcall9) plpar_hcall9_trace: HCALL_INST_PRECALL(R5) - mr r0,r4 - mr r4,r5 mr r5,r6 mr r6,r7 -- 2.39.3
[PATCH V2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
Add rule in new Makefile "tests/Makefile.tests" for running
shellcheck on shell test scripts. This automates below shellcheck
into the build.

	$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done

Condition for shellcheck is added in Makefile.perf to avoid build
breakage in the absence of shellcheck binary. Update Makefile.perf
to contain new rule for "SHELLCHECK_TEST" which is for making
shellcheck test as a dependency on perf binary. Added
"tests/Makefile.tests" to run shellcheck on shellscripts in
tests/shell. The make rule "SHELLCHECK_RUN" ensures that, every
time during make, shellcheck will be run only on modified files
during subsequent invocations. By this, if any newly added shell
scripts or fixes in existing scripts break coding/formatting
style, it will get captured during the perf build.

Example build failure with present scripts in tests/shell:

	INSTALL libsubcmd_headers
	INSTALL libperf_headers
	INSTALL libapi_headers
	INSTALL libsymbol_headers
	INSTALL libbpf_headers
	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1
	make[3]: *** Waiting for unfinished jobs
	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/test_arm_coresight.dep] Error 1
	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/lock_contention.dep] Error 1
	make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
	make[1]: *** [Makefile.perf:238: sub-make] Error 2
	make: *** [Makefile:70: all] Error 2

After this, for testing, changed "tests/shell/record.sh" to
break shellcheck format. In the next make run, it is
also captured:

	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record_sideband.dep] Error 1
	make[3]: *** Waiting for unfinished jobs
	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/record.dep] Error 1
	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/test_arm_coresight.dep] Error 1
	make[3]: *** [/root/athira/namhyung/perf-tools-next/tools/perf/tests/Makefile.tests:17: output/tests/shell/lock_contention.dep] Error 1
	make[2]: *** [Makefile.perf:675: SHELLCHECK_TEST] Error 2
	make[1]: *** [Makefile.perf:238: sub-make] Error 2
	make: *** [Makefile:70: all] Error 2

Signed-off-by: Athira Rajeev
---
Changelog:
v1 -> v2:
Version1 had shellcheck in feature check which is not
required since shellcheck is already a binary. Presence
of binary can be checked using:
$(shell command -v shellcheck)
Addressed these changes as suggested by Namhyung in V2.
Feature test logic is removed in V2.
Also added example for build breakage when shellcheck
fails in commit message.

 tools/perf/Makefile.perf        | 14 +++++++++++++-
 tools/perf/tests/Makefile.tests | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/Makefile.tests

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 98604e396ac3..56a66ca253ab 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -667,7 +667,18 @@ $(PERF_IN): prepare FORCE
 $(PMU_EVENTS_IN): FORCE prepare
 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events obj=pmu-events
 
-$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN)
+# Runs shellcheck on perf test shell scripts
+
+SHELLCHECK := $(shell command -v shellcheck)
+ifneq ($(SHELLCHECK),)
+SHELLCHECK_TEST: FORCE prepare
+	$(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests
+else
+SHELLCHECK_TEST:
+	@:
+endif
+
+$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST
 	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \
 		$(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
 
@@ -1130,6 +1141,7 @@ bpf-skel-clean:
 	$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS)
 
 clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean tests-coresight-targets-clean
+	$(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean
 	$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS)
 	$(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
 	$(Q)$(RM) $(OUTPUT).config-detected
diff --git a/tools/perf/tests/Makefile.tests b/tools/perf/tests/Makefile.tests
new file mode 100644
index ..8011e99768a3
--- /dev/null
+++ b/tools/perf/
Re: [PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
> On 27-Sep-2023, at 8:25 PM, Athira Rajeev wrote: > > > >> On 27-Sep-2023, at 5:45 AM, Namhyung Kim wrote: >> >> On Thu, Sep 14, 2023 at 10:40 PM Athira Rajeev >> wrote: >>> >>> The testcase "Object code reading" fails in somecases >>> for "fs_something" sub test as below: >>> >>> Reading object code for memory address: 0xc00807f0142c >>> File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>> On file address is: 0x1114cc >>> Objdump command is: objdump -z -d --start-address=0x11142c >>> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>> objdump read too few bytes: 128 >>> test child finished with -1 >>> >>> This can alo be reproduced when running perf record with >>> workload that exercises fs_something() code. In the test >>> setup, this is exercising xfs code since root is xfs. >>> >>> # perf record ./a.out >>> # perf report -v |grep "xfs.ko" >>> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>> 0xc00807de5efc B [k] xlog_cil_commit >>> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>> 0xc00807d5ae18 B [k] xfs_btree_key_offset >>> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>> 0xc00807e11fd4 B [k] 0x00112074 >>> >>> Here addr "0xc00807e11fd4" is not resolved. since this is a >>> kernel module, its offset is from the DSO. Xfs module is loaded >>> at 0xc00807d0 >>> >>> # cat /proc/modules | grep xfs >>> xfs 2228224 3 - Live 0xc00807d0 >>> >>> And size is 0x22. So its loaded between 0xc00807d0 >>> and 0xc00807f2. From objdump, text section is: >>> text 0010f7bc 00a0 2**4 >>> >>> Hence perf captured ip maps to 0x112074 which is: >>> ( ip - start of module ) + a0 >>> >>> This offset 0x112074 falls out .text section which is up to 0x10f7bc >>> In this case for module, the address 0xc00807e11fd4 is pointing >>> to stub instructions. This address range represents the module stubs >>> which is allocated on module load and hence is not part of DSO offset. 
>>> >>> To address this issue in "object code reading", skip the sample if >>> address falls out of text section and is within the module end. >>> Use the "text_end" member of "struct dso" to do this check. >>> >>> To address this issue in "perf report", exploring an option of >>> having stubs range as part of the /proc/kallsyms, so that perf >>> report can resolve addresses in stubs range >>> >>> However this patch uses text_end to skip the stub range for >>> Object code reading testcase. >>> >>> Reported-by: Disha Goel >>> Signed-off-by: Athira Rajeev >>> Tested-by: Disha Goel >>> Reviewed-by: Adrian Hunter >>> --- >>> Changelog: >>> v3 -> v4: >>> Fixed indent in V3 >>> >>> v2 -> v3: >>> Used strtailcmp in comparison for module check and added Reviewed-by >>> from Adrian, Tested-by from Disha. >>> >>> v1 -> v2: >>> Updated comment to add description on which arch has stub and >>> reason for skipping as suggested by Adrian >>> >>> tools/perf/tests/code-reading.c | 10 ++ >>> 1 file changed, 10 insertions(+) >>> >>> diff --git a/tools/perf/tests/code-reading.c >>> b/tools/perf/tests/code-reading.c >>> index ed3815163d1b..9e6e6c985840 100644 >>> --- a/tools/perf/tests/code-reading.c >>> +++ b/tools/perf/tests/code-reading.c >>> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 >>> cpumode, >>> if (addr + len > map__end(al.map)) >>> len = map__end(al.map) - addr; >>> >>> + /* >>> +* Some architectures (ex: powerpc) have stubs (trampolines) in >>> kernel >>> +* modules to manage long jumps. Check if the ip offset falls in >>> stubs >>> +* sections for kernel modules. And skip module address after text >>> end >>> +*/ >>> + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) { >> >> There's a is_kernel_module() that can check compressed modules >> too but I think we need a simpler way to check it like dso->kernel. >> >> Thanks, >> Namhyung > > Thanks for the comment Namhyung. 
I will add similar to dso->kernel, another > field check in next version of patchset > > Athira Hi Namhyung, I have posted a V5 for this: https://lore.kernel.org/linux-perf-users/20230928075213.84392-1-atraj...@linux.vnet.ibm.com/T/#t Thanks Athira >> >> >>> + pr_debug("skipping the module address %#"PRIx64" after text >>> end\n", al.addr); >>> + goto out; >>> + } >>> + >>> /* Read the object code using perf */ >>> ret_len = dso__data_read_offset(dso, >>> maps__machine(thread__maps(thread)), >>> al.addr, buf1, len); >>> -- >>> 2.31.1
Re: [PATCH 2/2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 27-Sep-2023, at 9:55 AM, Athira Rajeev wrote: > > > >> On 27-Sep-2023, at 5:25 AM, Namhyung Kim wrote: >> >> On Thu, Sep 14, 2023 at 10:18 AM Athira Rajeev >> wrote: >>> >>> Add rule in new Makefile "tests/Makefile.tests" for running >>> shellcheck on shell test scripts. This automates below shellcheck >>> into the build. >>> >>> $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do >>> shellcheck -S warning $F; done >> >> I think you can do it if $(shell command -v shellcheck) returns >> non-empty string (the path to the shellcheck). Then the feature >> test logic can be gone. > > Ok, I will try this. >> >>> >>> CONFIG_SHELLCHECK check is added to avoid build breakage in >>> the absence of shellcheck binary. Update Makefile.perf to contain >>> new rule for "SHELLCHECK_TEST" which is for making shellcheck >>> test as a dependency on perf binary. Added "tests/Makefile.tests" >>> to run shellcheck on shellscripts in tests/shell. The make rule >>> "SHLLCHECK_RUN" ensures that, every time during make, shellcheck >>> will be run only on modified files during subsequent invocations. >>> By this, if any newly added shell scripts or fixes in existing >>> scripts breaks coding/formatting style, it will get captured >>> during the perf build. >> >> Can you show me the example output? > > Sure, I will add it. 
>> >>> >>> Signed-off-by: Athira Rajeev >>> --- >>> tools/perf/Makefile.perf| 12 +++- >>> tools/perf/tests/Makefile.tests | 24 >>> 2 files changed, 35 insertions(+), 1 deletion(-) >>> create mode 100644 tools/perf/tests/Makefile.tests >>> >>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf >>> index f6fdc2d5a92f..c27f54771e90 100644 >>> --- a/tools/perf/Makefile.perf >>> +++ b/tools/perf/Makefile.perf >>> @@ -667,7 +667,16 @@ $(PERF_IN): prepare FORCE >>> $(PMU_EVENTS_IN): FORCE prepare >>> $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events >>> obj=pmu-events >>> >>> -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) >>> +# Runs shellcheck on perf test shell scripts >>> +ifeq ($(CONFIG_SHELLCHECK),y) >>> +SHELLCHECK_TEST: FORCE prepare >>> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests >>> +else >>> +SHELLCHECK_TEST: >>> + @: >>> +endif >>> + >>> +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST >>> $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \ >>> $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@ >>> >>> @@ -1129,6 +1138,7 @@ bpf-skel-clean: >>> $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) >>> >>> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean >>> $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean >>> bpf-skel-clean tests-coresight-targets-clean >>> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean >>> $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) >>> $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS) >>> $(Q)find $(or $(OUTPUT),.) 
-name '*.o' -delete -o -name '\.*.cmd' >>> -delete -o -name '\.*.d' -delete >>> $(Q)$(RM) $(OUTPUT).config-detected >>> diff --git a/tools/perf/tests/Makefile.tests >>> b/tools/perf/tests/Makefile.tests >>> new file mode 100644 >>> index ..e74575559e83 >>> --- /dev/null >>> +++ b/tools/perf/tests/Makefile.tests >>> @@ -0,0 +1,24 @@ >>> +# SPDX-License-Identifier: GPL-2.0 >>> +# Athira Rajeev , 2023 >>> +-include $(OUTPUT).config-detected >>> + >>> +log_file = $(OUTPUT)shellcheck_test.log >>> +PROGS = $(subst ./,,$(shell find tests/shell -perm -o=x -type f -name >>> '*.sh')) >>> +DEPS = $(addprefix output/,$(addsuffix .dep,$(basename $(PROGS >>> +DIRS = $(shell echo $(dir $(DEPS)) | xargs -n1 | sort -u | xargs) >>> + >>> +.PHONY: all >>> +all: SHELLCHECK_RUN >>> + @: >>> + >>> +SHELLCHECK_RUN: $(DEPS) $(DIRS) >>> + >>> +output/%.dep: %.sh | $(DIRS) >>> + $(call rule_mkdir) >>> + $(Q)$(call frecho-cmd,test)@touch $@ >>> + $(Q)$(call frecho-cmd,test)@shellcheck -S warning $(subst >>> output/,./,$(patsubst %.dep, %.sh, $@)) 1> ${log_file} && ([[ ! -s >>> ${log_file} ]]) >> >> This line is too long, please wrap it with some backslashes. > Ok > > I will address all the comments in next version Hi Namhyung, While working on V2 for the Makefile changes and testing, came across three issues with latest scripts in perf-tools-next. I have addressed those in below patchset: https://lore.kernel.org/linux-perf-users/20230929041133.95355-1-atraj...@linux.vnet.ibm.com/T/#m7b3dc8a96467058e1b392183190baed47ae0eb75 [PATCH 0/3] Fix for shellcheck issues with latest scripts in tests/shell For the Makefile.perf changes, I will send V2 separately addressing review comments Thanks Athira > > Thanks > Athira >> >> Thanks, >> Namhyung >> >> >>> +$(DIRS): >>> + @mkdir -p $@ >>> + >>> +clean: >>> + @rm -rf $(log_file) output >>> -- >>> 2.31.1
[PATCH 3/3] tools/perf/tests: Fix shellcheck warning in record_sideband.sh test
Running shellcheck on record_sideband.sh throws below warning:

	In tests/shell/record_sideband.sh line 25:
	  if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 >/dev/null
	    ^--^ SC2069: To redirect stdout+stderr, 2>&1 must be last (or use '{ cmd > file; } 2>&1' to clarify).

This shows shellcheck warning SC2069 where the redirection
order needs to be fixed. Use { cmd > file; } 2>&1 to fix the
redirection of perf record output.

Fixes: 23b97c7ee963 ("perf test: Add test case for record sideband events")
Signed-off-by: Athira Rajeev
---
 tools/perf/tests/shell/record_sideband.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/record_sideband.sh b/tools/perf/tests/shell/record_sideband.sh
index 5024a7ce0c51..7e036763a43c 100755
--- a/tools/perf/tests/shell/record_sideband.sh
+++ b/tools/perf/tests/shell/record_sideband.sh
@@ -22,7 +22,7 @@ trap trap_cleanup EXIT TERM INT
 
 can_cpu_wide()
 {
-    if ! perf record -o ${perfdata} -BN --no-bpf-event -C $1 true 2>&1 >/dev/null
+    if ! { perf record -o ${perfdata} -BN --no-bpf-event -C $1 true > /dev/null; } 2>&1
     then
         echo "record sideband test [Skipped cannot record cpu$1]"
         err=2
-- 
2.31.1
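For readers following SC2069, here is a small standalone sketch (not part of the patch) of how redirection order behaves. The emit() helper is hypothetical and only stands in for a command that writes to both streams. Under this sketch, the original form and the '{ cmd > file; } 2>&1' form capture the same thing (stderr only), so the grouping mainly makes the intent explicit; discarding both streams would instead need 2>&1 placed last.

```shell
#!/bin/sh
# Hypothetical helper (not from record_sideband.sh): one line per stream.
emit() {
	echo "to-stdout"
	echo "to-stderr" >&2
}

# Original order: stderr is first pointed at the current stdout, then
# stdout alone goes to /dev/null, so the stderr line is still captured.
old_form=$(emit 2>&1 >/dev/null)

# Form used in the patch: stdout is discarded inside the group, and the
# group's stderr is then merged into the outer stdout.
new_form=$({ emit >/dev/null; } 2>&1)

# Discarding both streams requires 2>&1 to come last.
both_gone=$(emit >/dev/null 2>&1)

echo "old:  $old_form"
echo "new:  $new_form"
echo "both: $both_gone"
```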
[PATCH 2/3] tools/perf/tests Ignore the shellcheck SC2046 warning in lock_contentio
Running shellcheck on lock_contention.sh generates below
warning:

	In tests/shell/lock_contention.sh line 36:
	   if [ `nproc` -lt 4 ]; then
	        ^-^ SC2046: Quote this to prevent word splitting.

Here since nproc will generate a single word output
and there is no possibility of word splitting, this warning
can be ignored. Use exception for this with "disable"
option in shellcheck. This warning is observed after
commit:
"commit 29441ab3a30a ("perf test lock_contention.sh: Skip test
if not enough CPUs")"

Fixes: 29441ab3a30a ("perf test lock_contention.sh: Skip test if not enough CPUs")
Signed-off-by: Athira Rajeev
---
 tools/perf/tests/shell/lock_contention.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/tests/shell/lock_contention.sh b/tools/perf/tests/shell/lock_contention.sh
index d5a191d3d090..c1ec5762215b 100755
--- a/tools/perf/tests/shell/lock_contention.sh
+++ b/tools/perf/tests/shell/lock_contention.sh
@@ -33,6 +33,7 @@ check() {
 		exit
 	fi
 
+	# shellcheck disable=SC2046
 	if [ `nproc` -lt 4 ]; then
 		echo "[Skip] Low number of CPUs (`nproc`), lock event cannot be triggered certainly"
 		err=2
-- 
2.31.1
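As background on why SC2046 is harmless here, the following sketch (not from the patch) counts how many arguments an unquoted command substitution produces. one_word() and two_words() are hypothetical stand-ins for a command such as nproc; the point is only that single-word output is unaffected by word splitting, while multi-word output is split unless quoted.

```shell
#!/bin/sh
# Hypothetical stand-ins for a command like nproc.
one_word()  { echo 8; }
two_words() { echo "8 cores"; }

# Prints the number of arguments it received.
count_args() { echo $#; }

# Unquoted, single-word output: still exactly one argument.
unquoted_single=$(count_args $(one_word))

# Unquoted, multi-word output: split into two arguments.
unquoted_multi=$(count_args $(two_words))

# Quoted, multi-word output: kept as one argument.
quoted_multi=$(count_args "$(two_words)")

echo "$unquoted_single $unquoted_multi $quoted_multi"
```

Quoting the substitution would also silence the warning; the patch chose the "disable" directive instead.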
[PATCH 1/3] perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
Running shellcheck on tests/shell/test_arm_coresight.sh
throws below warnings:

	In tests/shell/test_arm_coresight.sh line 15:
	cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* -print -quit)
	                  ^--^ SC2061: Quote the parameter to -name so the shell won't interpret it.

	In tests/shell/test_arm_coresight.sh line 20:
	if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
	                     ^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined

This warning is observed after commit:
"commit bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")"

Fixed this issue by quoting 'cpu*' for SC2061 and
using "&&" in line number 20 for SC2166 warning.

Fixes: bb350847965d ("perf test: Update cs_etm testcase for Arm ETE")
Signed-off-by: Athira Rajeev
---
 tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/shell/test_arm_coresight.sh b/tools/perf/tests/shell/test_arm_coresight.sh
index fe78c4626e45..f2115dfa24a5 100755
--- a/tools/perf/tests/shell/test_arm_coresight.sh
+++ b/tools/perf/tests/shell/test_arm_coresight.sh
@@ -12,12 +12,12 @@
 glb_err=0
 
 cs_etm_dev_name() {
-	cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name cpu* -print -quit)
+	cs_etm_path=$(find /sys/bus/event_source/devices/cs_etm/ -name 'cpu*' -print -quit)
 	trcdevarch=$(cat ${cs_etm_path}/mgmt/trcdevarch)
 	archhver=$((($trcdevarch >> 12) & 0xf))
 	archpart=$(($trcdevarch & 0xfff))
 
-	if [ $archhver -eq 5 -a "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
+	if [ $archhver -eq 5 ] && [ "$(printf "0x%X\n" $archpart)" = "0xA13" ] ; then
 		echo "ete"
 	else
 		echo "etm"
-- 
2.31.1
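To illustrate the two warnings outside of sysfs, here is a standalone sketch (not part of the patch). The temporary directory layout, the file names and the dev_kind() helper are hypothetical; the trcdevarch values passed to it are made up purely to exercise both branches. The first half shows how an unquoted cpu* can be expanded by the shell against the current directory before find sees it; the second half mirrors the bit extraction and the "[ p ] && [ q ]" form used in the fix.

```shell
#!/bin/sh
# Hypothetical layout: one match under devices/, one decoy in the cwd.
work=$(mktemp -d)
mkdir "$work/devices"
touch "$work/devices/cpu0" "$work/cpu_local"

# Unquoted: the shell expands cpu* to "cpu_local" (the only match in the
# current directory), so find looks for that exact name and finds nothing.
unquoted=$(cd "$work" && find devices -name cpu* -print -quit)

# Quoted: find receives the pattern itself and matches devices/cpu0.
quoted=$(cd "$work" && find devices -name 'cpu*' -print -quit)

# Hypothetical helper mirroring the check in cs_etm_dev_name():
# bits 15:12 must equal 5 and bits 11:0 must equal 0xA13 to report "ete".
dev_kind() {
	archhver=$(( ($1 >> 12) & 0xf ))
	archpart=$(( $1 & 0xfff ))
	if [ "$archhver" -eq 5 ] && [ "$(printf "0x%X" "$archpart")" = "0xA13" ]; then
		echo "ete"
	else
		echo "etm"
	fi
}

kind_a=$(dev_kind 0x47705a13)   # made-up value with 5 in bits 15:12
kind_b=$(dev_kind 0x47704a13)   # made-up value with 4 in bits 15:12

echo "unquoted='$unquoted' quoted='$quoted' $kind_a $kind_b"
rm -rf "$work"
```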
[PATCH 0/3] Fix for shellcheck issues with latest scripts in tests/shell
shellcheck was run on perf tool shell scripts as a pre-requisite
to include a build option for shellcheck discussed here:
https://www.spinics.net/lists/linux-perf-users/msg25553.html

And fixes were added for the coding/formatting issues in
two patchsets:
https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/

Three additional issues were observed and fixes are part of:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next

With recent commits in perf, three more issues are observed.
shellcheck version: 0.6.0

With this patchset:

	for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
	echo $?
	0

The changes are with recent commits (which are mentioned in each patch)
for lock_contention, record_sideband and test_arm_coresight
testcases.

The changes are made on top of:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next

Athira Rajeev (3):
  perf tests test_arm_coresight: Fix the shellcheck warning in latest test_arm_coresight.sh
  tools/perf/tests Ignore the shellcheck SC2046 warning in lock_contentio
  tools/perf/tests: Fix shellcheck warning in record_sideband.sh test

 tools/perf/tests/shell/lock_contention.sh    | 1 +
 tools/perf/tests/shell/record_sideband.sh    | 2 +-
 tools/perf/tests/shell/test_arm_coresight.sh | 4 ++--
 3 files changed, 4 insertions(+), 3 deletions(-)

-- 
2.31.1
[PATCH V5 3/3] tools/perf/tests: Fix object code reading to skip address that falls out of text section
The testcase "Object code reading" fails in some cases
for "fs_something" sub test as below:

    Reading object code for memory address: 0xc00807f0142c
    File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    On file address is: 0x1114cc
    Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    objdump read too few bytes: 128
    test child finished with -1

This can also be reproduced when running perf record with
workload that exercises fs_something() code. In the test
setup, this is exercising xfs code since root is xfs.

    # perf record ./a.out
    # perf report -v |grep "xfs.ko"
      0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  0xc00807de5efc B [k] xlog_cil_commit
      0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  0xc00807d5ae18 B [k] xfs_btree_key_offset
      0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  0xc00807e11fd4 B [k] 0x00112074

Here addr "0xc00807e11fd4" is not resolved. Since this is a
kernel module, its offset is from the DSO. Xfs module is loaded
at 0xc00807d0

   # cat /proc/modules | grep xfs
    xfs 2228224 3 - Live 0xc00807d0

And size is 0x22. So it is loaded between 0xc00807d0
and 0xc00807f2. From objdump, text section is:
    text 0010f7bc 00a0 2**4

Hence perf captured ip maps to 0x112074 which is:
( ip - start of module ) + a0

This offset 0x112074 falls outside the .text section, which is up to 0x10f7bc.
In this case for module, the address 0xc00807e11fd4 is pointing
to stub instructions. This address range represents the module stubs
which are allocated on module load and hence are not part of DSO offset.

To address this issue in "object code reading", skip the sample if
address falls outside the text section and is within the module end.
Use the "text_end" member of "struct dso" to do this check.

To address this issue in "perf report", exploring an option of
having stubs range as part of the /proc/kallsyms, so that perf
report can resolve addresses in stubs range.

However this patch uses text_end to skip the stub range for
Object code reading testcase.

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
Tested-by: Disha Goel
Reviewed-by: Adrian Hunter
Reviewed-by: Kajol Jain
---
Changelog:
v4 -> v5:
Used dso->is_kmod to check if the dso is a kernel module

v3 -> v4:
Fixed indent in V3

v2 -> v3:
Used strtailcmp in comparison for module check and added Reviewed-by
from Adrian, Tested-by from Disha.

v1 -> v2:
Updated comment to add description on which arch has stub and
reason for skipping as suggested by Adrian

 tools/perf/tests/code-reading.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index ed3815163d1b..3af81012014e 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
 	if (addr + len > map__end(al.map))
 		len = map__end(al.map) - addr;
 
+	/*
+	 * Some architectures (ex: powerpc) have stubs (trampolines) in kernel
+	 * modules to manage long jumps. Check if the ip offset falls in stubs
+	 * sections for kernel modules. And skip module address after text end
+	 */
+	if (dso->is_kmod && al.addr > dso->text_end) {
+		pr_debug("skipping the module address %#"PRIx64" after text end\n", al.addr);
+		goto out;
+	}
+
 	/* Read the object code using perf */
 	ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)),
 					al.addr, buf1, len);
-- 
2.31.1
[PATCH V5 2/3] tools/perf: Add "is_kmod" to struct dso to check if it is kernel module
Update "struct dso" to include new member "is_kmod".
This new field will determine if the file is a kernel
module or not.

To resolve the address from a sample, perf looks at the
DSO maps. In case of address from a kernel module, some
addresses were found to be unresolved. This was
observed while running perf test for "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start) and end address (perf map->end),
it was unresolved. This was happening because in some cases
for kernel modules, address from sample points to stub
instructions. To identify if the DSO is a kernel module,
the new field "is_kmod" is added to "struct dso".

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
---
Changelog:
v5:
This patch adds is_kmod field to dso to detect if the
dso is a kernel module

 tools/perf/util/dso.c | 2 ++
 tools/perf/util/dso.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index bdfead36b83a..1f629b6fb7cf 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -477,6 +477,7 @@ void dso__set_module_info(struct dso *dso, struct kmod_path *m,
 		dso->comp = m->comp;
 	}
 
+	dso->is_kmod = 1;
 	dso__set_short_name(dso, strdup(m->name), true);
 }
 
@@ -1338,6 +1339,7 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		dso->has_srcline = 1;
 		dso->a2l_fails = 1;
 		dso->kernel = DSO_SPACE__USER;
+		dso->is_kmod = 0;
 		dso->needs_swap = DSO_SWAP__UNSET;
 		dso->comp = COMP_ID__NONE;
 		RB_CLEAR_NODE(&dso->rb_node);
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 70fe0fe69bef..3759de8c2267 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -162,6 +162,7 @@ struct dso {
 	char		 *symsrc_filename;
 	unsigned int	 a2l_fails;
 	enum dso_space_type	kernel;
+	bool			is_kmod;
 	enum dso_swap_type	needs_swap;
 	enum dso_binary_type	symtab_type;
 	enum dso_binary_type	binary_type;
-- 
2.31.1
[PATCH V5 1/3] tools/perf: Add text_end to "struct dso" to save .text section size
Update "struct dso" to include new member "text_end".
This new field will represent the offset for end of text
section for a dso. For elf, this value is derived as:
sh_size (Size of section in bytes) + sh_offset (Section file
offset) of the elf header for text.

For bfd, this value is derived as:
1. For PE file,
section->size + ( section->vma - dso->text_offset)
2. Other cases:
section->filepos (file position) + section->size (size of
section)

To resolve the address from a sample, perf looks at the
DSO maps. In case of address from a kernel module, some
addresses were found to be unresolved. This was
observed while running perf test for "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start) and end address (perf map->end),
it was unresolved.

Example:

    Reading object code for memory address: 0xc00807f0142c
    File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    On file address is: 0x1114cc
    Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    objdump read too few bytes: 128
    test child finished with -1

Here, module is loaded at:
    # cat /proc/modules | grep xfs
    xfs 2228224 3 - Live 0xc00807d0

From objdump for xfs module, text section is:
    text 0010f7bc 00a0 2**4

Here the offset for 0xc00807f0142c ie 0x112074 falls outside the
.text section, which is up to 0x10f7bc.

In this case for module, the address 0xc00807e11fd4 is pointing
to stub instructions. This address range represents the module stubs
which are allocated on module load and hence are not part of DSO offset.

To identify such addresses, which fall outside the text
section and within module end, added the new field "text_end" to
"struct dso".

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
Reviewed-by: Adrian Hunter
Reviewed-by: Kajol Jain
---
Changelog:
v2 -> v3:
Added Reviewed-by from Adrian

v1 -> v2:
Added text_end for bfd also by updating dso__load_bfd_symbols
as suggested by Adrian.

 tools/perf/util/dso.h        | 1 +
 tools/perf/util/symbol-elf.c | 4 +++-
 tools/perf/util/symbol.c     | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index b41c9782c754..70fe0fe69bef 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -181,6 +181,7 @@ struct dso {
 	u8		 rel;
 	struct build_id	 bid;
 	u64		 text_offset;
+	u64		 text_end;
 	const char	 *short_name;
 	const char	 *long_name;
 	u16		 long_name_len;
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 95e99c332d7e..9e7eeaf616b8 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 	}
 
 	if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
-				".text", NULL))
+				".text", NULL)) {
 		dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
+		dso->text_end = tshdr.sh_offset + tshdr.sh_size;
+	}
 
 	if (runtime_ss->opdsec)
 		opddata = elf_rawdata(runtime_ss->opdsec, NULL);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3f36675b7c8f..f25e4e62cf25 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char *debugfile)
 			/* PE symbols can only have 4 bytes, so use .text high bits */
 			dso->text_offset = section->vma - (u32)section->vma;
 			dso->text_offset += (u32)bfd_asymbol_value(symbols[i]);
+			dso->text_end = (section->vma - dso->text_offset) + section->size;
 		} else {
 			dso->text_offset = section->vma - section->filepos;
+			dso->text_end = section->filepos + section->size;
 		}
 	}
 
-- 
2.31.1
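As a quick arithmetic check of the numbers in this series, the sketch below (not part of any patch) recomputes text_end the way the ELF path derives it, sh_offset + sh_size, and compares the unresolved offset against it. It assumes that the "00a0" in the objdump section line is the .text file offset and "0010f7bc" its size, as the commit message implies; past_text_end() is a hypothetical helper and 0x1000 is a made-up in-text offset used only for contrast.

```shell
#!/bin/sh
# Values quoted in the commit messages (interpretation is an assumption).
sh_offset=0xa0        # assumed: file offset of .text
sh_size=0x10f7bc      # assumed: size of .text
sample_off=0x112074   # unresolved offset shown by perf report

# ELF derivation from the patch: text_end = sh_offset + sh_size
text_end=$(( $sh_offset + $sh_size ))

# Hypothetical helper: succeeds when the offset lies past the end of
# .text, which is the condition under which the test skips the sample.
past_text_end() {
	[ "$(( $1 ))" -gt "$text_end" ]
}

printf 'text_end = 0x%x\n' "$text_end"

if past_text_end "$sample_off"; then
	echo "0x112074: past text end, skip"
fi
if ! past_text_end 0x1000; then
	echo "0x1000: inside text, read object code"
fi
```

Under these assumptions text_end works out to 0x10f85c, so 0x112074 lies beyond it, which matches the description of the address pointing into the module stubs.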
[PATCH V3] perf test: Fix parse-events tests to skip parametrized events
Testcase "Parsing of all PMU events from sysfs" parses events for
all PMUs, and not just cpu. In case of powerpc, the PowerVM
environment supports events from hv_24x7 and hv_gpci PMUs,
which have a format like below:

- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/

The value for "?" needs to be filled in depending on
system configuration. It is better to skip these parametrized
events in this test as it is done in:
'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
parametrized events")' which handled a similar instance with
"all PMU test".

Fix parse-events test to skip parametrized events since
it needs proper setup of the parameters.

Signed-off-by: Athira Rajeev
Tested-by: Ian Rogers
Tested-by: Sachin Sant
Reviewed-by: Kajol Jain
---
Changelog:
v2 -> v3:
Addressed review comments from Namhyung by closing the file
if getline fails.

v1 -> v2:
Addressed review comments from Ian. Updated size of
pmu event name variable and changed bool name which is
used to skip the test.

 tools/perf/tests/parse-events.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index d47f1f871164..2b66ffba3bb0 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -2514,9 +2514,14 @@ static int test__pmu_events(struct test_suite *test __maybe_unused, int subtest
 	while ((pmu = perf_pmus__scan(pmu)) != NULL) {
 		struct stat st;
 		char path[PATH_MAX];
+		char pmu_event[PATH_MAX];
+		char *buf = NULL;
+		FILE *file;
 		struct dirent *ent;
+		size_t len = 0;
 		DIR *dir;
 		int err;
+		int n;
 
 		snprintf(path, PATH_MAX, "%s/bus/event_source/devices/%s/events/",
 			sysfs__mountpoint(), pmu->name);
@@ -2538,11 +2543,45 @@ static int test__pmu_events(struct test_suite *test __maybe_unused, int subtest
 			struct evlist_test e = { .name = NULL, };
 			char name[2 * NAME_MAX + 1 + 12 + 3];
 			int test_ret;
+			bool is_event_parameterized = 0;
 
 			/* Names containing . are special and cannot be used directly */
 			if (strchr(ent->d_name, '.'))
 				continue;
 
+			/* exclude parametrized ones (name contains '?') */
+			n = snprintf(pmu_event, sizeof(pmu_event), "%s%s", path, ent->d_name);
+			if (n >= PATH_MAX) {
+				pr_err("pmu event name crossed PATH_MAX(%d) size\n", PATH_MAX);
+				continue;
+			}
+
+			file = fopen(pmu_event, "r");
+			if (!file) {
+				pr_debug("can't open pmu event file for '%s'\n", ent->d_name);
+				ret = combine_test_results(ret, TEST_FAIL);
+				continue;
+			}
+
+			if (getline(&buf, &len, file) < 0) {
+				pr_debug(" pmu event: %s is a null event\n", ent->d_name);
+				ret = combine_test_results(ret, TEST_FAIL);
+				fclose(file);
+				continue;
+			}
+
+			if (strchr(buf, '?'))
+				is_event_parameterized = 1;
+
+			free(buf);
+			buf = NULL;
+			fclose(file);
+
+			if (is_event_parameterized == 1) {
+				pr_debug("skipping parametrized PMU event: %s which contains ?\n", pmu_event);
+				continue;
+			}
+
 			snprintf(name, sizeof(name), "%s/event=%s/u", pmu->name, ent->d_name);
 
 			e.name = name;
-- 
2.31.1
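The filtering that this patch adds can be pictured with a small shell sketch over a mock events directory (not the C code from the patch, and not real sysfs content). The file names and their contents below are invented for illustration: entries whose name contains '.' are skipped, as the existing test already does, and entries whose definition contains '?' are treated as parametrized and skipped as well.

```shell
#!/bin/sh
# Mock events directory; names and contents are hypothetical.
events=$(mktemp -d)
printf 'event=0x01\n'                 > "$events/plain_event"
printf 'event=0x02,partition_id=?\n'  > "$events/param_event"
printf '0.5\n'                        > "$events/plain_event.scale"

kept=""
skipped=""
for f in "$events"/*; do
	name=${f##*/}

	# Names containing '.' are special and not parsed as events.
	case $name in
	*.*) continue ;;
	esac

	# A '?' in the event definition marks a parameter the user must fill in.
	if grep -qF '?' "$f"; then
		skipped="${skipped:+$skipped }$name"
		continue
	fi

	kept="${kept:+$kept }$name"
done

echo "kept: $kept"
echo "skipped (parametrized): $skipped"
rm -rf "$events"
```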
Re: [PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
> On 27-Sep-2023, at 5:45 AM, Namhyung Kim wrote: > > On Thu, Sep 14, 2023 at 10:40 PM Athira Rajeev > wrote: >> >> The testcase "Object code reading" fails in somecases >> for "fs_something" sub test as below: >> >>Reading object code for memory address: 0xc00807f0142c >>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>On file address is: 0x1114cc >>Objdump command is: objdump -z -d --start-address=0x11142c >> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>objdump read too few bytes: 128 >>test child finished with -1 >> >> This can alo be reproduced when running perf record with >> workload that exercises fs_something() code. In the test >> setup, this is exercising xfs code since root is xfs. >> >># perf record ./a.out >># perf report -v |grep "xfs.ko" >> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807de5efc B [k] xlog_cil_commit >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807d5ae18 B [k] xfs_btree_key_offset >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807e11fd4 B [k] 0x00112074 >> >> Here addr "0xc00807e11fd4" is not resolved. since this is a >> kernel module, its offset is from the DSO. Xfs module is loaded >> at 0xc00807d0 >> >> # cat /proc/modules | grep xfs >>xfs 2228224 3 - Live 0xc00807d0 >> >> And size is 0x22. So its loaded between 0xc00807d0 >> and 0xc00807f2. From objdump, text section is: >>text 0010f7bc 00a0 2**4 >> >> Hence perf captured ip maps to 0x112074 which is: >> ( ip - start of module ) + a0 >> >> This offset 0x112074 falls out .text section which is up to 0x10f7bc >> In this case for module, the address 0xc00807e11fd4 is pointing >> to stub instructions. This address range represents the module stubs >> which is allocated on module load and hence is not part of DSO offset. >> >> To address this issue in "object code reading", skip the sample if >> address falls out of text section and is within the module end. 
>> Use the "text_end" member of "struct dso" to do this check. >> >> To address this issue in "perf report", exploring an option of >> having stubs range as part of the /proc/kallsyms, so that perf >> report can resolve addresses in stubs range >> >> However this patch uses text_end to skip the stub range for >> Object code reading testcase. >> >> Reported-by: Disha Goel >> Signed-off-by: Athira Rajeev >> Tested-by: Disha Goel >> Reviewed-by: Adrian Hunter >> --- >> Changelog: >> v3 -> v4: >> Fixed indent in V3 >> >> v2 -> v3: >> Used strtailcmp in comparison for module check and added Reviewed-by >> from Adrian, Tested-by from Disha. >> >> v1 -> v2: >> Updated comment to add description on which arch has stub and >> reason for skipping as suggested by Adrian >> >> tools/perf/tests/code-reading.c | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/tools/perf/tests/code-reading.c >> b/tools/perf/tests/code-reading.c >> index ed3815163d1b..9e6e6c985840 100644 >> --- a/tools/perf/tests/code-reading.c >> +++ b/tools/perf/tests/code-reading.c >> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 >> cpumode, >>if (addr + len > map__end(al.map)) >>len = map__end(al.map) - addr; >> >> + /* >> +* Some architectures (ex: powerpc) have stubs (trampolines) in >> kernel >> +* modules to manage long jumps. Check if the ip offset falls in >> stubs >> +* sections for kernel modules. And skip module address after text >> end >> +*/ >> + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) { > > There's a is_kernel_module() that can check compressed modules > too but I think we need a simpler way to check it like dso->kernel. > > Thanks, > Namhyung Thanks for the comment Namhyung. 
I will add similar to dso->kernel, another field check in next version of patchset Athira > > >> + pr_debug("skipping the module address %#"PRIx64" after text >> end\n", al.addr); >> + goto out; >> + } >> + >>/* Read the object code using perf */ >>ret_len = dso__data_read_offset(dso, >> maps__machine(thread__maps(thread)), >>al.addr, buf1, len); >> -- >> 2.31.1
Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events
> On 27-Sep-2023, at 4:07 AM, Namhyung Kim wrote: > > Hello, > > On Mon, Sep 25, 2023 at 10:37 AM Arnaldo Carvalho de Melo > wrote: >> >> >> >> On Wed, Sep 13, 2023, 7:40 AM Athira Rajeev >> wrote: >>> >>> >>> >>>> On 08-Sep-2023, at 7:48 PM, Athira Rajeev >>>> wrote: >>>> >>>> >>>> >>>>> On 08-Sep-2023, at 11:04 AM, Sachin Sant wrote: >>>>> >>>>> >>>>> >>>>>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev >>>>>> wrote: >>>>>> >>>>>> Testcase "Parsing of all PMU events from sysfs" parse events for >>>>>> all PMUs, and not just cpu. In case of powerpc, the PowerVM >>>>>> environment supports events from hv_24x7 and hv_gpci PMU which >>>>>> is of example format like below: >>>>>> >>>>>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/ >>>>>> - hv_gpci/event,partition_id=?/ >>>>>> >>>>>> The value for "?" needs to be filled in depending on system >>>>>> configuration. It is better to skip these parametrized events >>>>>> in this test as it is done in: >>>>>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip >>>>>> parametrized events")' which handled a simialr instance with >>>>>> "all PMU test". >>>>>> >>>>>> Fix parse-events test to skip parametrized events since >>>>>> it needs proper setup of the parameters. >>>>>> >>>>>> Signed-off-by: Athira Rajeev >>>>>> --- >>>>>> Changelog: >>>>>> v1 -> v2: >>>>>> Addressed review comments from Ian. Updated size of >>>>>> pmu event name variable and changed bool name which is >>>>>> used to skip the test. >>>>>> >>>>> >>>>> The patch fixes the reported issue. >>>>> >>>>> 6.2: Parsing of all PMU events from sysfs : Ok >>>>> 6.3: Parsing of given PMU events from sysfs: Ok >>>>> >>>>> Tested-by: Sachin Sant >>>>> >>>>> - Sachin >>>> >>>> Hi Sachin, Ian >>>> >>>> Thanks for testing the patch >>> >>> Hi Arnaldo >>> >>> Can you please check and pull this if it looks good to go . >> >> >> Namhyung, can you please take a look? > > Yep sure. I think it needs to close the file when getline() fails. 
>
> Athira, can you please send v3 with that?

Sure, I will post V3 with this change.

Athira

>
> Thanks,
> Namhyung
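The patch body under review is not shown in this thread, so the following is only a minimal sketch of the cleanup pattern the review asks for: when getline() fails, the function should still close the file (and free the line buffer) before returning. The helper name first_line_has_placeholder() and its return convention are assumptions made for illustration; they are not taken from the perf sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the function from the patch: report whether the
 * first line of a file contains a '?' placeholder, as parametrized events do.
 * Returns 1 or 0, or -1 when the file cannot be opened or read.
 */
static int first_line_has_placeholder(const char *path)
{
	FILE *file = fopen(path, "r");
	char *line = NULL;
	size_t len = 0;
	int ret = -1;

	if (!file)
		return -1;

	/* On failure, jump to the shared exit path instead of returning here. */
	if (getline(&line, &len, file) < 0)
		goto out;

	ret = strchr(line, '?') != NULL;
out:
	free(line);	/* getline() may have allocated even when it failed */
	fclose(file);
	return ret;
}
```

A single exit label keeps the fclose() call in one place, so later early-exit paths cannot leak the FILE handle.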
Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"
> On 25-Sep-2023, at 1:34 PM, kajoljain wrote:
>
> On 9/7/23 22:45, Athira Rajeev wrote:
>> From: root
>>
>> shellcheck was run on perf tool shell scripts as a pre-requisite
>> to include a build option for shellcheck discussed here:
>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>
>> And fixes were added for the coding/formatting issues in
>> two patchsets:
>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>>
>> Three additional issues are observed with shellcheck "0.6" and
>> this patchset covers those. With this patchset,
>>
>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
>> # echo $?
>> 0
>>
>
> Patchset looks good to me.
>
> Reviewed-by: Kajol Jain
>
> Thanks,
> Kajol Jain
>

Hi Namhyung,

Can you please check this patchset as well?

Thanks
Athira

>> Athira Rajeev (3):
>>   tests/shell: Fix shellcheck SC1090 to handle the location of sourced files
>>   tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh tetscase
>>   tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
>>
>>  tools/perf/tests/shell/coresight/asm_pure_loop.sh | 4
>>  tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4
>>  tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4
>>  tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh | 4
>>  tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh | 4
>>  tools/perf/tests/shell/probe_vfs_getname.sh | 2 ++
>>  tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 2 ++
>>  tools/perf/tests/shell/record+script_probe_vfs_getname.sh | 2 ++
>>  tools/perf/tests/shell/record.sh | 1 +
>>  tools/perf/tests/shell/stat+csv_output.sh | 1 +
>>  tools/perf/tests/shell/stat+csv_summary.sh | 4 ++--
>>  tools/perf/tests/shell/stat+shadow_stat.sh | 4 ++--
>>  tools/perf/tests/shell/stat+std_output.sh | 1 +
>>  tools/perf/tests/shell/test_intel_pt.sh | 1 +
>>  tools/perf/tests/shell/trace+probe_vfs_getname.sh | 1 +
>>  15 files changed, 35 insertions(+), 4 deletions(-)
Re: [PATCH 2/2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
> On 27-Sep-2023, at 5:25 AM, Namhyung Kim wrote: > > On Thu, Sep 14, 2023 at 10:18 AM Athira Rajeev > wrote: >> >> Add rule in new Makefile "tests/Makefile.tests" for running >> shellcheck on shell test scripts. This automates below shellcheck >> into the build. >> >>$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do >> shellcheck -S warning $F; done > > I think you can do it if $(shell command -v shellcheck) returns > non-empty string (the path to the shellcheck). Then the feature > test logic can be gone. Ok, I will try this. > >> >> CONFIG_SHELLCHECK check is added to avoid build breakage in >> the absence of shellcheck binary. Update Makefile.perf to contain >> new rule for "SHELLCHECK_TEST" which is for making shellcheck >> test as a dependency on perf binary. Added "tests/Makefile.tests" >> to run shellcheck on shellscripts in tests/shell. The make rule >> "SHLLCHECK_RUN" ensures that, every time during make, shellcheck >> will be run only on modified files during subsequent invocations. >> By this, if any newly added shell scripts or fixes in existing >> scripts breaks coding/formatting style, it will get captured >> during the perf build. > > Can you show me the example output? Sure, I will add it. 
> >> >> Signed-off-by: Athira Rajeev >> --- >> tools/perf/Makefile.perf| 12 +++- >> tools/perf/tests/Makefile.tests | 24 >> 2 files changed, 35 insertions(+), 1 deletion(-) >> create mode 100644 tools/perf/tests/Makefile.tests >> >> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf >> index f6fdc2d5a92f..c27f54771e90 100644 >> --- a/tools/perf/Makefile.perf >> +++ b/tools/perf/Makefile.perf >> @@ -667,7 +667,16 @@ $(PERF_IN): prepare FORCE >> $(PMU_EVENTS_IN): FORCE prepare >>$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events >> obj=pmu-events >> >> -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) >> +# Runs shellcheck on perf test shell scripts >> +ifeq ($(CONFIG_SHELLCHECK),y) >> +SHELLCHECK_TEST: FORCE prepare >> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests >> +else >> +SHELLCHECK_TEST: >> + @: >> +endif >> + >> +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST >>$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \ >>$(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@ >> >> @@ -1129,6 +1138,7 @@ bpf-skel-clean: >>$(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) >> >> clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean >> $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean python-clean bpf-skel-clean >> tests-coresight-targets-clean >> + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean >>$(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) >> $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS) >> $(Q)find $(or $(OUTPUT),.) 
-name '*.o' -delete -o -name '\.*.cmd' >> -delete -o -name '\.*.d' -delete >>$(Q)$(RM) $(OUTPUT).config-detected >> diff --git a/tools/perf/tests/Makefile.tests >> b/tools/perf/tests/Makefile.tests >> new file mode 100644 >> index ..e74575559e83 >> --- /dev/null >> +++ b/tools/perf/tests/Makefile.tests >> @@ -0,0 +1,24 @@ >> +# SPDX-License-Identifier: GPL-2.0 >> +# Athira Rajeev , 2023 >> +-include $(OUTPUT).config-detected >> + >> +log_file = $(OUTPUT)shellcheck_test.log >> +PROGS = $(subst ./,,$(shell find tests/shell -perm -o=x -type f -name >> '*.sh')) >> +DEPS = $(addprefix output/,$(addsuffix .dep,$(basename $(PROGS >> +DIRS = $(shell echo $(dir $(DEPS)) | xargs -n1 | sort -u | xargs) >> + >> +.PHONY: all >> +all: SHELLCHECK_RUN >> + @: >> + >> +SHELLCHECK_RUN: $(DEPS) $(DIRS) >> + >> +output/%.dep: %.sh | $(DIRS) >> + $(call rule_mkdir) >> + $(Q)$(call frecho-cmd,test)@touch $@ >> + $(Q)$(call frecho-cmd,test)@shellcheck -S warning $(subst >> output/,./,$(patsubst %.dep, %.sh, $@)) 1> ${log_file} && ([[ ! -s >> ${log_file} ]]) > > This line is too long, please wrap it with some backslashes. Ok I will address all the comments in next version Thanks Athira > > Thanks, > Namhyung > > >> +$(DIRS): >> + @mkdir -p $@ >> + >> +clean: >> + @rm -rf $(log_file) output >> -- >> 2.31.1
Re: [PATCH 1/2] tools/perf: Add new CONFIG_SHELLCHECK for detecting shellcheck binary
> On 27-Sep-2023, at 5:21 AM, Namhyung Kim wrote: > > Hello, > > On Thu, Sep 14, 2023 at 10:18 AM Athira Rajeev > wrote: >> >> shellcheck tool can detect coding/formatting issues on >> shell scripts. In perf directory "tests/shell", there are lot >> of shell test scripts and this tool can detect coding/formatting >> issues on these scripts. >> >> Example to use shellcheck for severity level for >> errors and warnings, below command is used: >> >> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S >> warning $F; done >> # echo $? >> 0 >> >> This testing needs to be automated into the build so that it >> can avoid regressions and also run the check for newly added >> during build test itself. Add a new feature check to detect >> presence of shellcheck. Add CONFIG_SHELLCHECK feature check in >> the build to avoid not having shellcheck breaking the build. >> >> Signed-off-by: Athira Rajeev >> --- >> tools/build/Makefile.feature | 6 -- >> tools/build/feature/Makefile | 8 +++- >> tools/perf/Makefile.config | 10 ++ >> 3 files changed, 21 insertions(+), 3 deletions(-) >> >> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature >> index 934e2777a2db..23f56b95babf 100644 >> --- a/tools/build/Makefile.feature >> +++ b/tools/build/Makefile.feature >> @@ -72,7 +72,8 @@ FEATURE_TESTS_BASIC := \ >> libzstd\ >> disassembler-four-args \ >> disassembler-init-styled \ >> -file-handle >> +file-handle\ >> +shellcheck >> >> # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list >> # of all feature tests >> @@ -138,7 +139,8 @@ FEATURE_DISPLAY ?= \ >> get_cpuid \ >> bpf \ >> libaio\ >> - libzstd >> + libzstd \ >> + shellcheck >> >> # >> # Declare group members of a feature to display the logical OR of the >> detection >> diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile >> index 3184f387990a..44ba6d0c98d0 100644 >> --- a/tools/build/feature/Makefile >> +++ b/tools/build/feature/Makefile >> @@ -76,7 +76,8 @@ FILES= 
\ >> test-libzstd.bin \ >> test-clang-bpf-co-re.bin \ >> test-file-handle.bin \ >> - test-libpfm4.bin >> + test-libpfm4.bin \ >> + test-shellcheck.bin >> >> FILES := $(addprefix $(OUTPUT),$(FILES)) >> >> @@ -92,6 +93,8 @@ __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ >> $(patsubst %.bin,%.c,$(@F)) $( >> __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst >> %.bin,%.cpp,$(@F)) $(LDFLAGS) >> BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1 >> >> + BUILD_BINARY = sh -c $1 > $(@:.bin=.make.output) 2>&1 >> + >> ### >> >> $(OUTPUT)test-all.bin: >> @@ -207,6 +210,9 @@ $(OUTPUT)test-libslang-include-subdir.bin: >> $(OUTPUT)test-libtraceevent.bin: >>$(BUILD) -ltraceevent >> >> +$(OUTPUT)test-shellcheck.bin: >> + $(BUILD_BINARY) "shellcheck --version" > > I don't think it'd generate the .bin file. > > Anyway, it's a binary file already. Can we check it with > `command -v` and get rid of the feature test? > > Thanks, > Namhyung Hi Namhyung, Thanks for the review. Sure, I will check on this Athira > > >> + >> $(OUTPUT)test-libtracefs.bin: >> $(BUILD) $(shell $(PKG_CONFIG) --cflags libtraceevent 2>/dev/null) >> -ltracefs >> >> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config >> index d66b52407e19..e71fe95ad865 100644 >> --- a/tools/perf/Makefile.config >> +++ b/tools/perf/Makefile.config >> @@ -779,6 +779,16 @@ ifndef NO_SLANG >> endif >> endif >> >> +ifneq ($(NO_SHELLCHECK),1) >> + $(call feature_check,shellcheck) >> + ifneq ($(feature-shellcheck), 1) >> +msg := $(warning No shellcheck found. please install ShellCheck); >> + else >> +$(call detected,CONFIG_SHELLCHECK) >> +NO_SHELLCHECK := 0 >> + endif >> +endif >> + >> ifdef GTK2 >> FLAGS_GTK2=$(CFLAGS) $(LDFLAGS) $(EXTLIBS) $(shell $(PKG_CONFIG) --libs >> --cflags gtk+-2.0 2>/dev/null) >> $(call feature_check,gtk2) >> -- >> 2.31.1
Re: [PATCH 1/3] core/device: Add function to return child node using name at substring "@"
> On 18-Sep-2023, at 7:42 PM, Reza Arbab wrote:
>
> On Thu, Sep 14, 2023 at 10:02:04PM +0530, Athira Rajeev wrote:
>> Add a function dt_find_by_name_before_addr() that returns the child node if
>> it matches till first occurrence at "@" of a given name, otherwise NULL.
>> This is helpful for cases with node name like: "name@addr". In
>> scenarios where nodes are added with "name@addr" format and if the
>> value of "addr" is not known, that node can't be matched with node
>> name or addr. Hence matching with substring as node name will return
>> the expected result. Patch adds dt_find_by_name_before_addr() function
>> and testcase for the same in core/test/run-device.c
>
> Series applied to skiboot master with the fixup we discussed.
>
> --
> Reza Arbab

Thanks Reza for picking up the patchset.

Athira
Re: [PATCH 1/3] core/device: Add function to return child node using name at substring "@"
> On 15-Sep-2023, at 8:00 PM, Reza Arbab wrote:
>
> Hi Athira,
>
> On Thu, Sep 14, 2023 at 10:02:04PM +0530, Athira Rajeev wrote:
>> +struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char *name)
>> +{
>> +	struct dt_node *child, *match;
>> +	char *child_node = NULL;
>> +
>> +	list_for_each(&root->children, child, list) {
>> +		child_node = strdup(child->name);
>> +		if (!child_node)
>> +			goto err;
>> +		child_node = strtok(child_node, "@");
>> +		if (!strcmp(child_node, name)) {
>> +			free(child_node);
>> +			return child;
>> +		}
>> +
>> +		match = dt_find_by_name_before_addr(child, name);
>> +		if (match)
>> +			return match;
>
> When the function returns on this line, child_node is not freed.
>
>> +	}
>> +
>> +	free(child_node);
>> +err:
>> +	return NULL;
>> +}
>
> I took a stab at moving free(child_node) inside the loop, and ended up with
> this:
>
> struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char *name)
> {
> 	struct dt_node *child, *match = NULL;
> 	char *child_name = NULL;
>
> 	list_for_each(&root->children, child, list) {
> 		child_name = strdup(child->name);
> 		if (!child_name)
> 			return NULL;
>
> 		child_name = strtok(child_name, "@");
> 		if (!strcmp(child_name, name))
> 			match = child;
> 		else
> 			match = dt_find_by_name_before_addr(child, name);
>
> 		free(child_name);
> 		if (match)
> 			return match;
> 	}
>
> 	return NULL;
> }
>
> Does this seem okay to you? If you agree, no need to send another revision, I
> can just fixup during commit. Let me know.

Hi Reza,

Sure, the change looks good. Thanks for the change and fixup.

Thanks
Athira

>
>> diff --git a/core/test/run-device.c b/core/test/run-device.c
>> index 4a12382bb..fb7a7d2c0 100644
>> --- a/core/test/run-device.c
>> +++ b/core/test/run-device.c
>> @@ -466,6 +466,20 @@ int main(void)
>>  	new_prop_ph = dt_prop_get_u32(ut2, "something");
>>  	assert(!(new_prop_ph == ev1_ph));
>>  	dt_free(subtree);
>> +
>> +	/* Test dt_find_by_name_before_addr */
>> +	root = dt_new_root("");
>> +	addr1 = dt_new_addr(root, "node", 0x1);
>> +	addr2 = dt_new_addr(root, "node0_1", 0x2);
>> +	assert(dt_find_by_name(root, "node@1") == addr1);
>> +	assert(dt_find_by_name(root, "node0_1@2") == addr2);
>> +	assert(dt_find_by_name_before_addr(root, "node") == addr1);
>> +	assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
>
> This line appears twice. As above, can fix during commit, so no need for a
> new patch.
>
>> +	assert(dt_find_by_name_before_addr(root, "node0_1") == addr2);
>> +	assert(dt_find_by_name_before_addr(root, "node0") == NULL);
>> +	assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
>> +	dt_free(root);
>> +
>>  	return 0;
>> }
>>
>
> --
> Reza Arbab
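As an aside to the review above, the "match the name up to the first @" rule can also be expressed without strdup()/strtok(), which sidesteps the allocation and the free-on-every-path question entirely. The sketch below is not skiboot code: struct node, its first_child/next_sibling links, and both function names are hypothetical stand-ins for struct dt_node and its child list, used only to keep the example self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal node type for illustration; not skiboot's struct dt_node. */
struct node {
	const char *name;
	struct node *first_child;
	struct node *next_sibling;
};

/* True when the part of node_name before the first '@' equals name exactly. */
static int name_matches_before_addr(const char *node_name, const char *name)
{
	/* Length of the leading run without '@'; the whole name if no '@'. */
	size_t prefix_len = strcspn(node_name, "@");

	return strlen(name) == prefix_len &&
	       !strncmp(node_name, name, prefix_len);
}

/* Depth-first search over children, mirroring the recursion in the patch. */
static struct node *find_by_name_before_addr(struct node *root, const char *name)
{
	struct node *child, *match;

	for (child = root->first_child; child; child = child->next_sibling) {
		if (name_matches_before_addr(child->name, name))
			return child;

		match = find_by_name_before_addr(child, name);
		if (match)
			return match;
	}

	return NULL;
}
```

With nodes named "node@1" and "node0_1@2", this sketch matches "node" and "node0_1" but not "node0" or "node0_", which is the behaviour the quoted test case checks for.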
Re: [PATCH V3 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
> On 15-Sep-2023, at 10:56 AM, Adrian Hunter wrote: > > On 15/09/23 08:24, Athira Rajeev wrote: >> The testcase "Object code reading" fails in somecases >> for "fs_something" sub test as below: >> >>Reading object code for memory address: 0xc00807f0142c >>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>On file address is: 0x1114cc >>Objdump command is: objdump -z -d --start-address=0x11142c >> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>objdump read too few bytes: 128 >>test child finished with -1 >> >> This can alo be reproduced when running perf record with >> workload that exercises fs_something() code. In the test >> setup, this is exercising xfs code since root is xfs. >> >># perf record ./a.out >># perf report -v |grep "xfs.ko" >> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807de5efc B [k] xlog_cil_commit >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807d5ae18 B [k] xfs_btree_key_offset >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807e11fd4 B [k] 0x00112074 >> >> Here addr "0xc00807e11fd4" is not resolved. since this is a >> kernel module, its offset is from the DSO. Xfs module is loaded >> at 0xc00807d0 >> >> # cat /proc/modules | grep xfs >>xfs 2228224 3 - Live 0xc00807d0 >> >> And size is 0x22. So its loaded between 0xc00807d0 >> and 0xc00807f2. From objdump, text section is: >>text 0010f7bc 00a0 2**4 >> >> Hence perf captured ip maps to 0x112074 which is: >> ( ip - start of module ) + a0 >> >> This offset 0x112074 falls out .text section which is up to 0x10f7bc >> In this case for module, the address 0xc00807e11fd4 is pointing >> to stub instructions. This address range represents the module stubs >> which is allocated on module load and hence is not part of DSO offset. >> >> To address this issue in "object code reading", skip the sample if >> address falls out of text section and is within the module end. 
>> Use the "text_end" member of "struct dso" to do this check. >> >> To address this issue in "perf report", exploring an option of >> having stubs range as part of the /proc/kallsyms, so that perf >> report can resolve addresses in stubs range >> >> However this patch uses text_end to skip the stub range for >> Object code reading testcase. >> >> Reported-by: Disha Goel >> Signed-off-by: Athira Rajeev >> Tested-by: Disha Goel >> Reviewed-by: Adrian Hunter >> --- >> Changelog: >> v2 -> v3: >> Used strtailcmp in comparison for module check and added Reviewed-by >> from Adrian, Tested-by from Disha. >> >> v1 -> v2: >> Updated comment to add description on which arch has stub and >> reason for skipping as suggested by Adrian >> >> tools/perf/tests/code-reading.c | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/tools/perf/tests/code-reading.c >> b/tools/perf/tests/code-reading.c >> index ed3815163d1b..45334d26058e 100644 >> --- a/tools/perf/tests/code-reading.c >> +++ b/tools/perf/tests/code-reading.c >> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 >> cpumode, >> if (addr + len > map__end(al.map)) >> len = map__end(al.map) - addr; >> >> + /* >> + * Some architectures (ex: powerpc) have stubs (trampolines) in kernel >> + * modules to manage long jumps. Check if the ip offset falls in stubs >> + * sections for kernel modules. And skip module address after text end >> + */ >> + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) { >> + pr_debug("skipping the module address %#"PRIx64" after text end\n", >> al.addr); >> + goto out; > > Double indent My bad, addressed in V4 Athira > >> + } >> + >> /* Read the object code using perf */ >> ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)), >> al.addr, buf1, len);
[PATCH V4 1/2] tools/perf: Add text_end to "struct dso" to save .text section size
Update "struct dso" to include a new member, "text_end". This new field
represents the offset of the end of the text section for a dso. For ELF,
this value is derived as: sh_size (size of section in bytes) + sh_offset
(section file offset) of the ELF section header for .text. For bfd, this
value is derived as:
1. For a PE file: section->size + (section->vma - dso->text_offset)
2. Other cases: section->filepos (file position) + section->size (size of
   section)

To resolve the address from a sample, perf looks at the DSO maps. For
addresses from a kernel module, some addresses were found to be
unresolved. This was observed while running the perf test "Object code
reading". Though the ip falls between the start address of the loaded
module (perf map->start) and the end address (perf map->end), it was
unresolved.

Example:

    Reading object code for memory address: 0xc00807f0142c
    File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    On file address is: 0x1114cc
    Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    objdump read too few bytes: 128
    test child finished with -1

Here, the module is loaded at:

    # cat /proc/modules | grep xfs
    xfs 2228224 3 - Live 0xc00807d0

From objdump for the xfs module, the text section is:

    text 0010f7bc 00a0 2**4

Here the offset for 0xc00807f0142c ie 0x112074 falls outside the .text
section, which extends up to 0x10f7bc. In this case, for the module, the
address 0xc00807e11fd4 points to stub instructions. This address range
represents the module stubs, which are allocated on module load and hence
are not part of the DSO offset.

To identify such addresses, which fall outside the text section but within
the module end, add the new field "text_end" to "struct dso".

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
Reviewed-by: Adrian Hunter
---
Changelog:
v2 -> v3:
Added Reviewed-by from Adrian

v1 -> v2:
Added text_end for bfd also by updating dso__load_bfd_symbols
as suggested by Adrian.
tools/perf/util/dso.h| 1 + tools/perf/util/symbol-elf.c | 4 +++- tools/perf/util/symbol.c | 2 ++ 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index b41c9782c754..70fe0fe69bef 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -181,6 +181,7 @@ struct dso { u8 rel; struct build_id bid; u64 text_offset; + u64 text_end; const char *short_name; const char *long_name; u16 long_name_len; diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index 95e99c332d7e..9e7eeaf616b8 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss, } if (elf_section_by_name(runtime_ss->elf, _ss->ehdr, , - ".text", NULL)) + ".text", NULL)) { dso->text_offset = tshdr.sh_addr - tshdr.sh_offset; + dso->text_end = tshdr.sh_offset + tshdr.sh_size; + } if (runtime_ss->opdsec) opddata = elf_rawdata(runtime_ss->opdsec, NULL); diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c index 3f36675b7c8f..f25e4e62cf25 100644 --- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char *debugfile) /* PE symbols can only have 4 bytes, so use .text high bits */ dso->text_offset = section->vma - (u32)section->vma; dso->text_offset += (u32)bfd_asymbol_value(symbols[i]); + dso->text_end = (section->vma - dso->text_offset) + section->size; } else { dso->text_offset = section->vma - section->filepos; + dso->text_end = section->filepos + section->size; } } -- 2.31.1
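The arithmetic behind "text_end" for the ELF case can be sketched in isolation as below. This is an illustration only, not perf code: struct text_section and both helper names are hypothetical, standing in for the sh_offset and sh_size fields the patch reads from the .text section header. The comparison uses ">" to mirror the "al.addr > dso->text_end" check in patch 2/2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two ELF section header fields the patch reads. */
struct text_section {
	uint64_t sh_offset;	/* file offset where .text starts */
	uint64_t sh_size;	/* size of .text in bytes */
};

/* File offset of the end of .text: sh_offset + sh_size, as in the ELF path. */
static uint64_t text_end_offset(const struct text_section *text)
{
	return text->sh_offset + text->sh_size;
}

/* True when a file offset lies beyond .text, e.g. in a module stub area. */
static bool offset_after_text(const struct text_section *text, uint64_t file_offset)
{
	return file_offset > text_end_offset(text);
}
```

Plugging in the values quoted in the commit message (file offset 0xa0, size 0x10f7bc), the end of .text comes out at 0x10f85c, so the sample offset 0x112074 is classified as lying after the text section.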
[PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
The testcase "Object code reading" fails in somecases for "fs_something" sub test as below: Reading object code for memory address: 0xc00807f0142c File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko On file address is: 0x1114cc Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko objdump read too few bytes: 128 test child finished with -1 This can alo be reproduced when running perf record with workload that exercises fs_something() code. In the test setup, this is exercising xfs code since root is xfs. # perf record ./a.out # perf report -v |grep "xfs.ko" 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807de5efc B [k] xlog_cil_commit 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807d5ae18 B [k] xfs_btree_key_offset 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807e11fd4 B [k] 0x00112074 Here addr "0xc00807e11fd4" is not resolved. since this is a kernel module, its offset is from the DSO. Xfs module is loaded at 0xc00807d0 # cat /proc/modules | grep xfs xfs 2228224 3 - Live 0xc00807d0 And size is 0x22. So its loaded between 0xc00807d0 and 0xc00807f2. From objdump, text section is: text 0010f7bc 00a0 2**4 Hence perf captured ip maps to 0x112074 which is: ( ip - start of module ) + a0 This offset 0x112074 falls out .text section which is up to 0x10f7bc In this case for module, the address 0xc00807e11fd4 is pointing to stub instructions. This address range represents the module stubs which is allocated on module load and hence is not part of DSO offset. To address this issue in "object code reading", skip the sample if address falls out of text section and is within the module end. Use the "text_end" member of "struct dso" to do this check. 
To address this issue in "perf report", exploring an option of having stubs range as part of the /proc/kallsyms, so that perf report can resolve addresses in stubs range However this patch uses text_end to skip the stub range for Object code reading testcase. Reported-by: Disha Goel Signed-off-by: Athira Rajeev Tested-by: Disha Goel Reviewed-by: Adrian Hunter --- Changelog: v3 -> v4: Fixed indent in V3 v2 -> v3: Used strtailcmp in comparison for module check and added Reviewed-by from Adrian, Tested-by from Disha. v1 -> v2: Updated comment to add description on which arch has stub and reason for skipping as suggested by Adrian tools/perf/tests/code-reading.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c index ed3815163d1b..9e6e6c985840 100644 --- a/tools/perf/tests/code-reading.c +++ b/tools/perf/tests/code-reading.c @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode, if (addr + len > map__end(al.map)) len = map__end(al.map) - addr; + /* +* Some architectures (ex: powerpc) have stubs (trampolines) in kernel +* modules to manage long jumps. Check if the ip offset falls in stubs +* sections for kernel modules. And skip module address after text end +*/ + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) { + pr_debug("skipping the module address %#"PRIx64" after text end\n", al.addr); + goto out; + } + /* Read the object code using perf */ ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)), al.addr, buf1, len); -- 2.31.1
[PATCH V3 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
The testcase "Object code reading" fails in somecases for "fs_something" sub test as below: Reading object code for memory address: 0xc00807f0142c File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko On file address is: 0x1114cc Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko objdump read too few bytes: 128 test child finished with -1 This can alo be reproduced when running perf record with workload that exercises fs_something() code. In the test setup, this is exercising xfs code since root is xfs. # perf record ./a.out # perf report -v |grep "xfs.ko" 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807de5efc B [k] xlog_cil_commit 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807d5ae18 B [k] xfs_btree_key_offset 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807e11fd4 B [k] 0x00112074 Here addr "0xc00807e11fd4" is not resolved. since this is a kernel module, its offset is from the DSO. Xfs module is loaded at 0xc00807d0 # cat /proc/modules | grep xfs xfs 2228224 3 - Live 0xc00807d0 And size is 0x22. So its loaded between 0xc00807d0 and 0xc00807f2. From objdump, text section is: text 0010f7bc 00a0 2**4 Hence perf captured ip maps to 0x112074 which is: ( ip - start of module ) + a0 This offset 0x112074 falls out .text section which is up to 0x10f7bc In this case for module, the address 0xc00807e11fd4 is pointing to stub instructions. This address range represents the module stubs which is allocated on module load and hence is not part of DSO offset. To address this issue in "object code reading", skip the sample if address falls out of text section and is within the module end. Use the "text_end" member of "struct dso" to do this check. 
To address this issue in "perf report", exploring an option of having stubs range as part of the /proc/kallsyms, so that perf report can resolve addresses in stubs range However this patch uses text_end to skip the stub range for Object code reading testcase. Reported-by: Disha Goel Signed-off-by: Athira Rajeev Tested-by: Disha Goel Reviewed-by: Adrian Hunter --- Changelog: v2 -> v3: Used strtailcmp in comparison for module check and added Reviewed-by from Adrian, Tested-by from Disha. v1 -> v2: Updated comment to add description on which arch has stub and reason for skipping as suggested by Adrian tools/perf/tests/code-reading.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c index ed3815163d1b..45334d26058e 100644 --- a/tools/perf/tests/code-reading.c +++ b/tools/perf/tests/code-reading.c @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode, if (addr + len > map__end(al.map)) len = map__end(al.map) - addr; + /* +* Some architectures (ex: powerpc) have stubs (trampolines) in kernel +* modules to manage long jumps. Check if the ip offset falls in stubs +* sections for kernel modules. And skip module address after text end +*/ + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) { + pr_debug("skipping the module address %#"PRIx64" after text end\n", al.addr); + goto out; + } + /* Read the object code using perf */ ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)), al.addr, buf1, len); -- 2.31.1
[PATCH V3 1/2] tools/perf: Add text_end to "struct dso" to save .text section size
Update "struct dso" to include a new member "text_end". This new field represents the offset of the end of the text section for a dso. For elf, this value is derived as: sh_size (size of section in bytes) + sh_offset (section file offset) of the ELF section header for .text.

For bfd, this value is derived as:
1. For PE file: section->size + ( section->vma - dso->text_offset)
2. Other cases: section->filepos (file position) + section->size (size of section)

To resolve the address from a sample, perf looks at the DSO maps. For addresses from a kernel module, some addresses were found to be unresolved. This was observed while running perf test for "Object code reading". Though the ip falls between the start address of the loaded module (perf map->start) and the end address (perf map->end), it was unresolved.

Example:

    Reading object code for memory address: 0xc00807f0142c
    File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    On file address is: 0x1114cc
    Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    objdump read too few bytes: 128
    test child finished with -1

Here, the module is loaded at:

    # cat /proc/modules | grep xfs
    xfs 2228224 3 - Live 0xc00807d0

From objdump for the xfs module, the text section is:

    text 0010f7bc 00a0 2**4

Here the offset for 0xc00807f0142c, i.e. 0x112074, falls outside the .text section, which is up to 0x10f7bc.

In this case for the module, the address 0xc00807e11fd4 points to stub instructions. This address range represents the module stubs, which are allocated on module load and hence are not part of the DSO offset.

To identify such an address, which falls outside the text section but within the module end, add the new field "text_end" to "struct dso".

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
Reviewed-by: Adrian Hunter
---
Changelog:
v2 -> v3:
Added Reviewed-by from Adrian
v1 -> v2:
Added text_end for bfd also by updating dso__load_bfd_symbols as suggested by Adrian.
tools/perf/util/dso.h| 1 + tools/perf/util/symbol-elf.c | 4 +++- tools/perf/util/symbol.c | 2 ++ 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index b41c9782c754..70fe0fe69bef 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -181,6 +181,7 @@ struct dso { u8 rel; struct build_id bid; u64 text_offset; + u64 text_end; const char *short_name; const char *long_name; u16 long_name_len; diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index 95e99c332d7e..9e7eeaf616b8 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss, } if (elf_section_by_name(runtime_ss->elf, _ss->ehdr, , - ".text", NULL)) + ".text", NULL)) { dso->text_offset = tshdr.sh_addr - tshdr.sh_offset; + dso->text_end = tshdr.sh_offset + tshdr.sh_size; + } if (runtime_ss->opdsec) opddata = elf_rawdata(runtime_ss->opdsec, NULL); diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c index 3f36675b7c8f..f25e4e62cf25 100644 --- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char *debugfile) /* PE symbols can only have 4 bytes, so use .text high bits */ dso->text_offset = section->vma - (u32)section->vma; dso->text_offset += (u32)bfd_asymbol_value(symbols[i]); + dso->text_end = (section->vma - dso->text_offset) + section->size; } else { dso->text_offset = section->vma - section->filepos; + dso->text_end = section->filepos + section->size; } } -- 2.31.1
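The ELF case of the derivation comes down to simple arithmetic on section-header fields. Below is a minimal sketch under stated assumptions: `struct shdr_sketch`, `text_offset_of()` and `text_end_of()` are hypothetical names that mirror only the three fields the patch reads (sh_addr, sh_offset, sh_size), and the sample values used afterwards assume that the "00a0" column in the objdump line quoted in the commit message is the file offset of the text section.

```c
#include <stdint.h>

/* Hypothetical subset of an ELF section header. */
struct shdr_sketch {
	uint64_t sh_addr;	/* address of the section in memory */
	uint64_t sh_offset;	/* offset of the section in the file */
	uint64_t sh_size;	/* size of the section in bytes */
};

/* Same expression the existing code uses for dso->text_offset. */
static uint64_t text_offset_of(const struct shdr_sketch *sh)
{
	return sh->sh_addr - sh->sh_offset;
}

/* Expression the patch adds for dso->text_end. */
static uint64_t text_end_of(const struct shdr_sketch *sh)
{
	return sh->sh_offset + sh->sh_size;
}
```

With sh_offset = 0xa0 and sh_size = 0x10f7bc, the end of text comes out as 0x10f85c, so the unresolved offset 0x112074 from the example lies beyond it.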
Re: [V2 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
> On 14-Sep-2023, at 11:54 PM, Adrian Hunter wrote: > > On 7/09/23 19:45, Athira Rajeev wrote: >> The testcase "Object code reading" fails in somecases >> for "fs_something" sub test as below: >> >>Reading object code for memory address: 0xc00807f0142c >>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>On file address is: 0x1114cc >>Objdump command is: objdump -z -d --start-address=0x11142c >> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>objdump read too few bytes: 128 >>test child finished with -1 >> >> This can alo be reproduced when running perf record with >> workload that exercises fs_something() code. In the test >> setup, this is exercising xfs code since root is xfs. >> >># perf record ./a.out >># perf report -v |grep "xfs.ko" >> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807de5efc B [k] xlog_cil_commit >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807d5ae18 B [k] xfs_btree_key_offset >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> 0xc00807e11fd4 B [k] 0x00112074 >> >> Here addr "0xc00807e11fd4" is not resolved. since this is a >> kernel module, its offset is from the DSO. Xfs module is loaded >> at 0xc00807d0 >> >> # cat /proc/modules | grep xfs >>xfs 2228224 3 - Live 0xc00807d0 >> >> And size is 0x22. So its loaded between 0xc00807d0 >> and 0xc00807f2. From objdump, text section is: >>text 0010f7bc 00a0 2**4 >> >> Hence perf captured ip maps to 0x112074 which is: >> ( ip - start of module ) + a0 >> >> This offset 0x112074 falls out .text section which is up to 0x10f7bc >> In this case for module, the address 0xc00807e11fd4 is pointing >> to stub instructions. This address range represents the module stubs >> which is allocated on module load and hence is not part of DSO offset. >> >> To address this issue in "object code reading", skip the sample if >> address falls out of text section and is within the module end. 
>> Use the "text_end" member of "struct dso" to do this check. >> >> To address this issue in "perf report", exploring an option of >> having stubs range as part of the /proc/kallsyms, so that perf >> report can resolve addresses in stubs range >> >> However this patch uses text_end to skip the stub range for >> Object code reading testcase. >> >> Reported-by: Disha Goel >> Signed-off-by: Athira Rajeev >> --- >> Changelog: >> v1 -> v2: >> Updated comment to add description on which arch has stub and >> reason for skipping as suggested by Adrian >> >> tools/perf/tests/code-reading.c | 12 >> 1 file changed, 12 insertions(+) >> >> diff --git a/tools/perf/tests/code-reading.c >> b/tools/perf/tests/code-reading.c >> index ed3815163d1b..3cf6c2d42416 100644 >> --- a/tools/perf/tests/code-reading.c >> +++ b/tools/perf/tests/code-reading.c >> @@ -269,6 +269,18 @@ static int read_object_code(u64 addr, size_t len, u8 >> cpumode, >> if (addr + len > map__end(al.map)) >> len = map__end(al.map) - addr; >> >> + /* >> + * Some architectures (ex: powerpc) have stubs (trampolines) in kernel >> + * modules to manage long jumps. Check if the ip offset falls in stubs >> + * sections for kernel modules. And skip module address after text end >> + */ >> + if (strstr(dso->long_name, ".ko")) { > > Sorry for slow reply > > !strtailcmp() is slightly better here > >> + if (al.addr > dso->text_end) { > > We normally avoid nesting if-statements e.g. > > if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) > > Make those changes and you can add: > > Reviewed-by: Adrian Hunter Sure, will post a V3 with this change Athira > > >> + pr_debug("skipping the module address %#"PRIx64" after text end\n", >> al.addr); >> + goto out; >> + } >> + } >> + >> /* Read the object code using perf */ >> ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)), >> al.addr, buf1, len);
Re: [V2 1/2] tools/perf: Add text_end to "struct dso" to save .text section size
> On 14-Sep-2023, at 11:49 PM, Adrian Hunter wrote: > > On 7/09/23 19:45, Athira Rajeev wrote: >> Update "struct dso" to include new member "text_end". >> This new field will represent the offset for end of text >> section for a dso. For elf, this value is derived as: >> sh_size (Size of section in byes) + sh_offset (Section file >> offst) of the elf header for text. >> >> For bfd, this value is derived as: >> 1. For PE file, >> section->size + ( section->vma - dso->text_offset) >> 2. Other cases: >> section->filepos (file position) + section->size (size of >> section) >> >> To resolve the address from a sample, perf looks at the >> DSO maps. In case of address from a kernel module, there >> were some address found to be not resolved. This was >> observed while running perf test for "Object code reading". >> Though the ip falls beteen the start address of the loaded >> module (perf map->start ) and end address ( perf map->end), >> it was unresolved. >> >> Example: >> >>Reading object code for memory address: 0xc00807f0142c >>File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>On file address is: 0x1114cc >>Objdump command is: objdump -z -d --start-address=0x11142c >> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >>objdump read too few bytes: 128 >>test child finished with -1 >> >> Here, module is loaded at: >># cat /proc/modules | grep xfs >>xfs 2228224 3 - Live 0xc00807d0 >> >> From objdump for xfs module, text section is: >>text 0010f7bc 00a0 2**4 >> >> Here the offset for 0xc00807f0142c ie 0x112074 falls out >> .text section which is up to 0x10f7bc. >> >> In this case for module, the address 0xc00807e11fd4 is pointing >> to stub instructions. This address range represents the module stubs >> which is allocated on module load and hence is not part of DSO offset. >> >> To identify such address, which falls out of text >> section and within module end, added the new field "text_end" to >> "struct dso". 
>> >> Reported-by: Disha Goel >> Signed-off-by: Athira Rajeev > > Reviewed-by: Adrian Hunter Hi Adrian Thanks for the review > >> --- >> Changelog: >> v1 -> v2: >> Added text_end for bfd also by updating dso__load_bfd_symbols >> as suggested by Adrian. >> >> tools/perf/util/dso.h| 1 + >> tools/perf/util/symbol-elf.c | 4 +++- >> tools/perf/util/symbol.c | 2 ++ >> 3 files changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h >> index b41c9782c754..70fe0fe69bef 100644 >> --- a/tools/perf/util/dso.h >> +++ b/tools/perf/util/dso.h >> @@ -181,6 +181,7 @@ struct dso { >> u8 rel; >> struct build_id bid; >> u64 text_offset; >> + u64 text_end; >> const char *short_name; >> const char *long_name; >> u16 long_name_len; >> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c >> index 95e99c332d7e..9e7eeaf616b8 100644 >> --- a/tools/perf/util/symbol-elf.c >> +++ b/tools/perf/util/symbol-elf.c >> @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map >> *map, struct symsrc *syms_ss, >> } >> >> if (elf_section_by_name(runtime_ss->elf, _ss->ehdr, , >> - ".text", NULL)) >> + ".text", NULL)) { >> dso->text_offset = tshdr.sh_addr - tshdr.sh_offset; >> + dso->text_end = tshdr.sh_offset + tshdr.sh_size; >> + } >> >> if (runtime_ss->opdsec) >> opddata = elf_rawdata(runtime_ss->opdsec, NULL); >> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c >> index 3f36675b7c8f..f25e4e62cf25 100644 >> --- a/tools/perf/util/symbol.c >> +++ b/tools/perf/util/symbol.c >> @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char >> *debugfile) >> /* PE symbols can only have 4 bytes, so use .text high bits */ >> dso->text_offset = section->vma - (u32)section->vma; >> dso->text_offset += (u32)bfd_asymbol_value(symbols[i]); >> + dso->text_end = (section->vma - dso->text_offset) + section->size; >> } else { >> dso->text_offset = section->vma - section->filepos; >> + dso->text_end = 
section->filepos + section->size; >> } >> }
Re: [V2 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section
> On 14-Sep-2023, at 5:47 PM, Disha Goel wrote: > > On 07/09/23 10:15 pm, Athira Rajeev wrote: >> The testcase "Object code reading" fails in somecases >> for "fs_something" sub test as below: >> >> Reading object code for memory address: 0xc00807f0142c >> File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> On file address is: 0x1114cc >> Objdump command is: objdump -z -d --start-address=0x11142c >> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko >> objdump read too few bytes: 128 >> test child finished with -1 >> >> This can alo be reproduced when running perf record with >> workload that exercises fs_something() code. In the test >> setup, this is exercising xfs code since root is xfs. >> >> # perf record ./a.out >> # perf report -v |grep "xfs.ko" >> 0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807de5efc >> B [k] xlog_cil_commit >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807d5ae18 >> B [k] xfs_btree_key_offset >> 0.74% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko 0xc00807e11fd4 >> B [k] 0x00112074 >> >> Here addr "0xc00807e11fd4" is not resolved. since this is a >> kernel module, its offset is from the DSO. Xfs module is loaded >> at 0xc00807d0 >> >> # cat /proc/modules | grep xfs >> xfs 2228224 3 - Live 0xc00807d0 >> >> And size is 0x22. So its loaded between 0xc00807d0 >> and 0xc00807f2. From objdump, text section is: >> text 0010f7bc 00a0 2**4 >> >> Hence perf captured ip maps to 0x112074 which is: >> ( ip - start of module ) + a0 >> >> This offset 0x112074 falls out .text section which is up to 0x10f7bc >> In this case for module, the address 0xc00807e11fd4 is pointing >> to stub instructions. This address range represents the module stubs >> which is allocated on module load and hence is not part of DSO offset. >> >> To address this issue in "object code reading", skip the sample if >> address falls out of text section and is within the module end.
>> Use the "text_end" member of "struct dso" to do this check. >> >> To address this issue in "perf report", exploring an option of >> having stubs range as part of the /proc/kallsyms, so that perf >> report can resolve addresses in stubs range >> >> However this patch uses text_end to skip the stub range for >> Object code reading testcase. >> >> Reported-by: Disha Goel >> Signed-off-by: Athira Rajeev >> --- >> Changelog: >> v1 -> v2: >> Updated comment to add description on which arch has stub and >> reason for skipping as suggested by Adrian Thanks for testing Disha. Hi Adrian, Can you please review and share feedback on this version. Thanks Athira > With this patch applied perf Object code reading test works correctly. > > 26: Object code reading : Ok > > Tested-by: Disha Goel > >> tools/perf/tests/code-reading.c | 12 >> 1 file changed, 12 insertions(+) >> >> diff --git a/tools/perf/tests/code-reading.c >> b/tools/perf/tests/code-reading.c >> index ed3815163d1b..3cf6c2d42416 100644 >> --- a/tools/perf/tests/code-reading.c >> +++ b/tools/perf/tests/code-reading.c >> @@ -269,6 +269,18 @@ static int read_object_code(u64 addr, size_t len, u8 >> cpumode, >> if (addr + len > map__end(al.map)) >> len = map__end(al.map) - addr; >> >> + /* >> + * Some architectures (ex: powerpc) have stubs (trampolines) in kernel >> + * modules to manage long jumps. Check if the ip offset falls in stubs >> + * sections for kernel modules. And skip module address after text end >> + */ >> + if (strstr(dso->long_name, ".ko")) { >> + if (al.addr > dso->text_end) { >> + pr_debug("skipping the module address %#"PRIx64" after text end\n", >> al.addr); >> + goto out; >> + } >> + } >> + >> /* Read the object code using perf */ >> ret_len = dso__data_read_offset(dso, maps__machine(thread__maps(thread)), >> al.addr, buf1, len); >>
Re: [PATCH V3] tools/perf: Add includes for detected configs in Makefile.perf
> On 13-Sep-2023, at 1:06 AM, Arnaldo Carvalho de Melo wrote: > > Em Tue, Sep 12, 2023 at 07:00:00AM -0700, Ian Rogers escreveu: >> On Mon, Sep 11, 2023 at 11:38 PM Athira Rajeev >> wrote: >>> >>> Makefile.perf uses "CONFIG_*" checks in the code. Example the config >>> for libtraceevent is used to set PYTHON_EXT_SRCS >>> >>>ifeq ($(CONFIG_LIBTRACEEVENT),y) >>> PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources) >>>else >>> PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' >>> util/python-ext-sources) >>>endif >>> >>> But this is not picking the value for CONFIG_LIBTRACEEVENT that is >>> set using the settings in Makefile.config. Include the file >>> ".config-detected" so that make will use the system detected >>> configuration in the CONFIG checks. This will fix isues that >>> could arise when other "CONFIG_*" checks are added to Makefile.perf >>> in future as well. >>> >>> Signed-off-by: Athira Rajeev >> >> Reviewed-by: Ian Rogers > > Thanks, applied. > > - Arnaldo > Thanks Ian for the review and thanks Arnaldo for picking this fix Athira > >> Thanks, >> Ian >> >>> --- >>> Changelog: >>> v2 -> v3: >>> Added -include since in some cases make clean or make >>> will fail when config is not included and if config-detected >>> file is not present. 
>>> >>> v1 -> v2: >>> Added $(OUTPUT) prefix to config-detected as pointed >>> out by Ian >>> >>> tools/perf/Makefile.perf | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf >>> index 37af6df7b978..f6fdc2d5a92f 100644 >>> --- a/tools/perf/Makefile.perf >>> +++ b/tools/perf/Makefile.perf >>> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP >>> >>> python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) >>> $(OUTPUT)python/perf*.so >>> >>> +# Use the detected configuration >>> +-include $(OUTPUT).config-detected >>> + >>> ifeq ($(CONFIG_LIBTRACEEVENT),y) >>> PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources) >>> else >>> -- >>> 2.31.1 >>> > > -- > > - Arnaldo
[PATCH 2/2] tools/perf: Add perf binary dependent rule for shellcheck log in Makefile.perf
Add a rule in the new Makefile "tests/Makefile.tests" for running shellcheck on shell test scripts. This automates the shellcheck run below as part of the build. $ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done The CONFIG_SHELLCHECK check is added to avoid build breakage in the absence of the shellcheck binary. Update Makefile.perf to contain a new rule for "SHELLCHECK_TEST", which makes the shellcheck test a dependency of the perf binary. Added "tests/Makefile.tests" to run shellcheck on shell scripts in tests/shell. The make rule "SHELLCHECK_RUN" ensures that, every time during make, shellcheck will be run only on modified files during subsequent invocations. By this, if any newly added shell scripts or fixes in existing scripts break coding/formatting style, it will get captured during the perf build. Signed-off-by: Athira Rajeev --- tools/perf/Makefile.perf| 12 +++- tools/perf/tests/Makefile.tests | 24 2 files changed, 35 insertions(+), 1 deletion(-) create mode 100644 tools/perf/tests/Makefile.tests diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index f6fdc2d5a92f..c27f54771e90 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -667,7 +667,16 @@ $(PERF_IN): prepare FORCE $(PMU_EVENTS_IN): FORCE prepare $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=pmu-events obj=pmu-events -$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) +# Runs shellcheck on perf test shell scripts +ifeq ($(CONFIG_SHELLCHECK),y) +SHELLCHECK_TEST: FORCE prepare + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests +else +SHELLCHECK_TEST: + @: +endif + +$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) SHELLCHECK_TEST $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) \ $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@ @@ -1129,6 +1138,7 @@ bpf-skel-clean: $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) clean:: $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBSYMBOL)-clean $(LIBPERF)-clean fixdep-clean
python-clean bpf-skel-clean tests-coresight-targets-clean + $(Q)$(MAKE) -f $(srctree)/tools/perf/tests/Makefile.tests clean $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-iostat $(LANG_BINDINGS) $(Q)find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete $(Q)$(RM) $(OUTPUT).config-detected diff --git a/tools/perf/tests/Makefile.tests b/tools/perf/tests/Makefile.tests new file mode 100644 index ..e74575559e83 --- /dev/null +++ b/tools/perf/tests/Makefile.tests @@ -0,0 +1,24 @@ +# SPDX-License-Identifier: GPL-2.0 +# Athira Rajeev , 2023 +-include $(OUTPUT).config-detected + +log_file = $(OUTPUT)shellcheck_test.log +PROGS = $(subst ./,,$(shell find tests/shell -perm -o=x -type f -name '*.sh')) +DEPS = $(addprefix output/,$(addsuffix .dep,$(basename $(PROGS +DIRS = $(shell echo $(dir $(DEPS)) | xargs -n1 | sort -u | xargs) + +.PHONY: all +all: SHELLCHECK_RUN + @: + +SHELLCHECK_RUN: $(DEPS) $(DIRS) + +output/%.dep: %.sh | $(DIRS) + $(call rule_mkdir) + $(Q)$(call frecho-cmd,test)@touch $@ + $(Q)$(call frecho-cmd,test)@shellcheck -S warning $(subst output/,./,$(patsubst %.dep, %.sh, $@)) 1> ${log_file} && ([[ ! -s ${log_file} ]]) +$(DIRS): + @mkdir -p $@ + +clean: + @rm -rf $(log_file) output -- 2.31.1
[PATCH 1/2] tools/perf: Add new CONFIG_SHELLCHECK for detecting shellcheck binary
shellcheck tool can detect coding/formatting issues on shell scripts. In perf directory "tests/shell", there are lot of shell test scripts and this tool can detect coding/formatting issues on these scripts. Example to use shellcheck for severity level for errors and warnings, below command is used: # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done # echo $? 0 This testing needs to be automated into the build so that it can avoid regressions and also run the check for newly added during build test itself. Add a new feature check to detect presence of shellcheck. Add CONFIG_SHELLCHECK feature check in the build to avoid not having shellcheck breaking the build. Signed-off-by: Athira Rajeev --- tools/build/Makefile.feature | 6 -- tools/build/feature/Makefile | 8 +++- tools/perf/Makefile.config | 10 ++ 3 files changed, 21 insertions(+), 3 deletions(-) diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 934e2777a2db..23f56b95babf 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -72,7 +72,8 @@ FEATURE_TESTS_BASIC := \ libzstd\ disassembler-four-args \ disassembler-init-styled \ -file-handle +file-handle\ +shellcheck # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list # of all feature tests @@ -138,7 +139,8 @@ FEATURE_DISPLAY ?= \ get_cpuid \ bpf \ libaio\ - libzstd + libzstd \ + shellcheck # # Declare group members of a feature to display the logical OR of the detection diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 3184f387990a..44ba6d0c98d0 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -76,7 +76,8 @@ FILES= \ test-libzstd.bin \ test-clang-bpf-co-re.bin \ test-file-handle.bin \ - test-libpfm4.bin + test-libpfm4.bin \ + test-shellcheck.bin FILES := $(addprefix $(OUTPUT),$(FILES)) @@ -92,6 +93,8 @@ __BUILD = $(CC) $(CFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.c,$(@F)) $( __BUILDXX = $(CXX) 
$(CXXFLAGS) -MD -Wall -Werror -o $@ $(patsubst %.bin,%.cpp,$(@F)) $(LDFLAGS) BUILDXX = $(__BUILDXX) > $(@:.bin=.make.output) 2>&1 + BUILD_BINARY = sh -c $1 > $(@:.bin=.make.output) 2>&1 + ### $(OUTPUT)test-all.bin: @@ -207,6 +210,9 @@ $(OUTPUT)test-libslang-include-subdir.bin: $(OUTPUT)test-libtraceevent.bin: $(BUILD) -ltraceevent +$(OUTPUT)test-shellcheck.bin: + $(BUILD_BINARY) "shellcheck --version" + $(OUTPUT)test-libtracefs.bin: $(BUILD) $(shell $(PKG_CONFIG) --cflags libtraceevent 2>/dev/null) -ltracefs diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index d66b52407e19..e71fe95ad865 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -779,6 +779,16 @@ ifndef NO_SLANG endif endif +ifneq ($(NO_SHELLCHECK),1) + $(call feature_check,shellcheck) + ifneq ($(feature-shellcheck), 1) +msg := $(warning No shellcheck found. please install ShellCheck); + else +$(call detected,CONFIG_SHELLCHECK) +NO_SHELLCHECK := 0 + endif +endif + ifdef GTK2 FLAGS_GTK2=$(CFLAGS) $(LDFLAGS) $(EXTLIBS) $(shell $(PKG_CONFIG) --libs --cflags gtk+-2.0 2>/dev/null) $(call feature_check,gtk2) -- 2.31.1
[PATCH 3/3] skiboot: Update IMC PMU node names for power10
The nest IMC (In Memory Collection) Performance Monitoring Unit (PMU) node names are saved as "struct nest_pmus_struct" in the "hw/imc.c" IMC code. Not all the IMC PMUs listed in the device tree may be available. Nest IMC PMU names, along with their bit values, are represented in the imc availability vector. This struct is used to remove the unavailable nodes by checking this vector.

For power10, the imc_chip_avl_vector, i.e. the imc availability vector (which is a part of the IMC control block structure), has a change in the mapping of units and bit positions. Hence rename the existing nest_pmus array to nest_pmus_p9 and add an entry for power10 as nest_pmus_p10. The avl_vector also has another change in bit positions 11:34. These bit positions tell the availability of Xlink/Alink/CAPI. There are a total of 8 links, and a three-bit field combination says which link is available. The patch implements all these changes to handle nest_pmus_p10.

Signed-off-by: Athira Rajeev
---
Changelog:
v5 -> v6:
- Addressed review comment from Reza by using PPC_BIT instead of PPC_BITMASK
v4 -> v5:
- Addressed review comment from Reza and renamed dt_find_by_name_substr to dt_find_by_name_before_addr
v3 -> v4:
- Addressed review comment from Mahesh and added his Reviewed-by for patch 1.
v2 -> v3:
- After review comments from Mahesh, fixed the code to consider the string up to "@" for both the input node name and the child node name. The V2 version compared the input node name and the child node name up to the string length of the child name. But this returns the wrong node if the input name is longer than the child name, because it matches as a substring of the child name. https://lists.ozlabs.org/pipermail/skiboot/2023-January/018596.html
v1 -> v2:
- Addressed review comment from Dan to update the utility function to search and compare up to "@". Renamed it as dt_find_by_name_substr.
hw/imc.c | 196 --- 1 file changed, 186 insertions(+), 10 deletions(-) diff --git a/hw/imc.c b/hw/imc.c index 73f25dae8..9f59348ad 100644 --- a/hw/imc.c +++ b/hw/imc.c @@ -49,7 +49,7 @@ static unsigned int *htm_scom_index; * imc_chip_avl_vector(in struct imc_chip_cb, look at include/imc.h). * nest_pmus[] is an array containing all the possible nest IMC PMU node names. */ -static char const *nest_pmus[] = { +static const char *nest_pmus_p9[] = { "powerbus0", "mcs0", "mcs1", @@ -104,6 +104,67 @@ static char const *nest_pmus[] = { /* reserved bits : 51 - 63 */ }; +static const char *nest_pmus_p10[] = { + "pb", + "mcs0", + "mcs1", + "mcs2", + "mcs3", + "mcs4", + "mcs5", + "mcs6", + "mcs7", + "pec0", + "pec1", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "NA", + "phb0", + "phb1", + "phb2", + "phb3", + "phb4", + "phb5", + "ocmb0", + "ocmb1", + "ocmb2", + "ocmb3", + "ocmb4", + "ocmb5", + "ocmb6", + "ocmb7", + "ocmb8", + "ocmb9", + "ocmb10", + "ocmb11", + "ocmb12", + "ocmb13", + "ocmb14", + "ocmb15", + "nx", +}; + /* * Due to Nest HW/OCC restriction, microcode will not support individual unit * events for these nest units mcs0, mcs1 ... mcs7 in the accumulation mode. @@ -371,7 +432,7 @@ static void disable_unavailable_units(struct dt_node *dev) uint64_t avl_vec; struct imc_chip_cb *cb; struct dt_node *target; - int i; + int i, j; bool disable_all_nests = false; struct proc_chip *chip; @@ -409,14 +470,129 @@ static void disable_unavailable_units(struct dt_node *dev) avl_vec = (0xffULL) << 56; } - for (i = 0; i < ARRAY_SIZE(nest_pmus); i++) { - if (!(PPC_BITMASK(i, i) & avl_vec)) { - /* Check if the device node exists */ - target = dt_find_by_name_before_addr(dev, nest_pmus[i]); -
[PATCH 2/3] skiboot: Update IMC code to use dt_find_by_name_before_addr for checking dt nodes
The nest IMC (In Memory Collection) Performance Monitoring Unit (PMU) node names are saved in the nest_pmus[] array in the "hw/imc.c" IMC code. Not all the IMC PMUs listed in the device tree may be available. Nest IMC PMU names, along with their bit values, are represented in the imc availability vector. The nest_pmus[] array is used to remove the unavailable nodes by checking this vector.

To check node availability, the code was using "dt_find_by_name". But since the node names have a format like "name@offset", dt_find_by_name doesn't return the expected result. Fix this by using dt_find_by_name_before_addr. Also, update the char array to use the correct node names.

Signed-off-by: Athira Rajeev
---
Changelog:
v4 -> v5:
- Addressed review comment from Reza and renamed dt_find_by_name_substr to dt_find_by_name_before_addr
v3 -> v4:
- Addressed review comment from Mahesh and added his Reviewed-by for patch 1.
v2 -> v3:
- After review comments from Mahesh, fixed the code to consider the string up to "@" for both the input node name and the child node name. The V2 version compared the input node name and the child node name up to the string length of the child name. But this returns the wrong node if the input name is longer than the child name, because it matches as a substring of the child name. https://lists.ozlabs.org/pipermail/skiboot/2023-January/018596.html
v1 -> v2:
- Addressed review comment from Dan to update the utility function to search and compare up to "@". Renamed it as dt_find_by_name_substr.
hw/imc.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/hw/imc.c b/hw/imc.c index 97e0809f0..73f25dae8 100644 --- a/hw/imc.c +++ b/hw/imc.c @@ -67,14 +67,14 @@ static char const *nest_pmus[] = { "mba5", "mba6", "mba7", - "cen0", - "cen1", - "cen2", - "cen3", - "cen4", - "cen5", - "cen6", - "cen7", + "centaur0", + "centaur1", + "centaur2", + "centaur3", + "centaur4", + "centaur5", + "centaur6", + "centaur7", "xlink0", "xlink1", "xlink2", @@ -412,7 +412,7 @@ static void disable_unavailable_units(struct dt_node *dev) for (i = 0; i < ARRAY_SIZE(nest_pmus); i++) { if (!(PPC_BITMASK(i, i) & avl_vec)) { /* Check if the device node exists */ - target = dt_find_by_name(dev, nest_pmus[i]); + target = dt_find_by_name_before_addr(dev, nest_pmus[i]); if (!target) continue; /* Remove the device node */ -- 2.31.1
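The availability test in the loop above checks one bit of the vector per array index. The sketch below shows that check on its own. It assumes IBM (MSB-0) bit numbering, where bit 0 is the most significant bit of the 64-bit word; `IBM_BIT()` is a hypothetical local macro standing in for skiboot's PPC_BIT(), and `unit_available()` is an illustrative name, not a skiboot function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for PPC_BIT(): bit 0 is the most
 * significant bit of a 64-bit word. Valid for n in 0..63.
 */
#define IBM_BIT(n)	(0x8000000000000000ULL >> (n))

/* True if the unit at index idx is marked available in avl_vec. */
static bool unit_available(uint64_t avl_vec, unsigned int idx)
{
	return (avl_vec & IBM_BIT(idx)) != 0;
}
```

Under this numbering, a vector of (0xffULL) << 56 marks indexes 0 to 7 as available and everything after them as unavailable, so the array index of a PMU name maps directly to its bit position.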
[PATCH 1/3] core/device: Add function to return child node using name at substring "@"
Add a function dt_find_by_name_before_addr() that returns the child node whose name, up to the first occurrence of "@", matches the given name; otherwise it returns NULL. This is helpful for cases with node names like "name@addr". In scenarios where nodes are added in the "name@addr" format and the value of "addr" is not known, the node can't be matched by full node name or by addr. Hence matching on the part of the node name before "@" returns the expected result. The patch adds the dt_find_by_name_before_addr() function and a testcase for it in core/test/run-device.c

Signed-off-by: Athira Rajeev
Reviewed-by: Mahesh Salgaonkar
---
Changelog:
v5 -> v6:
- Addressed review comment from Reza. Instead of using a new variable for "node", use the node "name" as-is since the utility is to check the name before addr. Updated the test/run-device.c accordingly
v4 -> v5:
- Addressed review comment from Reza and renamed dt_find_by_name_substr to dt_find_by_name_before_addr
v3 -> v4:
- Addressed review comment from Mahesh and added his Reviewed-by.
v2 -> v3:
- After review comments from Mahesh, fixed the code to consider the string up to "@" for both the input node name and the child node name. The V2 version compared the input node name and the child node name up to the string length of the child name. But this returns the wrong node if the input name is longer than the child name, because it matches as a substring of the child name. https://lists.ozlabs.org/pipermail/skiboot/2023-January/018596.html
v1 -> v2:
- Addressed review comment from Dan to update the utility function to search and compare up to "@". Renamed it as dt_find_by_name_substr.
 core/device.c          | 25 +++++++++++++++++++++++++
 core/test/run-device.c | 14 ++++++++++++++
 include/device.h       |  3 +++
 3 files changed, 42 insertions(+)

diff --git a/core/device.c b/core/device.c
index 2de37c741..c22b6b3c3 100644
--- a/core/device.c
+++ b/core/device.c
@@ -395,6 +395,31 @@ struct dt_node *dt_find_by_name(struct dt_node *root, const char *name)
 }
 
 
+struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char *name)
+{
+	struct dt_node *child, *match;
+	char *child_node = NULL;
+
+	list_for_each(&root->children, child, list) {
+		child_node = strdup(child->name);
+		if (!child_node)
+			goto err;
+		child_node = strtok(child_node, "@");
+		if (!strcmp(child_node, name)) {
+			free(child_node);
+			return child;
+		}
+
+		match = dt_find_by_name_before_addr(child, name);
+		if (match)
+			return match;
+	}
+
+	free(child_node);
+err:
+	return NULL;
+}
+
 struct dt_node *dt_new_check(struct dt_node *parent, const char *name)
 {
 	struct dt_node *node = dt_find_by_name(parent, name);
diff --git a/core/test/run-device.c b/core/test/run-device.c
index 4a12382bb..fb7a7d2c0 100644
--- a/core/test/run-device.c
+++ b/core/test/run-device.c
@@ -466,6 +466,20 @@ int main(void)
 	new_prop_ph = dt_prop_get_u32(ut2, "something");
 	assert(!(new_prop_ph == ev1_ph));
 	dt_free(subtree);
+
+	/* Test dt_find_by_name_before_addr */
+	root = dt_new_root("");
+	addr1 = dt_new_addr(root, "node", 0x1);
+	addr2 = dt_new_addr(root, "node0_1", 0x2);
+	assert(dt_find_by_name(root, "node@1") == addr1);
+	assert(dt_find_by_name(root, "node0_1@2") == addr2);
+	assert(dt_find_by_name_before_addr(root, "node") == addr1);
+	assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
+	assert(dt_find_by_name_before_addr(root, "node0_1") == addr2);
+	assert(dt_find_by_name_before_addr(root, "node0") == NULL);
+	assert(dt_find_by_name_before_addr(root, "node0_") == NULL);
+	dt_free(root);
+
 	return 0;
 }
diff --git a/include/device.h b/include/device.h
index 93fb90ff4..f2402cc4d 100644
--- a/include/device.h
+++ b/include/device.h
@@ -184,6 +184,9 @@ struct dt_node *dt_find_by_path(struct dt_node *root, const char *path);
 /* Find a child node by name */
 struct dt_node *dt_find_by_name(struct dt_node *root, const char *name);
 
+/* Find a child node by name and substring */
+struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char *name);
+
 /* Find a node by phandle */
 struct dt_node *dt_find_by_phandle(struct dt_node *root, u32 phandle);
 
-- 
2.31.1
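The matching rule this patch describes (compare the node name only up to the first "@") can also be written without strdup()/strtok(), which avoids having to free a copy on every loop iteration. The following is only a sketch of that rule in isolation, not the code from the patch: name_matches_before_addr() is a hypothetical helper that does not exist in skiboot, and it works on plain strings rather than struct dt_node.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of skiboot): returns true when the part of
 * node_name before the first '@' is exactly equal to name.
 */
bool name_matches_before_addr(const char *node_name, const char *name)
{
	size_t len;

	assert(node_name && name);

	/* strcspn() yields the length of node_name up to '@' or end of string. */
	len = strcspn(node_name, "@");

	/* The lengths must agree, otherwise "node0" would match "node0_1@2". */
	return strlen(name) == len && strncmp(node_name, name, len) == 0;
}
```

With this rule "node" matches "node@1" and "node0_1" matches "node0_1@2", while "node0" and "node0_" match neither, which is the behaviour the asserts in core/test/run-device.c expect.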
Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events
> On 08-Sep-2023, at 7:48 PM, Athira Rajeev wrote:
>
>> On 08-Sep-2023, at 11:04 AM, Sachin Sant wrote:
>>
>>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev wrote:
>>>
>>> Testcase "Parsing of all PMU events from sysfs" parses events for
>>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
>>> environment supports events from hv_24x7 and hv_gpci PMU which
>>> is of example format like below:
>>>
>>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
>>> - hv_gpci/event,partition_id=?/
>>>
>>> The value for "?" needs to be filled in depending on system
>>> configuration. It is better to skip these parametrized events
>>> in this test as it is done in:
>>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
>>> parametrized events")' which handled a similar instance with
>>> "all PMU test".
>>>
>>> Fix parse-events test to skip parametrized events since
>>> it needs proper setup of the parameters.
>>>
>>> Signed-off-by: Athira Rajeev
>>> ---
>>> Changelog:
>>> v1 -> v2:
>>> Addressed review comments from Ian. Updated size of
>>> pmu event name variable and changed bool name which is
>>> used to skip the test.
>>>
>>
>> The patch fixes the reported issue.
>>
>> 6.2: Parsing of all PMU events from sysfs : Ok
>> 6.3: Parsing of given PMU events from sysfs: Ok
>>
>> Tested-by: Sachin Sant
>>
>> - Sachin
>
> Hi Sachin, Ian
>
> Thanks for testing the patch

Hi Arnaldo

Can you please check and pull this if it looks good to go.

Thanks
Athira

>
> Athira
Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"
> On 08-Sep-2023, at 7:47 PM, Athira Rajeev wrote:
>
>> On 08-Sep-2023, at 5:20 AM, Ian Rogers wrote:
>>
>> On Thu, Sep 7, 2023 at 10:17 AM Athira Rajeev wrote:
>>>
>>> From: root
>>>
>>> shellcheck was run on perf tool shell scripts as a pre-requisite
>>> to include a build option for shellcheck discussed here:
>>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>>
>>> And fixes were added for the coding/formatting issues in
>>> two patchsets:
>>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>>>
>>> Three additional issues are observed with shellcheck "0.6" and
>>> this patchset covers those. With this patchset,
>>>
>>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
>>> # echo $?
>>> 0
>>>
>>> Athira Rajeev (3):
>>>  tests/shell: Fix shellcheck SC1090 to handle the location of sourced
>>>    files
>>>  tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
>>>    tetscase
>>>  tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
>>
>> Series:
>> Tested-by: Ian Rogers
>>
>> Thanks,
>> Ian
>
> Thanks Ian for checking the patch series
>
> Athira

Hi Arnaldo

Can you please check and pull this if it looks good to go.

Thanks
Athira

>>
>>> tools/perf/tests/shell/coresight/asm_pure_loop.sh            | 4 ++++
>>> tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh     | 4 ++++
>>> tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 ++++
>>> tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 ++++
>>> tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh    | 4 ++++
>>> tools/perf/tests/shell/probe_vfs_getname.sh                  | 2 ++
>>> tools/perf/tests/shell/record+probe_libc_inet_pton.sh        | 2 ++
>>> tools/perf/tests/shell/record+script_probe_vfs_getname.sh    | 2 ++
>>> tools/perf/tests/shell/record.sh                             | 1 +
>>> tools/perf/tests/shell/stat+csv_output.sh                    | 1 +
>>> tools/perf/tests/shell/stat+csv_summary.sh                   | 4 ++--
>>> tools/perf/tests/shell/stat+shadow_stat.sh                   | 4 ++--
>>> tools/perf/tests/shell/stat+std_output.sh                    | 1 +
>>> tools/perf/tests/shell/test_intel_pt.sh                      | 1 +
>>> tools/perf/tests/shell/trace+probe_vfs_getname.sh            | 1 +
>>> 15 files changed, 35 insertions(+), 4 deletions(-)
>>>
>>> --
>>> 2.31.1
[PATCH V3] tools/perf: Add includes for detected configs in Makefile.perf
Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
for libtraceevent is used to set PYTHON_EXT_SRCS:

    ifeq ($(CONFIG_LIBTRACEEVENT),y)
      PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
    else
      PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
    endif

But this does not pick up the value of CONFIG_LIBTRACEEVENT that is
set using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system detected
configuration in the CONFIG checks. This will also fix issues that
could arise when other "CONFIG_*" checks are added to Makefile.perf
in the future.

Signed-off-by: Athira Rajeev
---
Changelog:
v2 -> v3:
Added -include since in some cases make clean or make will fail
when config is not included and the config-detected file is not
present.

v1 -> v2:
Added $(OUTPUT) prefix to config-detected as pointed out by Ian

 tools/perf/Makefile.perf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 37af6df7b978..f6fdc2d5a92f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
 
+# Use the detected configuration
+-include $(OUTPUT).config-detected
+
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
 else
-- 
2.31.1
Re: [PATCH V2] tools/perf: Add includes for detected configs in Makefile.perf
> On 08-Sep-2023, at 9:45 PM, Ian Rogers wrote:
>
> On Fri, Sep 8, 2023 at 7:51 AM Athira Rajeev wrote:
>>
>> Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
>> for libtraceevent is used to set PYTHON_EXT_SRCS
>>
>>    ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>      PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>>    else
>>      PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
>>    endif
>>
>> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
>> set using the settings in Makefile.config. Include the file
>> ".config-detected" so that make will use the system detected
>> configuration in the CONFIG checks. This will fix issues that
>> could arise when other "CONFIG_*" checks are added to Makefile.perf
>> in future as well.
>>
>> Signed-off-by: Athira Rajeev
>> ---
>> Changelog:
>> v1 -> v2:
>> Added $(OUTPUT) prefix to config-detected as pointed
>> out by Ian
>>
>> tools/perf/Makefile.perf | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index 37af6df7b978..66b9dc61c32f 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>>
>> python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
>>
>> +# Use the detected configuration
>> +include $(OUTPUT).config-detected
>
> The Makefile.build version also has a "-include" rather than "include"
> in case the .config-detected file is missing. In Makefile.perf
> including Makefile.config is optional:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n253
>
> and there are certain targets where we don't include it:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n200
>
> So playing devil's advocate, if we ran "make clean" we'd remove
> .config-detected:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/Makefile.perf?h=perf-tools-next#n1131
>
> If we then ran "make tags" then we wouldn't include Makefile.config
> and so .config-detected wouldn't be generated and I think the build
> would fail due to a missing include here. So I think this should be
> -include or perhaps:

Hi Ian

Thanks for checking in detail. Yes, make clean in perf fails with just "include":

# make clean
Makefile.perf:355: .config-detected: No such file or directory
make[1]: *** No rule to make target '.config-detected'. Stop.
make: *** [Makefile:90: clean] Error 2

The change below will be correct, as you pointed out:

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 66b9dc61c32f..f6fdc2d5a92f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -352,7 +352,7 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
 
 # Use the detected configuration
-include $(OUTPUT).config-detected
+-include $(OUTPUT).config-detected
 
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)

With this change I could also test that the file is included when it is
present and that the detected configs are picked up correctly.

Adding this change in V3

Thanks
Athira

>
> ifeq ($(config),1)
> include $(OUTPUT).config-detected
> endif
>
> Thanks,
> Ian
>
>> +
>> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>> else
>> --
>> 2.31.1
Re: [PATCH V5 3/3] skiboot: Update IMC PMU node names for power10
> On 10-Aug-2023, at 3:28 AM, Reza Arbab wrote:
>
> On Mon, Jul 17, 2023 at 08:54:31AM +0530, Athira Rajeev wrote:
>> @@ -408,14 +469,129 @@ static void disable_unavailable_units(struct dt_node *dev)
>>  		avl_vec = (0xffULL) << 56;
>>  	}
>>
>> -	for (i = 0; i < ARRAY_SIZE(nest_pmus); i++) {
>> -		if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> -			/* Check if the device node exists */
>> -			target = dt_find_by_name_before_addr(dev, nest_pmus[i]);
>> -			if (!target)
>> -				continue;
>> -			/* Remove the device node */
>> -			dt_free(target);
>> +	if (proc_gen == proc_gen_p9) {
>> +		for (i = 0; i < ARRAY_SIZE(nest_pmus_p9); i++) {
>> +			if (!(PPC_BITMASK(i, i) & avl_vec)) {
>
> I think all these PPC_BITMASK(i, i) can be changed to PPC_BIT(i).

Hi Reza,

Thanks for reviewing the changes.
Yes. I will add the change in next version

Thanks
Athira

>
>> +				/* Check if the device node exists */
>> +				target = dt_find_by_name_before_addr(dev, nest_pmus_p9[i]);
>> +				if (!target)
>> +					continue;
>> +				/* Remove the device node */
>> +				dt_free(target);
>> +			}
>> +		}
>> +	} else if (proc_gen == proc_gen_p10) {
>> +		int val;
>> +		char name[8];
>> +
>> +		for (i = 0; i < 11; i++) {
>> +			if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> +				/* Check if the device node exists */
>> +				target = dt_find_by_name_before_addr(dev, nest_pmus_p10[i]);
>> +				if (!target)
>> +					continue;
>> +				/* Remove the device node */
>> +				dt_free(target);
>> +			}
>> +		}
>> +
>> +		for (i = 35; i < 41; i++) {
>> +			if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> +				/* Check if the device node exists for phb */
>> +				for (j = 0; j < 3; j++) {
>> +					snprintf(name, sizeof(name), "phb%d_%d", (i-35), j);
>> +					target = dt_find_by_name_before_addr(dev, name);
>> +					if (!target)
>> +						continue;
>> +					/* Remove the device node */
>> +					dt_free(target);
>> +				}
>> +			}
>> +		}
>> +
>> +		for (i = 41; i < 58; i++) {
>> +			if (!(PPC_BITMASK(i, i) & avl_vec)) {
>> +				/* Check if the device node exists */
>> +				target = dt_find_by_name_before_addr(dev, nest_pmus_p10[i]);
>> +				if (!target)
>> +					continue;
>> +				/* Remove the device node */
>> +				dt_free(target);
>> +			}
>> +		}
>
> --
> Reza Arbab
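Reza's suggestion relies on PPC_BITMASK(i, i) and PPC_BIT(i) producing the same value. A minimal sketch of why that holds, assuming IBM bit numbering (bit 0 is the most significant bit) and macro definitions of the usual form; the SKETCH_ macros and both function names below are local to this example and are not taken from the skiboot headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed definitions, IBM bit numbering: bit 0 is the most significant bit. */
#define SKETCH_PPC_BIT(bit)		(0x8000000000000000ULL >> (bit))
#define SKETCH_PPC_BITMASK(bs, be)	\
	((SKETCH_PPC_BIT(bs) - SKETCH_PPC_BIT(be)) | SKETCH_PPC_BIT(bs))

/* Returns true when the one-bit mask equals the single bit for all 64 positions. */
bool single_bit_mask_equals_bit(void)
{
	for (unsigned int i = 0; i < 64; i++) {
		if (SKETCH_PPC_BITMASK(i, i) != SKETCH_PPC_BIT(i))
			return false;
	}
	return true;
}

/* Mirrors the check in the quoted loop: is unit i absent from avl_vec? */
bool unit_unavailable(uint64_t avl_vec, unsigned int i)
{
	assert(i < 64);
	return !(SKETCH_PPC_BIT(i) & avl_vec);
}
```

With start and end equal, the subtraction in the mask macro is zero and only the OR'ed single bit remains, so under these assumed definitions the two forms are interchangeable in the loops above.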
Re: [PATCH V5 1/3] core/device: Add function to return child node using name at substring "@"
> On 10-Aug-2023, at 3:21 AM, Reza Arbab wrote:
>
> Hi Athira,
>
> I still have a couple of the same questions I asked in v4.
>
> On Mon, Jul 17, 2023 at 08:54:29AM +0530, Athira Rajeev wrote:
>> Add a function dt_find_by_name_before_addr() that returns the child node if
>> it matches till first occurrence at "@" of a given name, otherwise NULL.
>
> Given this summary, I don't understand the following:
>
>> +	assert(dt_find_by_name(root, "node@1") == addr1);
>> +	assert(dt_find_by_name(root, "node0_1@2") == addr2);
>
> Is this behavior required? I don't think it makes sense to call this function
> with a second argument containing '@', so I wouldn't expect it to match
> anything in these cases. The function seems to specifically enable it:

Hi Reza,

Yes, makes sense. dt_find_by_name can be removed in this test since its
intention is to find device by name. I will remove these two checks.

>
>> +struct dt_node *dt_find_by_name_before_addr(struct dt_node *root, const char *name)
>> +{
> [snip]
>> +	node = strdup(name);
>> +	if (!node)
>> +		return NULL;
>> +	node = strtok(node, "@");
>
> Seems like you could get rid of this and just use name as-is.

Ok Reza

>
> I was curious about something else; say we have 'node@1' and 'node@2'. Is
> there an expectation of which it should match?
>
>    addr1 = dt_new_addr(root, "node", 0x1);
>    addr2 = dt_new_addr(root, "node", 0x2);
>    assert(dt_find_by_name_substr(root, "node") == ???);
>                                                   ^^^

In this case, dt_find_by_name_before_addr is not the right function to use.
We have other functions like dt_find_by_name_addr that can be made use of.

I will address other changes in next version

Thanks
Athira

>
> --
> Reza Arbab
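On the 'node@1' / 'node@2' question: a lookup that only compares the text before "@" cannot tell such siblings apart, so a search of this shape returns whichever matching entry it visits first. The sketch below illustrates that with a flat array of names standing in for the child list; first_match_before_addr() is a hypothetical function for this example only, and the real traversal order in skiboot depends on how the child list is built.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a child-list walk: returns the first entry whose
 * text before '@' equals name, or NULL when no entry matches.
 */
const char *first_match_before_addr(const char *const *names, size_t count,
				    const char *name)
{
	assert(names && name);

	for (size_t i = 0; i < count; i++) {
		size_t len = strcspn(names[i], "@");

		if (strlen(name) == len && strncmp(names[i], name, len) == 0)
			return names[i];	/* first match wins, later ones are never seen */
	}
	return NULL;
}
```

This is why a caller that needs a specific unit address has to use a lookup that takes the address as well, as suggested in the reply above.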
[PATCH V2] tools/perf: Add includes for detected configs in Makefile.perf
Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
for libtraceevent is used to set PYTHON_EXT_SRCS:

    ifeq ($(CONFIG_LIBTRACEEVENT),y)
      PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
    else
      PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
    endif

But this does not pick up the value of CONFIG_LIBTRACEEVENT that is
set using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system detected
configuration in the CONFIG checks. This will also fix issues that
could arise when other "CONFIG_*" checks are added to Makefile.perf
in the future.

Signed-off-by: Athira Rajeev
---
Changelog:
v1 -> v2:
Added $(OUTPUT) prefix to config-detected as pointed out by Ian

 tools/perf/Makefile.perf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 37af6df7b978..66b9dc61c32f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
 
+# Use the detected configuration
+include $(OUTPUT).config-detected
+
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
 else
-- 
2.31.1
Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events
> On 08-Sep-2023, at 11:04 AM, Sachin Sant wrote:
>
>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev wrote:
>>
>> Testcase "Parsing of all PMU events from sysfs" parses events for
>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
>> environment supports events from hv_24x7 and hv_gpci PMU which
>> is of example format like below:
>>
>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
>> - hv_gpci/event,partition_id=?/
>>
>> The value for "?" needs to be filled in depending on system
>> configuration. It is better to skip these parametrized events
>> in this test as it is done in:
>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
>> parametrized events")' which handled a similar instance with
>> "all PMU test".
>>
>> Fix parse-events test to skip parametrized events since
>> it needs proper setup of the parameters.
>>
>> Signed-off-by: Athira Rajeev
>> ---
>> Changelog:
>> v1 -> v2:
>> Addressed review comments from Ian. Updated size of
>> pmu event name variable and changed bool name which is
>> used to skip the test.
>>
>
> The patch fixes the reported issue.
>
> 6.2: Parsing of all PMU events from sysfs : Ok
> 6.3: Parsing of given PMU events from sysfs: Ok
>
> Tested-by: Sachin Sant
>
> - Sachin

Hi Sachin, Ian

Thanks for testing the patch

Athira
Re: [PATCH] tools/perf: Add includes for detected configs in Makefile.perf
> On 08-Sep-2023, at 4:41 AM, Ian Rogers wrote:
>
> On Thu, Sep 7, 2023 at 10:19 AM Athira Rajeev wrote:
>>
>> Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
>> for libtraceevent is used to set PYTHON_EXT_SRCS
>>
>>    ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>      PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>>    else
>>      PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
>>    endif
>>
>> But this is not picking the value for CONFIG_LIBTRACEEVENT that is
>> set using the settings in Makefile.config. Include the file
>> ".config-detected" so that make will use the system detected
>> configuration in the CONFIG checks. This will fix issues that
>> could arise when other "CONFIG_*" checks are added to Makefile.perf
>> in future as well.
>>
>> Signed-off-by: Athira Rajeev
>> ---
>> tools/perf/Makefile.perf | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index 37af6df7b978..6764b0e156f4 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
>>
>> python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
>>
>> +# Use the detected configuration
>> +include .config-detected
>
> Good catch! I think it should look like:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/build/Makefile.build?h=perf-tools-next#n40
>
> Thanks,
> Ian

Thanks for the review Ian.
Yes, I missed the $(OUTPUT). Will send a V2 with this change.

Athira

>
>> +
>> ifeq ($(CONFIG_LIBTRACEEVENT),y)
>>   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
>> else
>> --
>> 2.31.1
Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"
> On 08-Sep-2023, at 5:20 AM, Ian Rogers wrote:
>
> On Thu, Sep 7, 2023 at 10:17 AM Athira Rajeev wrote:
>>
>> From: root
>>
>> shellcheck was run on perf tool shell scripts as a pre-requisite
>> to include a build option for shellcheck discussed here:
>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>
>> And fixes were added for the coding/formatting issues in
>> two patchsets:
>> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
>> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
>>
>> Three additional issues are observed with shellcheck "0.6" and
>> this patchset covers those. With this patchset,
>>
>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
>> # echo $?
>> 0
>>
>> Athira Rajeev (3):
>>  tests/shell: Fix shellcheck SC1090 to handle the location of sourced
>>    files
>>  tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
>>    tetscase
>>  tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
>
> Series:
> Tested-by: Ian Rogers
>
> Thanks,
> Ian

Thanks Ian for checking the patch series

Athira

>
>> tools/perf/tests/shell/coresight/asm_pure_loop.sh            | 4 ++++
>> tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh     | 4 ++++
>> tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 ++++
>> tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 ++++
>> tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh    | 4 ++++
>> tools/perf/tests/shell/probe_vfs_getname.sh                  | 2 ++
>> tools/perf/tests/shell/record+probe_libc_inet_pton.sh        | 2 ++
>> tools/perf/tests/shell/record+script_probe_vfs_getname.sh    | 2 ++
>> tools/perf/tests/shell/record.sh                             | 1 +
>> tools/perf/tests/shell/stat+csv_output.sh                    | 1 +
>> tools/perf/tests/shell/stat+csv_summary.sh                   | 4 ++--
>> tools/perf/tests/shell/stat+shadow_stat.sh                   | 4 ++--
>> tools/perf/tests/shell/stat+std_output.sh                    | 1 +
>> tools/perf/tests/shell/test_intel_pt.sh                      | 1 +
>> tools/perf/tests/shell/trace+probe_vfs_getname.sh            | 1 +
>> 15 files changed, 35 insertions(+), 4 deletions(-)
>>
>> --
>> 2.31.1
[PATCH] tools/perf: Add includes for detected configs in Makefile.perf
Makefile.perf uses "CONFIG_*" checks in the code. For example, the config
for libtraceevent is used to set PYTHON_EXT_SRCS:

    ifeq ($(CONFIG_LIBTRACEEVENT),y)
      PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
    else
      PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
    endif

But this does not pick up the value of CONFIG_LIBTRACEEVENT that is
set using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system detected
configuration in the CONFIG checks. This will also fix issues that
could arise when other "CONFIG_*" checks are added to Makefile.perf
in the future.

Signed-off-by: Athira Rajeev
---
 tools/perf/Makefile.perf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 37af6df7b978..6764b0e156f4 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -351,6 +351,9 @@ export PYTHON_EXTBUILD_LIB PYTHON_EXTBUILD_TMP
 
 python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT)python/perf*.so
 
+# Use the detected configuration
+include .config-detected
+
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
 else
-- 
2.31.1
[PATCH 0/3] Fix for shellcheck issues with version "0.6"
From: root

shellcheck was run on perf tool shell scripts as a pre-requisite
to include a build option for shellcheck discussed here:
https://www.spinics.net/lists/linux-perf-users/msg25553.html

And fixes were added for the coding/formatting issues in
two patchsets:
https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/

Three additional issues are observed with shellcheck "0.6" and
this patchset covers those. With this patchset,

# for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
# echo $?
0

Athira Rajeev (3):
  tests/shell: Fix shellcheck SC1090 to handle the location of sourced
    files
  tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
    tetscase
  tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts

 tools/perf/tests/shell/coresight/asm_pure_loop.sh            | 4 ++++
 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh     | 4 ++++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 ++++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 ++++
 tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh    | 4 ++++
 tools/perf/tests/shell/probe_vfs_getname.sh                  | 2 ++
 tools/perf/tests/shell/record+probe_libc_inet_pton.sh        | 2 ++
 tools/perf/tests/shell/record+script_probe_vfs_getname.sh    | 2 ++
 tools/perf/tests/shell/record.sh                             | 1 +
 tools/perf/tests/shell/stat+csv_output.sh                    | 1 +
 tools/perf/tests/shell/stat+csv_summary.sh                   | 4 ++--
 tools/perf/tests/shell/stat+shadow_stat.sh                   | 4 ++--
 tools/perf/tests/shell/stat+std_output.sh                    | 1 +
 tools/perf/tests/shell/test_intel_pt.sh                      | 1 +
 tools/perf/tests/shell/trace+probe_vfs_getname.sh            | 1 +
 15 files changed, 35 insertions(+), 4 deletions(-)

-- 
2.31.1
[PATCH 3/3] tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
Running shellcheck v0.6 on some of the shell scripts throws the warning
below. Example:

 In tests/shell/coresight/asm_pure_loop.sh line 14:
 DATA="$DATD/perf-$TEST-$DATV.data"
       ^---^ SC2153: Possible misspelling: DATD may not be assigned, but DATA is.

Here, DATD is exported from "lib/coresight.sh" and this warning
can be ignored. Use "shellcheck disable=" to ignore this check.

Signed-off-by: Athira Rajeev
---
 tools/perf/tests/shell/coresight/asm_pure_loop.sh            | 1 +
 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh     | 1 +
 tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 1 +
 tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 1 +
 tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh    | 1 +
 5 files changed, 5 insertions(+)

diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop.sh b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
index 04387061e9f3..2d65defb7e0f 100755
--- a/tools/perf/tests/shell/coresight/asm_pure_loop.sh
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
@@ -11,6 +11,7 @@ TEST="asm_pure_loop"
 
 ARGS=""
 DATV="out"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 
 perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
index c17e442ac741..ddcc9bb850f5 100755
--- a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
+++ b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
@@ -11,6 +11,7 @@ TEST="memcpy_thread"
 
 ARGS="16 10 1"
 DATV="16k_10"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 
 perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
index e47c4e955d0e..2ce5e139b2fd 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
@@ -11,6 +11,7 @@ TEST="thread_loop"
 
 ARGS="10 1"
 DATV="check-tid-10th"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 STDO="$DATD/perf-$TEST-$DATV.stdout"
 
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
index 8bf94a02e384..3ad9498753d7 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
@@ -11,6 +11,7 @@ TEST="thread_loop"
 
 ARGS="2 20"
 DATV="check-tid-2th"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 STDO="$DATD/perf-$TEST-$DATV.stdout"
 
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
index 0dc9ef424233..4fbb4a29aad3 100755
--- a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
@@ -11,6 +11,7 @@ TEST="unroll_loop_thread"
 
 ARGS="10"
 DATV="10"
+# shellcheck disable=SC2153
 DATA="$DATD/perf-$TEST-$DATV.data"
 
 perf record $PERFRECOPT -o "$DATA" "$BIN" $ARGS
-- 
2.31.1
[PATCH 1/3] tests/shell: Fix shellcheck SC1090 to handle the location of sourced files
Running shellcheck on some of the shell scripts throws the error below:

In tests/shell/coresight/unroll_loop_thread_10.sh line 8:
. "$(dirname $0)"/../lib/coresight.sh
  ^-- SC1090: Can't follow non-constant source. Use a directive to specify location.

This happens with shellcheck version "0.6.0".

Fix the SC1090 warning by using the "shellcheck source=" directive to
specify the location of the sourced files.

Signed-off-by: Athira Rajeev
---
 tools/perf/tests/shell/coresight/asm_pure_loop.sh            | 3 +++
 tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh     | 3 +++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 3 +++
 tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 3 +++
 tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh    | 3 +++
 tools/perf/tests/shell/probe_vfs_getname.sh                  | 2 ++
 tools/perf/tests/shell/record+probe_libc_inet_pton.sh        | 2 ++
 tools/perf/tests/shell/record+script_probe_vfs_getname.sh    | 2 ++
 tools/perf/tests/shell/record.sh                             | 1 +
 tools/perf/tests/shell/stat+csv_output.sh                    | 1 +
 tools/perf/tests/shell/stat+std_output.sh                    | 1 +
 tools/perf/tests/shell/test_intel_pt.sh                      | 1 +
 tools/perf/tests/shell/trace+probe_vfs_getname.sh            | 1 +
 13 files changed, 26 insertions(+)

diff --git a/tools/perf/tests/shell/coresight/asm_pure_loop.sh b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
index 779bc8608e1e..04387061e9f3 100755
--- a/tools/perf/tests/shell/coresight/asm_pure_loop.sh
+++ b/tools/perf/tests/shell/coresight/asm_pure_loop.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="asm_pure_loop"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS=""
 DATV="out"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
index 08a44e52ce9b..c17e442ac741 100755
--- a/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
+++ b/tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="memcpy_thread"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="16 10 1"
 DATV="16k_10"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
index c83a200dede4..e47c4e955d0e 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="thread_loop"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="10 1"
 DATV="check-tid-10th"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
index 6346fd5e87c8..8bf94a02e384 100755
--- a/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
+++ b/tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="thread_loop"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="2 20"
 DATV="check-tid-2th"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
index 7304e3d3a6ff..0dc9ef424233 100755
--- a/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
+++ b/tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh
@@ -5,7 +5,10 @@
 # Carsten Haitzler , 2021
 
 TEST="unroll_loop_thread"
+
+# shellcheck source=../lib/coresight.sh
 . "$(dirname $0)"/../lib/coresight.sh
+
 ARGS="10"
 DATV="10"
 DATA="$DATD/perf-$TEST-$DATV.data"
diff --git a/tools/perf/tests/shell/probe_vfs_getname.sh b/tools/perf/tests/shell/probe_vfs_getname.sh
index 871243d6d03a..554e12e83c55 100755
--- a/tools/perf/tests/shell/probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/probe_vfs_getname.sh
@@ -4,10 +4,12 @@
 # SPDX-License-Identifier: GPL-2.0
 # Arnaldo Carvalho de Melo , 2017
 
+# shellcheck source=lib/probe.sh
 . "$(dirname $0)"/lib/probe.sh
 
 skip_if_no_perf_probe || exit 2
 
+# shellcheck source=lib/probe_vfs_getname.sh
 . "$(dirname $0)"/lib/probe_vfs_getname.sh
 
 add_probe_vfs_getname || skip_if_no_debuginfo
diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
index 89214a6d995
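For readers unfamiliar with the directive, here is a minimal standalone sketch of the pattern the patch applies. It is not taken from the perf tree: the temporary directory, `lib/helpers.sh` and `greet` are hypothetical names used only to make the example self-contained. The comment line is read by shellcheck as a hint about which file is being sourced; the shell itself ignores it.

```shell
#!/bin/sh
# Hypothetical setup: create a throwaway library file so the example runs
# on its own. None of these names exist in tools/perf.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/lib"
printf 'greet() { echo "hello from lib"; }\n' > "$demo_dir/lib/helpers.sh"

# The path below is computed at run time, which is what makes shellcheck
# report SC1090. The directive comment names the file for shellcheck;
# to the shell it is an ordinary comment.
# shellcheck source=lib/helpers.sh
. "$demo_dir"/lib/helpers.sh

# Call a function defined in the sourced file.
result=$(greet)
echo "$result"

# Clean up the throwaway directory.
rm -rf "$demo_dir"
```

The directive has to sit directly above the `.` command it annotates, which is why the patch adds one comment per sourced file.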
[PATCH 2/3] tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh testcase
Running shellcheck on stat+shadow_stat.sh generates the warning below:

In tests/shell/stat+csv_summary.sh line 26:
while read _num _event _run _pct
^--^ SC2034: _num appears unused. Verify use (or export if used externally).
^^ SC2034: _event appears unused. Verify use (or export if used externally).
^--^ SC2034: _run appears unused. Verify use (or export if used externally).
^--^ SC2034: _pct appears unused. Verify use (or export if used externally).

These variables are intentionally unused; they are needed only to parse
through the output. commit used "_" as a prefix for these throwaway
variables, but this still triggers the warning with shellcheck v0.6.
Fix this by using only "_" instead of a prefixed variable name.

Signed-off-by: Athira Rajeev
---
 tools/perf/tests/shell/stat+csv_summary.sh | 4 ++--
 tools/perf/tests/shell/stat+shadow_stat.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/shell/stat+csv_summary.sh b/tools/perf/tests/shell/stat+csv_summary.sh
index 8bae9c8a835e..323123ff4d19 100755
--- a/tools/perf/tests/shell/stat+csv_summary.sh
+++ b/tools/perf/tests/shell/stat+csv_summary.sh
@@ -10,7 +10,7 @@ set -e
 #
 perf stat -e cycles -x' ' -I1000 --interval-count 1 --summary 2>&1 | \
 grep -e summary | \
-while read summary _num _event _run _pct
+while read summary _ _ _ _
 do
 	if [ $summary != "summary" ]; then
 		exit 1
@@ -23,7 +23,7 @@ done
 #
 perf stat -e cycles -x' ' -I1000 --interval-count 1 --summary --no-csv-summary 2>&1 | \
 grep -e summary | \
-while read _num _event _run _pct
+while read _ _ _ _
 do
 	exit 1
 done
diff --git a/tools/perf/tests/shell/stat+shadow_stat.sh b/tools/perf/tests/shell/stat+shadow_stat.sh
index a1918a15e36a..386821462f3c 100755
--- a/tools/perf/tests/shell/stat+shadow_stat.sh
+++ b/tools/perf/tests/shell/stat+shadow_stat.sh
@@ -14,7 +14,7 @@ test_global_aggr()
 {
 	perf stat -a --no-big-num -e cycles,instructions sleep 1 2>&1 | \
 	grep -e cycles -e instructions | \
-	while read num evt _hash ipc rest
+	while read num evt _ ipc rest
 	do
 		# skip not counted events
 		if [ "$num" = "&1 | \
 	grep ^CPU | \
-	while read cpu num evt _hash ipc rest
+	while read cpu num evt _ ipc rest
 	do
 		# skip not counted events
 		if [ "$num" = "
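As an illustration of the "_" placeholder described in the commit message, the following is a small sketch that runs without perf. The input line is made up to resemble the "summary" row shape, and `line` and `first` are hypothetical names; only the `while read summary _ _ _ _` form comes from the patch.

```shell
#!/bin/sh
# Made-up sample in the shape "summary <num> <event> <run> <pct>".
line="summary 9224197 cycles 8012885033 100.00"

# Only the first field is of interest. Each "_" absorbs one field that
# must be consumed to split the line but is never used afterwards.
first=$(echo "$line" | while read -r summary _ _ _ _
do
	echo "$summary"
done)

echo "$first"
```

Reusing the same "_" name for every discarded field keeps the field positions intact while leaving no named variable behind for SC2034 to flag.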