On 06/05/2024 at 14:19, Athira Rajeev wrote:
> Add support to capture and parse raw instruction in objdump.

What's the purpose of using 'objdump' for reading raw instructions? Can't they be read directly without invoking 'objdump'? It looks odd to me to use objdump to provide readable text and then parse it back.

> Currently, the perf tool infrastructure uses "--no-show-raw-insn" option
> with "objdump" while disassemble. Example from powerpc with this option
> for an instruction address is:

Yes, and that makes sense because the purpose of objdump is to provide human-readable annotations, not to perform automated analysis. Am I missing something?

>
> Snippet from:
> objdump --start-address=<address> --stop-address=<address> -d
> --no-show-raw-insn -C <vmlinux>
>
> c0000000010224b4: lwz r10,0(r9)
>
> This line "lwz r10,0(r9)" is parsed to extract instruction name,
> registers names and offset. Also to find whether there is a memory
> reference in the operands, "memory_ref_char" field of objdump is used.
> For x86, "(" is used as memory_ref_char to tackle instructions of the
> form "mov (%rax), %rcx".
>
> In case of powerpc, not all instructions using "(" are the only memory
> instructions. Example, above instruction can also be of extended form (X
> form) "lwzx r10,0,r19". Inorder to easy identify the instruction category
> and extract the source/target registers, patch adds support to use raw
> instruction. With raw instruction, macros are added to extract opcode
> and register fields.
>
> "struct ins_operands" and "struct ins" is updated to carry opcode and
> raw instruction binary code (raw_insn). Function "disasm_line__parse"
> is updated to fill the raw instruction hex value and opcode in newly
> added fields. There is no changes in existing code paths, which parses
> the disassembled code. The architecture using the instruction name and
> present approach is not altered. Since this approach targets powerpc,
> the macro implementation is added for powerpc as of now.
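
To make the D-form versus X-form point in the quoted paragraph concrete, here is a minimal sketch of opcode-based classification. The helper names are hypothetical and are not the macros from the patch. It assumes the 32-bit instruction word is already available as a host integer. The opcode values (primary opcode 32 for lwz, primary opcode 31 with extended opcode 23 for lwzx) are taken from the Power ISA, not from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the macros added by the patch. */

/* Primary opcode: bits 0-5 in IBM bit numbering, i.e. the top 6 bits. */
static inline uint32_t ppc_primary_opcode(uint32_t insn)
{
	return insn >> 26;
}

/* Register fields shared by the D, DS and X forms. */
static inline uint32_t ppc_rt(uint32_t insn) { return (insn >> 21) & 0x1f; }
static inline uint32_t ppc_ra(uint32_t insn) { return (insn >> 16) & 0x1f; }

/* RB exists only in X-form instructions. */
static inline uint32_t ppc_rb(uint32_t insn) { return (insn >> 11) & 0x1f; }

/* X-form extended opcode: bits 21-30 in IBM bit numbering. */
static inline uint32_t ppc_x_form_xo(uint32_t insn)
{
	return (insn >> 1) & 0x3ff;
}

/* lwz RT,D(RA): D-form, primary opcode 32. */
static inline bool ppc_is_lwz(uint32_t insn)
{
	return ppc_primary_opcode(insn) == 32;
}

/* lwzx RT,RA,RB: X-form, primary opcode 31, extended opcode 23. */
static inline bool ppc_is_lwzx(uint32_t insn)
{
	return ppc_primary_opcode(insn) == 31 && ppc_x_form_xo(insn) == 23;
}
```

With hand-encoded words for the two quoted instructions (0x81490000 for "lwz r10,0(r9)" and 0x7d40982e for "lwzx r10,0,r19"), both helpers identify a load word without looking for "(" in the operand text.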
>
> Example:
> representation using --show-raw-insn in objdump gives result:
>
> 38 01 81 e8 ld r4,312(r1)
>
> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
> this translates to instruction form: "ld RT,DS(RA)" and binary code
> as:
> _____________________________________
> | 58 |  RT  |  RA |      DS       | |
> -------------------------------------
> 0    6     11    16              30 31
>
> Function "disasm_line__parse" is updated to capture:
>
> line: 38 01 81 e8 ld r4,312(r1)
> opcode and raw instruction "38 01 81 e8"
> Raw instruction is used later to extract the reg/offset fields.
>
> Signed-off-by: Athira Rajeev <atraj...@linux.vnet.ibm.com>
> ---
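
For reference, the capture step in the quoted example could look roughly like the sketch below. It is not the patch's disasm_line__parse change, and the function names are hypothetical. It assumes the address prefix has already been stripped from the line and that objdump printed the four bytes in little-endian memory order, as in the "38 01 81 e8" example. A big-endian dump would need the bytes assembled in the opposite order.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read four hex bytes from the start of an objdump
 * line and assemble them into a 32-bit word, treating the first byte as
 * the least significant one (little-endian dump assumed).
 * Returns 0 on success, -1 if the line does not start with four hex bytes.
 */
static int parse_raw_insn_le(const char *line, uint32_t *insn)
{
	unsigned int b0, b1, b2, b3;

	if (sscanf(line, "%2x %2x %2x %2x", &b0, &b1, &b2, &b3) != 4)
		return -1;

	*insn = (uint32_t)b0 | ((uint32_t)b1 << 8) |
		((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
	return 0;
}

/*
 * DS-form displacement: the DS field occupies bits 16-29 (IBM numbering)
 * and is shifted left by two, so masking off the low two bits of the
 * lower halfword yields the byte offset, which is then sign-extended.
 */
static int32_t ppc_ds_offset(uint32_t insn)
{
	int32_t off = (int32_t)(insn & 0xfffc);

	if (off & 0x8000)
		off -= 0x10000;
	return off;
}
```

For the line "38 01 81 e8 ld r4,312(r1)" this gives the word 0xe8810138, whose top six bits are 58 and whose DS-form displacement is 312, matching the disassembly.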