Hello, I am currently studying how the O3 CPU works. I have enabled the O3CPUAll, BTB, and Branch debug flags.
```
Disassembly of section .text:

0000000000010000 <.text>:
   10000: f3 0f 1e fa             endbr64
   10004: 55                      push   rbp
   10005: 48 89 e5                mov    rbp,rsp
   10008: c7 45 fc 00 00 00 00    mov    DWORD PTR [rbp-0x4],0x0
   1000f: 8b 05 8b 00 00 00       mov    eax,DWORD PTR [rip+0x8b]        # 0x100a0
   10015: 89 45 fc                mov    DWORD PTR [rbp-0x4],eax
   10018: 8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
   1001b: 0f b6 c0                movzx  eax,al
   1001e: 48 98                   cdqe
   10020: 48 8d 14 85 00 00 00    lea    rdx,[rax*4+0x0]
   10027: 00
   10028: 48 8d 05 71 00 02 00    lea    rax,[rip+0x20071]        # 0x300a0
   1002f: c7 04 02 01 00 00 00    mov    DWORD PTR [rdx+rax*1],0x1
   10036: 90                      nop
   10037: 5d                      pop    rbp
   10038: c3                      ret
   10039: 0f 1f 80 00 00 00 00    nop    DWORD PTR [rax+0x0]
   10040: bc 00 00 01 00          mov    esp,0x10000
   10045: e8 b6 ff ff ff          call   0x10000
   1004a: eb fe                   jmp    0x1004a

Disassembly of section .eh_frame:

0000000000010050 <.eh_frame>:
   10050: 14 00                   adc    al,0x0
   10052: 00 00                   add    BYTE PTR [rax],al
   10054: 00 00                   add    BYTE PTR [rax],al
   10056: 00 00                   add    BYTE PTR [rax],al
   10058: 01 7a 52                add    DWORD PTR [rdx+0x52],edi
   1005b: 00 01                   add    BYTE PTR [rcx],al
   1005d: 78 10                   js     0x1006f
```

In the code above, the program entry point is 0x10040, so the jmp at 0x1004a should be executed repeatedly and the instructions after it should never be executed. I am not ruling out transient execution, but I would expect at most a few instructions after 0x1004a to be executed, and only once.

What puzzles me is that, although I have been running the simulation for a long time, the instruction at the infinite-loop location has been predicted only 7 times. The conditional instructions after 0x1004a, on the other hand, have been predicted many times, and if the simulation ran indefinitely I would expect that count to keep growing without bound.

I am certain that my configuration uses the default cache size, which is more than enough to hold these instructions. However, I noticed that there is still an ICache miss after the "ret" instruction at 0x10038.
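As a sanity check on the control flow I described, here is a minimal Python sketch (not part of my gem5 setup) that recomputes the relative branch targets from the raw bytes in the listing. The helper names `sign_extend` and `rel_branch_target` are made up for illustration; only the addresses and byte values come from the disassembly.

```python
# Hypothetical helpers; only the addresses and encodings come from the listing.

def sign_extend(value: int, bits: int) -> int:
    """Interpret an unsigned integer as a two's-complement value of `bits` width."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def rel_branch_target(addr: int, encoding: bytes, disp_size: int) -> int:
    """Target of an x86 relative branch.

    The signed displacement is stored little-endian in the last `disp_size`
    bytes and is added to the address of the *next* instruction.
    """
    disp = int.from_bytes(encoding[-disp_size:], "little")
    next_ip = addr + len(encoding)
    return next_ip + sign_extend(disp, 8 * disp_size)


# call rel32 at 0x10045, jmp rel8 at 0x1004a, js rel8 at 0x1005d (.eh_frame bytes)
call_target = rel_branch_target(0x10045, bytes.fromhex("e8b6ffffff"), 4)
jmp_target = rel_branch_target(0x1004A, bytes.fromhex("ebfe"), 1)
js_target = rel_branch_target(0x1005D, bytes.fromhex("7810"), 1)

print(hex(call_target))  # 0x10000
print(hex(jmp_target))   # 0x1004a
print(hex(js_target))    # 0x1006f
```

All three values match the objdump output, so statically the jmp at 0x1004a targets itself, and the js at 0x1005d is just .eh_frame data decoded as an instruction.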
I'm using the configuration "x86-ubuntu-run.py" from gem5_library, with only the workload modified. I've been debugging the gem5 program for several days, but I haven't been able to locate the error. I'm wondering if there's anything I need to configure to make it run as expected. Thank you very much for your valuable suggestions.