I think the microcode that's executing is in src/arch/x86/isa/insts/romutil.py, and it looks like your stack is bad. That's where the vectoring microcode checks that it will be able to write out the interrupt stack frame, and apparently it can't. That triggers another page fault, which hits the same problem, and so on. You'll need to figure out why your stack ends up out of whack, or whether that code is handling the stack in a way that isn't exactly correct and faults when it shouldn't.
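To make the failing check concrete, here is a small Python sketch (not gem5 code) that redoes the arithmetic visible in the quoted trace: the microcode copies rsp into t6, masks the low byte with 0xf0, subtracts 0x30, and then probes that address with ldst. The helper names are made up for illustration, and reading 0x30 as six 8-byte slots (SS, RSP, RFLAGS, CS, RIP, error code) is my assumption. The error-code bit meanings are the architectural x86 page-fault ones.

```python
# Illustrative only: hypothetical helpers, not part of gem5.

# Architectural x86 page-fault error-code bits.
PF_BITS = {
    0: "page present (protection violation)",
    1: "write access",
    2: "user mode",
    3: "reserved bit set",
    4: "instruction fetch",
}


def frame_base(rsp, frame_bytes=0x30):
    """Mimic 'andi t6b, t6b, 0xf0' followed by 'subi t6, t6, 0x30'."""
    low_byte = rsp & 0xFF
    aligned = (rsp & ~0xFF) | (low_byte & 0xF0)  # only the low byte is masked
    return aligned - frame_bytes


def decode_pf_error(code):
    """Return the names of the error-code bits that are set."""
    return [name for bit, name in PF_BITS.items() if code & (1 << bit)]


if __name__ == "__main__":
    rsp = 0xFFFFFE0000002000  # D= value of 'mov t6, t6, rsp' in the trace
    print(hex(frame_base(rsp)))   # address the ldst probes
    print(decode_pf_error(0x3))   # from '#PF(0x3)'
```

If that reading is right, #PF(0x3) means the page holding the frame is mapped but the kernel-mode write check failed, which would point at the permissions of that mapping rather than at a missing page.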
Gabe

On Mon, Nov 12, 2018 at 8:25 AM Kleovoulos Kalaitzidis <[email protected]> wrote:

> Hello,
> just to give more detail, I have attached here a part of the simout file
> before the first appearance of the page fault that after keeps executing.
>
> --
> Kleovoulos Kalaitzidis
> Doctorant - Équipe PACAP
>
> Centre de recherche INRIA Rennes - Bretagne Atlantique
> Bâtiment 12E, Bureau E321, Campus de Beaulieu,
> 35042 Rennes Cedex, France
>
> ------------------------------
>
> *From: *"Kleovoulos Kalaitzidis" <[email protected]>
> *To: *"gem5 users mailing list" <[email protected]>
> *Sent: *Monday, November 12, 2018 4:09:56 PM
> *Subject: *[gem5-users] Microcode_ROM page fault not handled
>
> Hello everyone,
>
> I am currently using FS mode to simulate and execute SPEC benchmarks. The
> image I use is an Ubuntu-16.04 and the kernel I built for that is
> vmlinux-4-15.
> To settle up the FS simulation environment, create the image file and
> build the kernel I have followed Jason's instructions from here:
> http://www.lowepower.com/jason/setting-up-gem5-full-system.html
> I run my simulations with x86 and I have already taken some checkpoints
> for FS, so now I use them to restore and execute the benchmarks. However,
> after some testing I found out that most of them after some time following
> the restore they execute infinite loops of micro ops without proceeding in
> the total benchmark execution, because the number of executed instructions
> would not change (after some printing within execution)
>
> *The gem5 command to restore first checkpoint is here* :
>
> /build/X86/gem5.opt --redirect-stdout --redirect-stderr --outdir=/outdir
>   /configs/example/fs.py --cpu-type=DerivO3CPU -n 1 --caches --l2cache
>   --mem-type=DDR4_2400_16x4
>   --mem-size=8GB --sys-clock=4GHz --cpu-clock=4GHz
>   --kernel=/path_to_kernel/vmlinux-4-15
>   --disk-image=/path_to_image/ubuntu-min-16-04.img
>   --checkpoint-dir=/path_to_checkpoint_dir/ -r 1
>
> To tackle the problem I found the aforementioned recurring loop of micro
> ops and I saw that it keeps executing micro ops related with instruction
> *Microcode_ROM*
> After some search I found this older thread where someone else had a quite
> similar problem :
> https://www.mail-archive.com/[email protected]/msg13058.html
>
> So I followed same pattern, I used the --debug-flags=Exec,LocalApic,Faults
> and I get this output :
>
> 32985546164250: system.switch_cpus T0 : @__do_page_fault+716.32930 : Microcode_ROM : ldst t0, HS:[t6] : MemRead : A=0xfffffe0000001fd0
> 32985546172500: system.switch_cpus T0 : @__do_page_fault+716.32890 : Microcode_ROM : slli t4, t1, 0x4 : IntAlu : D=0x00000000000000e0
> 32985546172750: system.switch_cpus T0 : @__do_page_fault+716.32891 : Microcode_ROM : ld t2, IDTR:[t4 + 0x8] : MemRead : D=0x00000000ffffffff A=0xfffffe00000000e8
> 32985546173000: system.switch_cpus T0 : @__do_page_fault+716.32892 : Microcode_ROM : ld t4, IDTR:[t4] : MemRead : D=0x81a08e00001015d0 A=0xfffffe00000000e0
> 32985546173250: system.switch_cpus T0 : @__do_page_fault+716.32893 : Microcode_ROM : chks , t4b, 0x3 : IntAlu :
> 32985546173500: system.switch_cpus T0 : @__do_page_fault+716.32894 : Microcode_ROM : srli t10, t4, 0x10 : IntAlu : D=0x000081a08e000010
> 32985546173750: system.switch_cpus T0 : @__do_page_fault+716.32895 : Microcode_ROM : andi t5, t10, 0xf8 : IntAlu : D=0x0000000000000010
> 32985546174000: system.switch_cpus T0 : @__do_page_fault+716.32896 : Microcode_ROM : andi t0w, t10w, 0x4 : IntAlu : D=0x0000000000000020
> 32985546174250: system.switch_cpus T0 : @__do_page_fault+716.32897 : Microcode_ROM : br 0x8084 : No_OpClass :
> 32985546176500: system.switch_cpus T0 : @__do_page_fault+716.32900 : Microcode_ROM : ld t3, TSG:[t5] : MemRead : D=0x00af9b000000ffff A=0xfffffe0000001010
> 32985546176750: system.switch_cpus T0 : @__do_page_fault+716.32901 : Microcode_ROM : chks , t3, 0x7 : IntAlu :
> 32985546177000: system.switch_cpus T0 : @__do_page_fault+716.32902 : Microcode_ROM : wrdl %ctrl145, t3, t10 : IntAlu : D=0x000000000000abd0
> 32985546177250: system.switch_cpus T0 : @__do_page_fault+716.32903 : Microcode_ROM : wrdh t9, t4, t2 : IntAlu : D=0xffffffff81a015d0
> 32985546177500: system.switch_cpus T0 : @__do_page_fault+716.32904 : Microcode_ROM : rdsel t11b, t11b, %ctrl128 : IntAlu : D=0x0000000000000000
> 32985546177750: system.switch_cpus T0 : @__do_page_fault+716.32905 : Microcode_ROM : rdattr t10, %ctrl184, : IntAlu : D=0x000000000000abd0
> 32985546178000: system.switch_cpus T0 : @__do_page_fault+716.32906 : Microcode_ROM : andi t10, t10, 0x3 : IntAlu : D=0x0000000000000000
> 32985546178250: system.switch_cpus T0 : @__do_page_fault+716.32907 : Microcode_ROM : rdattr t5, %ctrl179, : IntAlu : D=0x000000000000abd0
> 32985546178500: system.switch_cpus T0 : @__do_page_fault+716.32908 : Microcode_ROM : andi t5, t5, 0x3 : IntAlu : D=0x0000000000000000
> 32985546178750: system.switch_cpus T0 : @__do_page_fault+716.32909 : Microcode_ROM : sub t0, t5, t10 : IntAlu : D=0x0000000000000020
> 32985546179000: system.switch_cpus T0 : @__do_page_fault+716.32910 : Microcode_ROM : mov t11b, t0b, t0b : IntAlu : D=0x0000000000000000
> 32985546179250: system.switch_cpus T0 : @__do_page_fault+716.32911 : Microcode_ROM : srli t12, t4, 0x20 : IntAlu : D=0x0000000081a08e00
> 32985546179500: system.switch_cpus T0 : @__do_page_fault+716.32912 : Microcode_ROM : andi t12, t12, 0x7 : IntAlu : D=0x0000000000000000
> 32985546179750: system.switch_cpus T0 : @__do_page_fault+716.32913 : Microcode_ROM : subi t0, t12, 0x1 : IntAlu : D=0x0000000000000008
> 32985546180000: system.switch_cpus T0 : @__do_page_fault+716.32914 : Microcode_ROM : br 0x8096 : No_OpClass :
> 32985546215500: system.switch_cpus T0 : @__do_page_fault+716.32915 : Microcode_ROM : br 0x8098 : No_OpClass :
> 32985546217500: system.switch_cpus T0 : @__do_page_fault+716.32916 : Microcode_ROM : mov t6, t6, rsp : IntAlu : D=0xfffffe0000002000
> 32985546217750: system.switch_cpus T0 : @__do_page_fault+716.32917 : Microcode_ROM : br 0x8099 : No_OpClass :
> 32985546219750: system.switch_cpus T0 : @__do_page_fault+716.32921 : Microcode_ROM : andi t6b, t6b, 0xf0 : IntAlu : D=0xfffffe0000002000
> 32985546220000: system.switch_cpus T0 : @__do_page_fault+716.32922 : Microcode_ROM : subi t6, t6, 0x30 : IntAlu : D=0xfffffe0000001fd0
> 32985546220250: system.switch_cpus T0 : @__do_page_fault+716.32923 : Microcode_ROM : wrip , t0, t9 : IntAlu :
> 32985546222250: system.switch_cpus T0 : @__do_page_fault+716.32924 : Microcode_ROM : srli t5, t4, 0x10 : IntAlu : D=0x000081a08e000010
> 32985546222500: system.switch_cpus T0 : @__do_page_fault+716.32925 : Microcode_ROM : andi t5, t5, 0xff : IntAlu : D=0x0000000000000010
> 32985546222750: system.switch_cpus T0 : @__do_page_fault+716.32926 : Microcode_ROM : wrdl %ctrl140, t3, t5 : IntAlu : D=0x000000000000abd0
> 32985546226500: system.switch_cpus T0 : @__do_page_fault+716.32927 : Microcode_ROM : limm t10, 0 : IntAlu : D=0x0000000000000000
> 32985546226750: system.switch_cpus T0 : @__do_page_fault+716.32928 : Microcode_ROM : rdsel t10w, t10w, %ctrl127 : IntAlu : D=0x0000000000000010
> 32985546227000: system.switch_cpus T0 : @__do_page_fault+716.32929 : Microcode_ROM : wrsel %ctrl127, t5w, : IntAlu : D=0x0000000000000010
> 32985546231500: Page-Fault: RIP 0xffffffff81057b6c: vector 14: #PF(0x3) at 0xfffffe0000001fd0
>
> This page fault keeps happening all over again and the execution never
> continues. For some benchmarks it happens not far after restoring the
> checkpoint, for others it happens later and for some others it may even
> never appear.
> I have to also mention that the checkpoint which I restore is taken in a
> reasonable time after the benchmark execution start (around 2% of committed
> instructions) using AtomicSimpleCPU. Then I restore with DerivO3CPU or
> another cpu type of mine, always derived from DerivO3CPU.
>
> I am sorry for the long email, I tried to be as descriptive and
> comprehensive as possible. I would really appreciate your help because my
> knowledge over gem5 can not really help me solve this. I am looking forward
> to hearing from anyone having any idea...
> Thank you a lot in advance.
>
> --
> Kleovoulos Kalaitzidis
> Doctorant - Équipe PACAP
>
> Centre de recherche INRIA Rennes - Bretagne Atlantique
> Bâtiment 12E, Bureau E321, Campus de Beaulieu,
> 35042 Rennes Cedex, France
>
> _______________________________________________
> gem5-users mailing list
> [email protected]
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
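
As a cross-check on the quoted trace, the two 8-byte loads from IDTR at offset 0xe0 (vector 14 shifted left by 4) can be decoded by hand. The sketch below is plain Python with hypothetical names, following the 64-bit interrupt gate layout from the x86 manuals; it is not how gem5 does it internally.

```python
# Illustrative only: decode a 64-bit IDT gate from its two quadwords.
from typing import NamedTuple


class Gate(NamedTuple):
    handler: int    # 64-bit handler address
    selector: int   # code segment selector
    ist: int        # Interrupt Stack Table index (0 = none)
    gate_type: int  # 0xE = interrupt gate, 0xF = trap gate
    dpl: int        # descriptor privilege level
    present: bool


def decode_gate(low, high):
    """low = bytes 0..7 of the descriptor, high = bytes 8..15."""
    offset_lo = low & 0xFFFF
    offset_mid = (low >> 48) & 0xFFFF
    offset_hi = high & 0xFFFFFFFF
    attr = (low >> 40) & 0xFF
    return Gate(
        handler=(offset_hi << 32) | (offset_mid << 16) | offset_lo,
        selector=(low >> 16) & 0xFFFF,
        ist=(low >> 32) & 0x7,
        gate_type=attr & 0xF,
        dpl=(attr >> 5) & 0x3,
        present=bool(attr & 0x80),
    )


if __name__ == "__main__":
    # D= values of the 'ld t4, IDTR:[t4]' and 'ld t2, IDTR:[t4 + 0x8]' lines.
    gate = decode_gate(0x81A08E00001015D0, 0x00000000FFFFFFFF)
    print(hex(gate.handler), hex(gate.selector), gate.ist, hex(gate.gate_type))
```

The decoded handler address matches the value wrdh writes to t9 in the trace, and an IST index of 0 would mean no switch to an IST stack, which is consistent with the microcode taking the frame base from the current rsp.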
