Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU
On 1/29/20 8:30 AM, Ciro Santilli wrote:
> I would also recommend opening a bug report for this at:
> https://gem5.atlassian.net/projects/GEM5/issues with the arch-arm
> component to make it easier to keep track of.

Sure, I can do that.

Nils
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU
On 1/29/20 3:26 AM, Gabe Black wrote:
> It looks to me like this is because the MicroUopSetPCCPSR microop
> (uopSet_uop, the one actually setting the CPSR) is not marked as
> IsSerializeAfter. The macroop it's a part of is, but the flags on macroops,
> other than the one that says it's a macroop, don't matter since they are
> never executed. Their job is just to spit out microops which are executed.
>
> The offending microop is not set as IsSerializeAfter, so the instructions
> behind it start getting processed before it's completed and updated the
> CPSR and exception level. The stack pointer index is resolved to a
> particular stack pointer at that point and reflects the old CPSR/exception
> level and not the new one.
>
> A full fix from ARM would probably involve taking away the unused and
> slightly confusing flags from the macroop that don't do anything which I
> don't want to dig into myself. To get things working for you, you can
> *probably* just add IsSerializeAfter to MicroUopSetPCCPSR in
> arch/arm/isa/insts/macromem.isa on about line 690, right after IsMicroop.
>
> So ['IsMicroop'] would become ['IsMicroop', 'IsSerializeAfter'].
>
> That instruction/microop should unconditionally be IsSerializeAfter since
> it modifies state which is used to interpret register indices in later
> instructions, and if it isn't those instructions will be set up incorrectly
> like you're seeing here.

That fixed it indeed. Thank you very much! :)

Best regards,
Nils
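For readers who want to see the ordering problem in isolation, here is a minimal toy model in Python. It is not gem5 code: `ToyCore`, `BANKED_SP`, `rename_sp` and `run` are hypothetical names made up for this sketch, and the two stack pointer values are only chosen to resemble the ones in the trace. It assumes, as described in the quoted explanation, that `sp` is resolved to a mode-specific register at the moment a younger instruction is set up.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from CPU mode to the banked stack pointer it selects.
BANKED_SP = {"svc": "sp_svc", "usr": "sp_usr"}


@dataclass
class ToyCore:
    mode: str = "svc"
    # Illustrative values only, picked to resemble the trace in this thread.
    regs: dict = field(
        default_factory=lambda: {"sp_svc": 0x1191AC, "sp_usr": 0x211E50}
    )

    def rename_sp(self) -> str:
        """Resolve architectural 'sp' using whatever mode is visible right now."""
        return BANKED_SP[self.mode]


def run(serialize_after: bool) -> int:
    """Return the SP value seen by the first instruction after the mode change."""
    core = ToyCore()
    if serialize_after:
        # The mode-changing micro-op completes before anything younger is set up.
        core.mode = "usr"
        resolved = core.rename_sp()
    else:
        # The younger instruction is set up first and keeps the stale mapping.
        resolved = core.rename_sp()
        core.mode = "usr"
    return core.regs[resolved]


if __name__ == "__main__":
    print(hex(run(serialize_after=False)))  # stale supervisor SP
    print(hex(run(serialize_after=True)))   # user SP
```

The only difference between the two branches is the order of the mode update and the register lookup, which is the effect the IsSerializeAfter flag is meant to enforce on the real microop.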
Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU
I would also recommend opening a bug report for this at:
https://gem5.atlassian.net/projects/GEM5/issues with the arch-arm
component to make it easier to keep track of.

On Tue, Jan 28, 2020 at 4:24 PM Nils Asmussen wrote:
>
> Hi all,
>
> I've stumbled upon an issue with ARM's return from exception (rfe)
> instruction in combination with the O3 CPU.
>
> With the TimingSimpleCPU everything works fine. But with the DerivO3CPU it
> seems that the restoration of the userspace SP register does not happen
> immediately. For example, look at the following instruction trace:
>
>       204598: ldmstm
>       204598: addi_uop r35, sp, #0 : IntAlu : D=0x00119160
> 1 --> 204598: ldr2_uop r701,r702, [r35, #0] : MemRead : D=0x006000211e50 A=0x119160
>       204598: add sp, sp, #12 : IntAlu : D=0x0011916c
>       204598: ldmstm
>       204598: ldr2_uop r0,r1, [sp, #0] : MemRead : D=0x A=0x11916c
>       204598: ldr2_uop r2,r3, [sp, #8] : MemRead : D=0x0001 A=0x119174
>       204598: ldr2_uop r4,r5, [sp, #16] : MemRead : D=0xf020002f2020 A=0x11917c
>       204598: ldr2_uop r6,r7, [sp, #24] : MemRead : D=0x0006 A=0x119184
>       2045981000: ldr2_uop r8,r9, [sp, #32] : MemRead : D=0x002f228000211f40 A=0x11918c
>       2045981000: ldr2_uop r10,fp, [sp, #40] : MemRead : D=0x00211e6c00211f50 A=0x119194
>       2045981000: ldr2_uop r12,lr, [sp, #48] : MemRead : D=0x002d4056 A=0x11919c
>       2045981000: addi_uop sp, sp, #56 : IntAlu : D=0x001191a4
> 2 --> 2045987000: rfeia sp!
>       2045987000: rfeia sp! : MemRead : D=0x2010 A=0x1191a4
>       2045987000: addi_uop sp, sp, #8 : IntAlu : D=0x001191ac
>       2045987000: uopSet_uop [PC,CPSR] : IntAlu : D=0x
>       2045993000: ldr r2, [r8, #4] : MemRead : D=0x0003 A=0x211f44
>       2045993000: cmps r2, #0 : IntAlu : D=0x0001
>       2045993000: addne r10, r8, #4 : IntAlu : D=0x00211f44
>       2045993000: movne r4, #0 : IntAlu : D=0x
>       2045993000: b <_ZN6kernel8CapTable6obtainEjPNS_10CapabilityE+92> : IntAlu : Predicated False
>       2045993000: ldr r0, [r10, #4]!
>       2045993000: ldr r0, [r10, #4]! : MemRead : D=0x00506780 A=0x211f48
>       2045993000: addi_uop r10, r10, #4 : IntAlu : D=0x00211f48
>       2045993000: add r4, r4, #1 : IntAlu : D=0x0001
>       2045994000: ldr r2, [r0, #0] : MemRead : D=0x002ee14c A=0x506780
>       2045994000: ldr r2, [r2, #8] : MemRead : D=0x002d95bc A=0x2ee154
>       2045994000: blx r2 : IntAlu : D=0x002d4078
>       204600: ldmstm
> 3 --> 204600: str_uop r4, [sp, #24] : MemWrite : D=0x0001 A=0x119194
>       204600: str_uop r5, [sp, #20] : MemWrite : D=0xf020 A=0x119198
>       204600: str_uop r6, [sp, #16] : MemWrite : D=0x0006 A=0x11919c
>       204600: str_uop r7, [sp, #12] : MemWrite : D=0x A=0x1191a0
>       204600: str_uop fp, [sp, #8] : MemWrite : D=0x00211e6c A=0x1191a4
> 4 --> 204600: str_uop lr, [sp, #4] : MemWrite : D=0x0060 A=0x211e4c
>       204600: subi_uop sp, sp, #24 : IntAlu : D=0x00211e38
>       2046006000: add fp, sp, #20 : IntAlu : D=0x00211e4c
>       2046006000: sub sp, sp, #24 : IntAlu : D=0x00211e20
>
> I've marked the most important lines. 1 is the place where the user space
> SP/LR are written. 2 is the place where rfe is used to return from
> supervisor mode to user mode. 3 uses the SP for the first time after
> returning to user mode. But note that the value is still 119XXX, i.e. the
> SP that was used in supervisor mode. At 4 the value of SP suddenly changes
> to 211XXX, as should have happened much earlier.
>
> In case it matters, I'm using a single-core system with the classical memory
> model.
>
> Am I missing something or is there really something wrong?
>
> Best regards,
> Nils
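The switch between markers 3 and 4 in the trace quoted above can also be checked mechanically. The following Python sketch is only an illustration: `implied_sp_values` and `SP_ACCESS` are names invented here, and it assumes that the offset printed for these `str_uop` entries is applied downward from SP (a push). That assumption is inferred from the trace itself, where addresses rise as the printed offsets fall and the following `subi_uop sp, sp, #24` yields 0x211e38; it is not something the thread states.

```python
import re

# Three post-rfe stores copied from the trace in this thread.
TRACE = """\
204600: str_uop r4, [sp, #24] : MemWrite : D=0x0001 A=0x119194
204600: str_uop fp, [sp, #8] : MemWrite : D=0x00211e6c A=0x1191a4
204600: str_uop lr, [sp, #4] : MemWrite : D=0x0060 A=0x211e4c
"""

# Captures the printed offset and the effective address of an sp-relative access.
SP_ACCESS = re.compile(r"\[sp, #(\d+)\].*A=0x([0-9a-fA-F]+)")


def implied_sp_values(trace: str) -> list:
    """Return address + offset per store, i.e. the SP each store implies.

    Assumption: the printed offset is subtracted from SP for these stores.
    """
    values = []
    for line in trace.splitlines():
        match = SP_ACCESS.search(line)
        if match:
            offset = int(match.group(1))
            address = int(match.group(2), 16)
            values.append(address + offset)
    return values


if __name__ == "__main__":
    for value in implied_sp_values(TRACE):
        print(hex(value))
```

Under that assumption the first two stores imply SP = 0x1191ac, which is the value the supervisor-mode `addi_uop sp, sp, #8` produced, while the third implies 0x211e50, so the stack pointer changes in the middle of a single store-multiple sequence.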
Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU
It looks to me like this is because the MicroUopSetPCCPSR microop
(uopSet_uop, the one actually setting the CPSR) is not marked as
IsSerializeAfter. The macroop it's a part of is, but the flags on macroops,
other than the one that says it's a macroop, don't matter since they are
never executed. Their job is just to spit out microops which are executed.

The offending microop is not set as IsSerializeAfter, so the instructions
behind it start getting processed before it's completed and updated the
CPSR and exception level. The stack pointer index is resolved to a
particular stack pointer at that point and reflects the old CPSR/exception
level and not the new one.

A full fix from ARM would probably involve taking away the unused and
slightly confusing flags from the macroop that don't do anything which I
don't want to dig into myself. To get things working for you, you can
*probably* just add IsSerializeAfter to MicroUopSetPCCPSR in
arch/arm/isa/insts/macromem.isa on about line 690, right after IsMicroop.

So ['IsMicroop'] would become ['IsMicroop', 'IsSerializeAfter'].

That instruction/microop should unconditionally be IsSerializeAfter since
it modifies state which is used to interpret register indices in later
instructions, and if it isn't those instructions will be set up incorrectly
like you're seeing here.

Gabe

On Tue, Jan 28, 2020 at 8:24 AM Nils Asmussen wrote:
> Hi all,
>
> I've stumbled upon an issue with ARM's return from exception (rfe)
> instruction in combination with the O3 CPU.
>
> With the TimingSimpleCPU everything works fine. But with the DerivO3CPU it
> seems that the restoration of the userspace
> SP register does not happen immediately.
> For example, look at the following instruction trace:
>
>       204598: ldmstm
>       204598: addi_uop r35, sp, #0 : IntAlu : D=0x00119160
> 1 --> 204598: ldr2_uop r701,r702, [r35, #0] : MemRead : D=0x006000211e50 A=0x119160
>       204598: add sp, sp, #12 : IntAlu : D=0x0011916c
>       204598: ldmstm
>       204598: ldr2_uop r0,r1, [sp, #0] : MemRead : D=0x A=0x11916c
>       204598: ldr2_uop r2,r3, [sp, #8] : MemRead : D=0x0001 A=0x119174
>       204598: ldr2_uop r4,r5, [sp, #16] : MemRead : D=0xf020002f2020 A=0x11917c
>       204598: ldr2_uop r6,r7, [sp, #24] : MemRead : D=0x0006 A=0x119184
>       2045981000: ldr2_uop r8,r9, [sp, #32] : MemRead : D=0x002f228000211f40 A=0x11918c
>       2045981000: ldr2_uop r10,fp, [sp, #40] : MemRead : D=0x00211e6c00211f50 A=0x119194
>       2045981000: ldr2_uop r12,lr, [sp, #48] : MemRead : D=0x002d4056 A=0x11919c
>       2045981000: addi_uop sp, sp, #56 : IntAlu : D=0x001191a4
> 2 --> 2045987000: rfeia sp!
>       2045987000: rfeia sp! : MemRead : D=0x2010 A=0x1191a4
>       2045987000: addi_uop sp, sp, #8 : IntAlu : D=0x001191ac
>       2045987000: uopSet_uop [PC,CPSR] : IntAlu : D=0x
>       2045993000: ldr r2, [r8, #4] : MemRead : D=0x0003 A=0x211f44
>       2045993000: cmps r2, #0 : IntAlu : D=0x0001
>       2045993000: addne r10, r8, #4 : IntAlu : D=0x00211f44
>       2045993000: movne r4, #0 : IntAlu : D=0x
>       2045993000: b <_ZN6kernel8CapTable6obtainEjPNS_10CapabilityE+92> : IntAlu : Predicated False
>       2045993000: ldr r0, [r10, #4]!
>       2045993000: ldr r0, [r10, #4]! : MemRead : D=0x00506780 A=0x211f48
>       2045993000: addi_uop r10, r10, #4 : IntAlu : D=0x00211f48
>       2045993000: add r4, r4, #1 : IntAlu : D=0x0001
>       2045994000: ldr r2, [r0, #0] : MemRead : D=0x002ee14c A=0x506780
>       2045994000: ldr r2, [r2, #8] : MemRead : D=0x002d95bc A=0x2ee154
>       2045994000: blx r2 : IntAlu : D=0x002d4078
>       204600: ldmstm
> 3 --> 204600: str_uop r4, [sp, #24] : MemWrite : D=0x0001 A=0x119194
>       204600: str_uop r5, [sp, #20] : MemWrite : D=0xf020 A=0x119198
>       204600: str_uop r6, [sp, #16] : MemWrite : D=0x0006 A=0x11919c
>       204600: str_uop r7, [sp, #12] : MemWrite : D=0x A=0x1191a0
>       204600: