Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU

2020-01-29 Thread Nils Asmussen
On 1/29/20 8:30 AM, Ciro Santilli wrote:
> I would also recommend opening a bug report for this at:
> https://gem5.atlassian.net/projects/GEM5/issues with the arch-arm
> component to make it easier to keep track of.

Sure, I can do that.

Nils
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU

2020-01-29 Thread Nils Asmussen
On 1/29/20 3:26 AM, Gabe Black wrote:
> It looks to me like this is because the MicroUopSetPCCPSR microop
> (uopSet_uop, the one actually setting the CPSR) is not marked as
> IsSerializeAfter. The macroop it's a part of is, but the flags on macroops,
> other than the one that says it's a macroop, don't matter since they are
> never executed. Their job is just to spit out microops which are executed.
> 
> The offending microop is not set as IsSerializeAfter, so the instructions
> behind it start getting processed before it's completed and updated the
> CPSR and exception level. The stack pointer index is resolved to a
> particular stack pointer at that point and reflects the old CPSR/exception
> level and not the new one.
> 
> A full fix from ARM would probably involve taking away the unused and
> slightly confusing flags from the macroop that don't do anything, which I
> don't want to dig into myself. To get things working for you, you can
> *probably* just add IsSerializeAfter to MicroUopSetPCCPSR in
> arch/arm/isa/insts/macromem.isa on about line 690, right after IsMicroop.
> 
> So ['IsMicroop'] would become ['IsMicroop', 'IsSerializeAfter'].
> 
> That instruction/microop should unconditionally be IsSerializeAfter since
> it modifies state which is used to interpret register indices in later
> instructions, and if it isn't those instructions will be set up incorrectly
> like you're seeing here.

That fixed it indeed. Thank you very much! :)

Best regards,
Nils

Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU

2020-01-28 Thread Ciro Santilli
I would also recommend opening a bug report for this at:
https://gem5.atlassian.net/projects/GEM5/issues with the arch-arm
component to make it easier to keep track of.

On Tue, Jan 28, 2020 at 4:24 PM Nils Asmussen  wrote:
>
> Hi all,
>
> I've stumbled upon an issue with ARM's return from exception (rfe)
> instruction in combination with the O3 CPU.
>
> With the TimingSimpleCPU everything works fine. But with the DerivO3CPU it
> seems that the restoration of the userspace SP register does not happen
> immediately. For example, look at the following instruction trace:
>
> 204598: ldmstm
> 204598:   addi_uop   r35, sp, #0   : IntAlu :  D=0x00119160
> 1 --> 204598:   ldr2_uop   r701,r702, [r35, #0] : MemRead :  
> D=0x006000211e50 A=0x119160
> 204598:   add   sp, sp, #12: IntAlu :  D=0x0011916c
> 204598: ldmstm
> 204598:   ldr2_uop   r0,r1, [sp, #0] : MemRead :  D=0x 
> A=0x11916c
> 204598:   ldr2_uop   r2,r3, [sp, #8] : MemRead :  D=0x0001 
> A=0x119174
> 204598:   ldr2_uop   r4,r5, [sp, #16] : MemRead :  D=0xf020002f2020 
> A=0x11917c
> 204598:   ldr2_uop   r6,r7, [sp, #24] : MemRead :  D=0x0006 
> A=0x119184
> 2045981000:   ldr2_uop   r8,r9, [sp, #32] : MemRead :  D=0x002f228000211f40 
> A=0x11918c
> 2045981000:   ldr2_uop   r10,fp, [sp, #40] : MemRead :  D=0x00211e6c00211f50 
> A=0x119194
> 2045981000:   ldr2_uop   r12,lr, [sp, #48] : MemRead :  D=0x002d4056 
> A=0x11919c
> 2045981000:   addi_uop   sp, sp, #56   : IntAlu :  D=0x001191a4
> 2 --> 2045987000:   rfeia   sp!
> 2045987000:   rfeia   sp!  : MemRead :  D=0x2010 
> A=0x1191a4
> 2045987000:   addi_uop   sp, sp, #8: IntAlu :  D=0x001191ac
> 2045987000:   uopSet_uop   [PC,CPSR]   : IntAlu :  D=0x
> 2045993000: ldr   r2, [r8, #4]   : MemRead :  D=0x0003 
> A=0x211f44
> 2045993000: cmps   r2, #0: IntAlu :  D=0x0001
> 2045993000: addne   r10, r8, #4  : IntAlu :  D=0x00211f44
> 2045993000: movne   r4, #0   : IntAlu :  D=0x
> 2045993000: b   <_ZN6kernel8CapTable6obtainEjPNS_10CapabilityE+92> : IntAlu : 
> Predicated False
> 2045993000: ldr   r0, [r10, #4]!
> 2045993000:   ldr   r0, [r10, #4]! : MemRead :  D=0x00506780 
> A=0x211f48
> 2045993000:   addi_uop   r10, r10, #4  : IntAlu :  D=0x00211f48
> 2045993000: add   r4, r4, #1 : IntAlu :  D=0x0001
> 2045994000: ldr   r2, [r0, #0]   : MemRead :  D=0x002ee14c 
> A=0x506780
> 2045994000: ldr   r2, [r2, #8]   : MemRead :  D=0x002d95bc 
> A=0x2ee154
> 2045994000: blx   r2 : IntAlu :  D=0x002d4078
> 204600: ldmstm
> 3 --> 204600:   str_uop   r4, [sp, #24]  : MemWrite :  
> D=0x0001 A=0x119194
> 204600:   str_uop   r5, [sp, #20]  : MemWrite :  D=0xf020 
> A=0x119198
> 204600:   str_uop   r6, [sp, #16]  : MemWrite :  D=0x0006 
> A=0x11919c
> 204600:   str_uop   r7, [sp, #12]  : MemWrite :  D=0x 
> A=0x1191a0
> 204600:   str_uop   fp, [sp, #8]   : MemWrite :  D=0x00211e6c 
> A=0x1191a4
> 4 --> 204600:   str_uop   lr, [sp, #4]   : MemWrite :  
> D=0x0060 A=0x211e4c
> 204600:   subi_uop   sp, sp, #24   : IntAlu :  D=0x00211e38
> 2046006000: add   fp, sp, #20: IntAlu :  D=0x00211e4c
> 2046006000: sub   sp, sp, #24: IntAlu :  D=0x00211e20
>
> I've marked the most important lines. 1 is the place where the user space
> SP/LR are written. 2 is the place where rfe is used to return from
> supervisor mode to user mode. 3 uses the SP for the first time after
> returning to user mode. But note that the value is still 119XXX, i.e., the
> SP that was used in supervisor mode. At 4 the value of SP suddenly changes
> to 211XXX, as should have happened much earlier.
>
> In case it matters, I'm using a single-core system with the classical memory 
> model.
>
> Am I missing something or is there really something wrong?
>
> Best regards,
> Nils
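
For anyone wanting to check their own trace for the symptom described in the quoted report (sp-relative accesses that jump from the 119XXX range to the 211XXX range partway through a sequence), here is a minimal Python sketch. It is not part of gem5. The function name region_switches and the 16-bit region granularity are assumptions made for illustration, and it expects each trace record on a single line (unwrapped), with the A=0x... field present.

```python
import re

# Matches an sp-relative access and captures the effective address (A=0x...).
ACCESS_RE = re.compile(r"\[sp, #\d+\].*A=0x([0-9a-fA-F]+)")


def region_switches(lines, shift=16):
    """Return (previous_addr, addr) pairs where two consecutive
    sp-relative accesses fall into different address regions.

    A "region" here is simply addr >> shift; this granularity is an
    assumption chosen to separate 0x11xxxx from 0x21xxxx.
    """
    switches = []
    prev = None
    for line in lines:
        match = ACCESS_RE.search(line)
        if not match:
            continue  # not an sp-relative memory access
        addr = int(match.group(1), 16)
        if prev is not None and (prev >> shift) != (addr >> shift):
            switches.append((prev, addr))
        prev = addr
    return switches


if __name__ == "__main__":
    # Two records taken from the trace above, joined onto one line each.
    sample = [
        "204600:   str_uop   fp, [sp, #8]   : MemWrite :  D=0x00211e6c A=0x1191a4",
        "204600:   str_uop   lr, [sp, #4]   : MemWrite :  D=0x0060 A=0x211e4c",
    ]
    for before, after in region_switches(sample):
        print(f"SP region changed: {before:#x} -> {after:#x}")
```

A switch inside a single store-multiple sequence, as between the two sample records, is the kind of jump marked as 3 and 4 in the report.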

Re: [gem5-dev] ARM: rfe instruction broken on O3 CPU

2020-01-28 Thread Gabe Black
It looks to me like this is because the MicroUopSetPCCPSR microop
(uopSet_uop, the one actually setting the CPSR) is not marked as
IsSerializeAfter. The macroop it's a part of is, but the flags on macroops,
other than the one that says it's a macroop, don't matter since they are
never executed. Their job is just to spit out microops which are executed.

The offending microop is not set as IsSerializeAfter, so the instructions
behind it start getting processed before it's completed and updated the
CPSR and exception level. The stack pointer index is resolved to a
particular stack pointer at that point and reflects the old CPSR/exception
level and not the new one.

A full fix from ARM would probably involve taking away the unused and
slightly confusing flags from the macroop that don't do anything, which I
don't want to dig into myself. To get things working for you, you can
*probably* just add IsSerializeAfter to MicroUopSetPCCPSR in
arch/arm/isa/insts/macromem.isa on about line 690, right after IsMicroop.

So ['IsMicroop'] would become ['IsMicroop', 'IsSerializeAfter'].

That instruction/microop should unconditionally be IsSerializeAfter since
it modifies state which is used to interpret register indices in later
instructions, and if it isn't those instructions will be set up incorrectly
like you're seeing here.
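
To make the ordering problem concrete, here is a toy Python model. It is not gem5 code. The names BANKED_SP and first_sp_after_return are made up for illustration, and the two SP values are only borrowed from the address ranges in the trace. It shows that the banked stack pointer a younger instruction ends up with depends on whether its register index is resolved before or after the mode-changing microop completes.

```python
# Illustrative banked stack pointers, keyed by mode (values are assumptions
# loosely based on the 119XXX / 211XXX ranges seen in the trace).
BANKED_SP = {"svc": 0x1191AC, "usr": 0x211E50}


def first_sp_after_return(serialize_after: bool) -> int:
    """Return the SP value seen by the first instruction after the
    mode-changing microop in this toy model."""
    mode = "svc"  # we start in supervisor mode
    if serialize_after:
        # The mode-changing microop completes first ...
        mode = "usr"
        # ... and only then is the younger instruction's 'sp' resolved.
        resolved = mode
    else:
        # The younger instruction is set up early, against the stale mode ...
        resolved = mode
        # ... and the mode change lands too late to affect it.
        mode = "usr"
    return BANKED_SP[resolved]


if __name__ == "__main__":
    print(hex(first_sp_after_return(serialize_after=False)))
    print(hex(first_sp_after_return(serialize_after=True)))
```

Without serialization the model hands back the supervisor SP, which corresponds to the 119XXX addresses at marker 3 in the trace. With it, the user SP is used.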

Gabe

On Tue, Jan 28, 2020 at 8:24 AM Nils Asmussen 
wrote:

> Hi all,
>
> I've stumbled upon an issue with ARM's return from exception (rfe)
> instruction in combination with the O3 CPU.
>
> With the TimingSimpleCPU everything works fine. But with the DerivO3CPU it
> seems that the restoration of the userspace
> SP register does not happen immediately. For example, look at the
> following instruction trace:
>
> 204598: ldmstm
> 204598:   addi_uop   r35, sp, #0   : IntAlu :  D=0x00119160
> 1 --> 204598:   ldr2_uop   r701,r702, [r35, #0] : MemRead :  D=0x006000211e50 A=0x119160
> 204598:   add   sp, sp, #12: IntAlu :  D=0x0011916c
> 204598: ldmstm
> 204598:   ldr2_uop   r0,r1, [sp, #0] : MemRead :  D=0x A=0x11916c
> 204598:   ldr2_uop   r2,r3, [sp, #8] : MemRead :  D=0x0001 A=0x119174
> 204598:   ldr2_uop   r4,r5, [sp, #16] : MemRead :  D=0xf020002f2020 A=0x11917c
> 204598:   ldr2_uop   r6,r7, [sp, #24] : MemRead :  D=0x0006 A=0x119184
> 2045981000:   ldr2_uop   r8,r9, [sp, #32] : MemRead :  D=0x002f228000211f40 A=0x11918c
> 2045981000:   ldr2_uop   r10,fp, [sp, #40] : MemRead :  D=0x00211e6c00211f50 A=0x119194
> 2045981000:   ldr2_uop   r12,lr, [sp, #48] : MemRead :  D=0x002d4056 A=0x11919c
> 2045981000:   addi_uop   sp, sp, #56   : IntAlu :  D=0x001191a4
> 2 --> 2045987000:   rfeia   sp!
> 2045987000:   rfeia   sp!  : MemRead :  D=0x2010 A=0x1191a4
> 2045987000:   addi_uop   sp, sp, #8: IntAlu :  D=0x001191ac
> 2045987000:   uopSet_uop   [PC,CPSR]   : IntAlu :  D=0x
> 2045993000: ldr   r2, [r8, #4]   : MemRead :  D=0x0003 A=0x211f44
> 2045993000: cmps   r2, #0: IntAlu :  D=0x0001
> 2045993000: addne   r10, r8, #4  : IntAlu :  D=0x00211f44
> 2045993000: movne   r4, #0   : IntAlu :  D=0x
> 2045993000: b   <_ZN6kernel8CapTable6obtainEjPNS_10CapabilityE+92> : IntAlu : Predicated False
> 2045993000: ldr   r0, [r10, #4]!
> 2045993000:   ldr   r0, [r10, #4]! : MemRead :  D=0x00506780 A=0x211f48
> 2045993000:   addi_uop   r10, r10, #4  : IntAlu :  D=0x00211f48
> 2045993000: add   r4, r4, #1 : IntAlu :  D=0x0001
> 2045994000: ldr   r2, [r0, #0]   : MemRead :  D=0x002ee14c A=0x506780
> 2045994000: ldr   r2, [r2, #8]   : MemRead :  D=0x002d95bc A=0x2ee154
> 2045994000: blx   r2 : IntAlu :  D=0x002d4078
> 204600: ldmstm
> 3 --> 204600:   str_uop   r4, [sp, #24]  : MemWrite :  D=0x0001 A=0x119194
> 204600:   str_uop   r5, [sp, #20]  : MemWrite :  D=0xf020 A=0x119198
> 204600:   str_uop   r6, [sp, #16]  : MemWrite :  D=0x0006 A=0x11919c
> 204600:   str_uop   r7, [sp, #12]  : MemWrite :  D=0x A=0x1191a0
> 204600: