Now that I think of it, the instructions following the system call should
have been flushed and re-executed after the system call was emulated.
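
To make the idea concrete, here is a toy Python model of the hazard. It is
not gem5 code: the function name, the two-instruction program, and the
squash flag are all made up for illustration, and only the address 0x87000
and the byte values 0x00 / 0x31 are taken from the trace quoted below. It
assumes the load issues before the syscall is emulated at commit, and shows
that re-executing younger loads after the syscall returns the new data.

```python
# Toy model (hypothetical, not gem5's implementation).
BUF = 0x87000                # buffer address seen in the trace
STALE, FRESH = 0x00, 0x31    # first byte before / after the emulated read()

# Program order: the syscall (producer) precedes the load (consumer).
PROGRAM = [("syscall_read", BUF), ("load", BUF)]


def run(program, squash_after_syscall):
    memory = {BUF: STALE}

    # Out-of-order issue: every load reads memory early, before the
    # syscall has been emulated.
    speculative = {
        i: memory[addr]
        for i, (op, addr) in enumerate(program)
        if op == "load"
    }

    # In-order commit: the syscall is emulated when it reaches the head.
    committed = []
    for i, (op, addr) in enumerate(program):
        if op == "syscall_read":
            memory[addr] = FRESH  # functional write by the emulated syscall
            if squash_after_syscall:
                # Flush younger loads and re-execute them against memory.
                for j in range(i + 1, len(program)):
                    if program[j][0] == "load":
                        speculative[j] = memory[program[j][1]]
        else:
            committed.append(speculative[i])
    return committed


print(run(PROGRAM, squash_after_syscall=False))  # [0]  (stale value)
print(run(PROGRAM, squash_after_syscall=True))   # [49] (0x31, fresh value)
```

Without the squash, the load commits the zero it read early, which matches
the D=0x0000000000000000 on the ldrb in the Exec trace.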

On Tue, Oct 11, 2011 at 1:18 PM, Min Kyu Jeong <[email protected]> wrote:

> Hi, all
>
> In SE mode with the O3 CPU, I am seeing a memory ordering issue between a
> load and a system call.
> The expected scenario is that the system call is executed (emulated) and
> writes data to memory (syscall read), and then the load consumes it.
> However, the load (memory read) somehow reaches memory before the system
> call executes, and fetches the old value.
>
> The following is the gem5 trace showing this behavior.
>
> 34690000: system.cpu.dcache: ReadReq 87000 miss
> 34692000: system.coretol2buses: recvTiming: src 2 dst -1 ReadReq 0x87000
> 34692000: system.l2: ReadReq 87000 miss
> 34692000: system.coretol2buses: The bus is now occupied from tick 34692000 to 34693000
> 34693000: system.cpu.icache: ReadReq (ifetch) 5200 hit
> 34704000: system.membus: recvTiming: src 2 dst -1 ReadReq 0x87000
> 34704000: system.physmem-port0: recvTiming: ReadReq 0x87000
> 34704000: system.physmem: enq: 0 ReadReq 0x87000
> 34704000: system.physmem: Read of size 64 on address 0x87000
> 34704000: system.physmem: 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> 34704000: system.physmem: 00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> 34704000: system.physmem: 00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> 34704000: system.physmem: 00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> ...
> 34751000: system.cpu: syscall read called w/arguments 0,8192,1073750016,3
> 34751000: system.cpu.dcache: functional WriteReq 87000
> 34751000: system.coretol2buses: recvFunctional: packet src 2 dest 0 addr 0x87000 cmd WriteReq
> 34751000: system.cpu.icache: functional WriteReq 87000
> 34751000: system.l2: functional WriteReq 87000
> 34751000: system.membus: recvFunctional: packet src 2 dest 1 addr 0x87000 cmd WriteReq
> 34751000: system.physmem-port0: recvFunctional: WriteReq 0x87000
> 34751000: system.physmem: Write of size 64 on address 0x87000
> 34751000: system.physmem: 00000000  31 38 30 30 20 32 37 39  30 20 0a 32 32 34 20 32   1800 2790  224 2
> 34751000: system.physmem: 00000010  32 38 0a 32 32 34 20 32  35 37 0a 32 32 36 20 32   28 224 257 226 2
> 34751000: system.physmem: 00000020  34 36 0a 32 33 30 20 32  35 34 0a 32 33 31 20 32   46 230 254 231 2
> 34751000: system.physmem: 00000030  34 31 0a 32 33 32 20 32  36 36 0a 32 33 35 20 32   41 232 266 235 2
>
> The following is the Exec trace. (@__libc_read+24: svc is the producer
> system call and @_IO_new_file_underflow+293 is the consumer load.) So the
> producer-consumer relation exists in program order.
>
> system.cpu T0 : @__libc_read+24    :   svc                      : IntAlu :
> system.cpu T0 : @__libc_read+28    :   mov   r7, r12            : IntAlu :  D=0x0000000000000000
> system.cpu T0 : @__libc_read+32    :   cmns   r0, #4096         : IntAlu :  D=0x0000000000000000
> system.cpu T0 : @__libc_read+36    :   bxcc                     : IntAlu :
> system.cpu T0 : @_IO_file_read+17    :   add   sp, sp, #8         : IntAlu :  D=0x00000000befffbf8
> system.cpu T0 : @_IO_file_read+19.0  :   addi_uop   r34, sp, #0   : IntAlu :  D=0x00000000befffbf8
> system.cpu T0 : @_IO_file_read+19.1  :   ldr_uop   r6, [r34, #0]  : MemRead :  D=0x0000000000000000 A=0xbefffbf8
> system.cpu T0 : @_IO_file_read+19.2  :   ldr_uop   r35, [r34, #4] : MemRead :  D=0x000000000000d1df A=0xbefffbfc
> system.cpu T0 : @_IO_file_read+19.3  :   addi_uop   sp, sp, #8    : IntAlu :  D=0x00000000befffc00
> system.cpu T0 : @_IO_file_read+19.4  :   uopReg_uop   pc, r35     : IntAlu :  D=0x000000000000d1df
> system.cpu T0 : @_IO_new_file_underflow+261    :   cmps   r0, #0  : IntAlu :  D=0x0000000000000001
> system.cpu T0 : @_IO_new_file_underflow+263    :   b  : IntAlu : Predicated False
> system.cpu T0 : @_IO_new_file_underflow+265    :   ldr   r1, [r5, #8] : MemRead :  D=0x0000000040002000 A=0x717d0
> system.cpu T0 : @_IO_new_file_underflow+267    :   ldrd.w   r2, r3, [r5, #80] : MemRead :  D=0x00000000ffffffff A=0x71818
> system.cpu T0 : @_IO_new_file_underflow+271    :   adds   r1, r1, r0  : IntAlu :  D=0x0000000000000000
> system.cpu T0 : @_IO_new_file_underflow+273    :   str   r1, [r5, #8] : MemWrite :  D=0x0000000040004000 A=0x717d0
> system.cpu T0 : @_IO_new_file_underflow+275    :   cmps.w   r2, #4294967295 : IntAlu :  D=0x0000000000000001
> system.cpu T0 : @_IO_new_file_underflow+279    :   b  : IntAlu :
> system.cpu T0 : @_IO_new_file_underflow+355    :   cmps.w   r3, #4294967295 : IntAlu :  D=0x0000000000000001
> system.cpu T0 : @_IO_new_file_underflow+359    :   b  : IntAlu : Predicated False
> system.cpu T0 : @_IO_new_file_underflow+361    :   b  : IntAlu :
> system.cpu T0 : @_IO_new_file_underflow+291    :   ldr   r3, [r5, #4] : MemRead :  D=0x0000000040002000 A=0x717cc
> system.cpu T0 : @_IO_new_file_underflow+293    :   ldrb   r0, [r3, #0]  : MemRead :  D=0x0000000000000000 A=0x40002000
>
> With the timing CPU, they are executed in order and everything is fine. It
> appears that the system call is emulated at the head of the ROB, while the
> load goes ahead. I would expect a mechanism that either (1) suppresses the
> load while an older system call is pending, or (2) checks the loaded value
> later and squashes if it has changed. (I haven't checked the relevant code
> yet.) Any idea why this is not working, or where I should look to find
> out?
>
> Thanks,
>
> Min
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
