It looks like somehow O3 manages to take care of this even though none
of the other CPU models do:

TheISA::IntReg
FullO3CPU<Impl>::getSyscallArg(int i, int tid)
{
    assert(i < TheISA::NumArgumentRegs);
    TheISA::IntReg idx = TheISA::flattenIntIndex(this->tcBase(tid),
            TheISA::ArgumentReg[i]);
    TheISA::IntReg val = this->readArchIntReg(idx, tid);
#if THE_ISA == SPARC_ISA
    if (bits(this->readMiscRegNoEffect(SparcISA::MISCREG_PSTATE, tid), 3, 3))
        val = bits(val, 31, 0);
#endif
    return val;
}

Assuming that's actually the correct thing to do, I think the
conditional code there just needs to be added to the corresponding
function in src/cpu/simple_thread.hh (which I believe is the one that
all the other CPU models use).
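
To make the effect of that conditional concrete, here is a minimal
standalone sketch of the same check. It is not the actual patch: the
bits() helper below is a hypothetical stand-in for the simulator's own,
and maskSyscallArg() and PstateAmBit are names made up for illustration.
It assumes, as the O3 code does, that bit 3 of PSTATE (the address mask
bit, AM) is what marks a 32-bit process.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the simulator's bits() helper: returns
// bits [first:last] of val (inclusive, first >= last), right-justified.
inline std::uint64_t
bits(std::uint64_t val, int first, int last)
{
    assert(first >= last && first < 64 && last >= 0);
    const int nbits = first - last + 1;
    const std::uint64_t mask = (nbits >= 64)
        ? ~std::uint64_t(0)
        : ((std::uint64_t(1) << nbits) - 1);
    return (val >> last) & mask;
}

// Assumption: bit 3 of PSTATE is the address mask (AM) bit.
const int PstateAmBit = 3;

// Hypothetical free-function version of the check in getSyscallArg():
// when PSTATE.AM is set, keep only the low 32 bits of the raw value
// read from the argument register; otherwise pass it through.
inline std::uint64_t
maskSyscallArg(std::uint64_t raw, std::uint64_t pstate)
{
    if (bits(pstate, PstateAmBit, PstateAmBit))
        return bits(raw, 31, 0);
    return raw;
}
```

With the pointer from Jack's trace, maskSyscallArg(0x005f0100effffc1f, 0x8)
yields 0xeffffc1f, while with the AM bit clear the value passes through
unchanged.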
Steve
On Tue, Dec 16, 2008 at 12:15 PM, Ali Saidi <[email protected]> wrote:
> That's a pretty good find.
>
> I'm not sure what the correct answer is and there is one more
> possibility: It could be a compiler bug.
>
> Anyway, to eliminate some possibilities, according to the architecture
> manual:
>   SLL and SLLX shift all 64 bits of the value in R[rs1] left by the
>   number of bits specified by the shift count, replacing the vacated
>   positions with zeroes, and write the shifted result to R[rd].
>
> Although I can't find the specific text in the architecture manual,
> I'm pretty sure that the machine isn't supposed to go around
> clobbering the high 32 bits on anything but an explicit operation to
> do so.
>
> Looking at the Linux 32-bit system call entry code
> (http://lxr.linux.no/linux+v2.6.27.9/arch/sparc64/kernel/sys32.S)
> it appears to do an
> sra %o0, 0, %o0 which should sign-extend the low 32 bits of the
> register into the top 32 bits (discarding whatever was there),
> although I can't find the 32-bit version of write.
>
> My best guess is that it should happen in getSyscallArg() although
> you'll need to provide another interface into the ISA to determine if
> it should clobber the top 32 bits or not.
>
> Ali
>
>
>
>
> On Dec 16, 2008, at 9:32 AM, Jack Whitham wrote:
>
>> Hi,
>>
>> I have found a bug in the SPARC_SE simulator in the development and
>> stable repositories. The bug could be in one of several places, see
>> below. It manifests itself when a system call uses all 64 bits of an
>> input register even though only 32 bits are valid.
>>
>>
>> The bug is reproduced by the test case at:
>> http://www.jwhitham.org.uk/ri/sparcse_bugdemo.tar.gz
>>
>> when compiled as follows:
>>> sparc-unknown-linux-gnu-gcc -o bugdemo bugdemo.c \
>>> m_swap.c -O3 -g -static -D__BIG_ENDIAN__
>>
>> and then executed as follows:
>>> build/SPARC_SE/m5.debug --trace-file=/tmp/BUG --trace-flags=Exec \
>>> configs/example/se.py -c bugdemo
>>
>> using:
>>> sparc-unknown-linux-gnu-gcc (GCC) 4.1.0
>>
>> The bug causes the following output:
>>> info: Entering event queue @ 0. Starting simulation...
>>> fatal: readBlob(0x5f0100effffc1f, ...) failed
>>> @ cycle 2061000
>>> [readBlob:build/SPARC_SE/mem/translating_port.cc, line 72]
>>
>> The left shift operation can shift values into the top 32 bits of
>> each register:
>>> 2046000: system.cpu T0 : @SwapLONG+20 :
>>> sll %o0, %o0, %o0 : IntAlu : D=0x0000005f01000000
>>
>> So the SwapLONG function actually returns 0x005f01000000015f
>> rather than 0x15f. This seems to have no effect on the other
>> instructions (the assert statements all pass).
>>
>> Then the write system call uses these top 32 bits as part of the
>> pointer to be written to stdout:
>>> 2057000: system.cpu T0 : @main+72 :
>>> add %o1, %o0, %o1 : IntAlu : D=0x005f0100effffc1f
>>> 2057500: system.cpu T0 : @main+76 :
>>> clr %i0 : IntAlu : D=0x0000000000000000
>>> 2058000: system.cpu T0 : @main+80 :
>>> call 0x25b4c <__libc_write> : IntAlu :
>>> D=0x00000000000102c0
>>> 2058500: system.cpu T0 : @main+84 :
>>> mov 0x1, %o0 : IntAlu : D=0x0000000000000001
>>
>>
>> Of course, the address 0x005f0100effffc1f is outside the legal address
>> space, so the simulator crashes. This is not what GCC expects to
>> happen! I notice that GCC has optimised away some of the AND
>> operations in SwapLONG; if these had been kept, the bug would not be
>> triggered, so clearly GCC expected them to have no effect.
>>
>> I don't know whether I should change the implementation of "sll",
>> "write", tc->getSyscallArg(), or regs.readIntReg to fix this. I
>> suppose that the behaviour should change depending on the type of
>> trap used to trigger the system call, since there are two (0x10 ->
>> 32 bit syscall, 0x6d -> 64 bit syscall).
>>
>>
>> Any thoughts on this?
>>
>> Thanks in advance,
>>
>> Jack
>>
>>
>> --
>> Jack Whitham
>> [email protected]
>>
>> _______________________________________________
>> m5-users mailing list
>> [email protected]
>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>>