Interesting... there's nothing conclusive here, but the symbols
on the instructions at tick 172000 show that this address is
probably TLS-related too. So the good news is that this could be
the same bug or a related one. I think the key thing is to
figure out what the Linux TLS structure is supposed to look like.
One thing that's puzzling me is why all this is coming up now
when we've run almost all of SPEC 2000 without any problems.
Anyone else have any ideas?
Steve
On 9/9/07, *Elliott Cooper-Balis* <[EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>> wrote:
r0 gets set in the instruction right before the load into r20:
2174500: system.cpu0 T0 : @__strtol_internal+24 : addq r0,r1,r0 : IntAlu : D=0x00000001200944f0
2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq r20,0(r0) : MemRead : D=0x0000000000000000 A=0x1200944f0
and it doesn't look like address 0x1200944f0 gets used as an
actual address anywhere else, but here are all the other
references to it:
172000: system.cpu0 T0 : @__libc_setup_tls+304 : addq r10,r13,r16 : IntAlu : D=0x00000001200944f0
172500: system.cpu0 T0 : @__libc_setup_tls+308 : stq r16,16(r9) : MemWrite : D=0x00000001200944f0 A=0x120092050
180000: system.cpu0 T0 : @memcpy+32 : bis r31,r16,r12 : IntAlu : D=0x00000001200944f0
181000: system.cpu0 T0 : @memcpy+40 : bis r31,r16,r9 : IntAlu : D=0x00000001200944f0
184000: system.cpu0 T0 : @memcpy+256 : bis r31,r12,r0 : IntAlu : D=0x00000001200944f0
2174500: system.cpu0 T0 : @__strtol_internal+24 : addq r0,r1,r0 : IntAlu : D=0x00000001200944f0
2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq r20,0(r0) : MemRead : D=0x0000000000000000 A=0x1200944f0
thanks again for all the help and sorry for being such a pain
in the ass.
*/Steve Reinhardt <[EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>>/* wrote:
The instruction at tick 2175000 loads r20 from memory
location 0x1200944f0 so the earlier refs are irrelevant.
The next questions are where does r0 get set immediately
prior to 2175000 (i.e. does 0x1200944f0 make sense as an
address) and where else does 0x1200944f0 get accessed...
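The "where does r0 get set" question can be answered mechanically from an Exec trace. The following is a minimal Python sketch, not part of M5: the helper names (parse_exec_line, dest_register, last_def) are made up for illustration, it assumes each trace record sits on a single line in tick order with fields separated by " : ", and the destination-register rule is a heuristic that only covers the instruction forms quoted in this thread (loads write their first operand, stores write no register, other operations write their last operand).

```python
def parse_exec_line(line):
    """Split one Exec trace line into (tick, instruction text, op class), or None."""
    parts = [p.strip() for p in line.split(" : ")]
    if len(parts) < 4:
        return None
    tick_text = parts[0].split(":", 1)[0].strip()
    if not tick_text.isdigit():
        return None
    return int(tick_text), parts[2], parts[3]


def dest_register(inst, op_class):
    """Heuristic destination register (assumption, see lead-in)."""
    fields = inst.split(None, 1)
    if len(fields) < 2 or op_class == "MemWrite":
        return None  # no operands, or a store (writes memory, not a register)
    operands = [o.strip() for o in fields[1].split(",")]
    # Loads name the destination first; operate instructions name it last.
    return operands[0] if op_class == "MemRead" else operands[-1]


def last_def(lines, reg, before_tick):
    """Return (tick, instruction) of the last write to reg before before_tick."""
    found = None
    for line in lines:
        parsed = parse_exec_line(line)
        if parsed is None:
            continue
        tick, inst, op_class = parsed
        if tick < before_tick and dest_register(inst, op_class) == reg:
            found = (tick, inst)
    return found


if __name__ == "__main__":
    # Two records copied from the specrand trace in this thread.
    trace = [
        "2174500: system.cpu0 T0 : @__strtol_internal+24 : addq r0,r1,r0 : IntAlu : D=0x00000001200944f0",
        "2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq r20,0(r0) : MemRead : D=0x0000000000000000 A=0x1200944f0",
    ]
    print(last_def(trace, "r0", 2175000))
```

Calling last_def repeatedly (first for r20, then for the base register of the load that produced it) walks the dependence chain backward one step at a time.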
Steve
On 9/9/07, *Elliott Cooper-Balis* < [EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>> wrote:
here are all the instances of r20 in the specrand
benchmark. i'm sorry i can't be of more help in
debugging this issue :
4500: system.cpu0 T0 : @_start+36 : ldq r20,-32440(r29) : MemRead : D=0x0000000120000eb8 A=0x1200907a0
15000: system.cpu0 T0 : @__libc_start_main+60 : bis r31,r20,r15 : IntAlu : D=0x0000000120000eb8
293000: system.cpu0 T0 : @__geteuid+20 : bis r31,r20,r0 : IntAlu : D=0x0000000000000064
305500: system.cpu0 T0 : @__getegid+20 : bis r31,r20,r0 : IntAlu : D=0x0000000000000064
2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq r20,0(r0) : MemRead : D=0x0000000000000000 A=0x1200944f0
2183500: system.cpu0 T0 : @____strtoll_l_internal+56 : bis r31,r20,r11 : IntAlu : D=0x0000000000000000
2184000: system.cpu0 T0 : @____strtoll_l_internal+60 : ldq r3,8(r20) : MemRead : A=0x8
the last of which is the instruction causing the
page fault.
elliott
*/Steve Reinhardt < [EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>>/* wrote:
Interesting... my guess with perl then is that
the Linux kernel is supposed to be initializing
some value in the thread-local storage that we're
not initializing. Unfortunately the only way to
track that down is usually to go reading the
kernel source... though if you find a spot where
they define a base TLS struct then that should
give it to you. Anyone else out there on the
list have any experience with this?
As far as specrand it's impossible to say what
the problem is without going backward further in
the trace to see where r20 is coming from. If
r20 also comes from reading something out of the
TLS area then it could well be the same bug.
Steve
On 9/9/07, *Elliott Cooper-Balis* <
[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>>
wrote:
hey steve,
i tried both of your suggestions, and i think the
latter might give a good clue, as the memory
address which causes the fault is not referenced
at any other point in the program.
here is the result of grep'ing for the
address in the execution trace :
>grep 12022e50 exec.out
5278458500: system.cpu0 T0 : @__printf_fp+128 : addq r0,r1,r0 : IntAlu : D=0x000000012022e508
5278459000: system.cpu0 T0 : @__printf_fp+132 : ldq r1,0(r0) : MemRead : D=0x0000000000000000 A=0x12022e508
which are the 2 instructions right before the
fault and the only 2 instances of it being
referenced.
i tried digging around a little more to see
if this address in particular was causing the
problems. unfortunately, that doesn't appear
to be the case. the benchmark we have been
discussing is the Perl benchmark in SPEC06.
i ran the random number generator benchmark
as well ( 999.specrand) and here is the
execution output just before its page fault :
[EMAIL PROTECTED]:~/Development/M5/m5-2.0b3/build/ALPHA_SE$ ./m5.debug \
    --trace-flags=Exec,Syscall,SyscallVerbose --trace-start=2000000 \
    ../../configs/example/se.py \
    -c benchmarks/999.specrand/exe/specrand_base.amd64-m64-gcc41-nn \
    -o "4 3943"
....
2183000: system.cpu0 T0 : @____strtoll_l_internal+52 : bis r31,r18,r10 : IntAlu : D=0x000000000000000a
2183500: system.cpu0 T0 : @____strtoll_l_internal+56 : bis r31,r20,r11 : IntAlu : D=0x0000000000000000
2184000: system.cpu0 T0 : @____strtoll_l_internal+60 : ldq r3,8(r20) : MemRead : A=0x8
panic: Page table fault when accessing virtual address 0x8
 @ cycle 2184000
[invoke:build/ALPHA_SE/sim/faults.cc, line 65]
Program aborted at cycle 2184000
Aborted (core dumped)
unfortunately, there don't appear to be (at
least to me) any similarities between the two
benchmarks' output.
elliott
* /Steve Reinhardt < [EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>>/* wrote:
It's not obvious, but it does give some
clues...
The null pointer is being read from
memory address 0x12022e508, so either
that's a bogus address or the memory
location doesn't have the right value
(not getting initialized or getting
clobbered at some point).
The pointer address is computed by adding
the uniq register (put into R0 by
"call_pal rduniq") and some value (0x28)
read from -29160(r29)... I think that's
the global constant pool. The uniq reg
is used as a pointer to thread-local
storage. So basically it's reading the
null value out of thread-local storage.
It could be that that's a value that the
OS is supposed to provide but we're not
initializing it properly.
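The address arithmetic described above can be checked in a few lines. This is only a sketch in Python: effective_address is a made-up helper modelling base plus signed displacement on a 64-bit machine, and the uniq value is derived by subtraction from the two numbers given in this thread (the 0x28 offset and the load address 0x12022e508), not read from any trace.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def effective_address(base, disp):
    """Base register plus signed displacement, wrapped to 64 bits."""
    return (base + disp) & MASK64


tls_offset = 0x28          # value read from -29160(r29), per the analysis above
slot_addr = 0x12022e508    # address the null pointer is loaded from (A= field)

# Implied thread pointer (uniq). Derived here, an assumption rather than
# something observed directly in the trace.
uniq = slot_addr - tls_offset

if __name__ == "__main__":
    print(hex(uniq))
    print(hex(effective_address(uniq, tls_offset)))
    # Dereferencing a null pointer at displacement 8 touches address 0x8,
    # which is the shape of the specrand fault quoted in this thread.
    print(hex(effective_address(0, 8)))
```

If the derived uniq value does not point into the region that __libc_setup_tls sets up, that would suggest the thread pointer itself is wrong rather than the slot contents.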
I'd do two more things to try and get
some more clues:
- run with just --trace-flags=Syscall
(and no --trace-start) to get a complete
syscall trace, then look at whatever the
last few syscalls are, and see what they
are and how closely they precede the crash
- run with just --trace-flags=Exec (and
no --trace-start) and then pipe the trace
through "egrep -i '12022e50[89a-f]' " to
look at all the other references to that
memory location... is it ever written, if
it's read before is it always zero, etc.
This will take a while...
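As an alternative to egrep for the second step, here is a minimal Python sketch that scans an Exec trace for memory accesses to the aligned 8-byte word containing a given address (0x12022e508 through 0x12022e50f in this case) and labels each one as a read or a write. The function name refs_to_quadword is hypothetical, and the sketch assumes one trace record per line with the address in an A=0x... field. Unlike egrep, it ignores lines where the address only shows up as a register value in the D= field.

```python
import re

# Tick at the start of the line, memory address in the A= field.
LINE_RE = re.compile(r"^\s*(\d+):.*\bA=0x([0-9a-fA-F]+)")


def refs_to_quadword(lines, addr):
    """Yield (tick, kind, line) for accesses to the 8-byte word holding addr."""
    base = addr & ~0x7
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        accessed = int(m.group(2), 16)
        if base <= accessed < base + 8:
            if "MemWrite" in line:
                kind = "write"
            elif "MemRead" in line:
                kind = "read"
            else:
                kind = "other"
            yield int(m.group(1)), kind, line.rstrip()


if __name__ == "__main__":
    import sys
    # Usage (assumed file name): python refs.py exec.out
    with open(sys.argv[1]) as trace:
        for tick, kind, text in refs_to_quadword(trace, 0x12022e508):
            print(tick, kind, text)
```

If the output contains no "write" entries at all, the location was never initialized by the simulated program, which points toward something the loader or OS emulation is expected to set up.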
Steve
On 9/7/07, *Elliott Cooper-Balis* <
[EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>> wrote:
here is the output. is there
anything obvious that might be broken?
_______________________________________________
m5-users mailing list
m5-users@m5sim.org
<mailto:m5-users@m5sim.org>
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users