Steve, 
  So after rebuilding the cross-compiler without TLS, the benchmarks worked!
Thanks again for all the help and patience.  

Elliott

Elliott Cooper-Balis <[EMAIL PROTECTED]> wrote:
Ali,
  To be honest, I'm not entirely sure.  I just used the default script to build
the alpha versions of gcc/g++/gfortran that came with crosstool
( http://www.kegel.com/crosstool/ ).  When I get home, I will see if there are
options for TLS in the script.  Thanks

Elliott

Ali Saidi <[EMAIL PROTECTED]> wrote:
Elliott, did you compile your toolchain with or without TLS? While I haven't
run SPEC2006, I've always used a non-TLS toolchain. Perhaps that is another way
around the problem? Or you could see what values are put in the TLS area and
fill them in (see src/arch/alpha/process.* and src/sim/process.*).

Ali



On Sep 10, 2007, at 1:52 AM, Steve Reinhardt wrote:

Interesting... there's nothing conclusive here, but the symbols on the 
instructions at tick 172000 show that this address is probably TLS-related too. 
 So the good news is that this could be the same bug or a related one.  I think 
the key thing is to figure out what the Linux TLS structure is supposed to look 
like. 

One thing that's puzzling me is why all this is coming up now when we've run 
almost all of spec 2000 without any problems.

Anyone else have any ideas?

Steve

On 9/9/07, Elliott Cooper-Balis <[EMAIL PROTECTED]> wrote:
r0 gets set in the instruction right before the load into r20:

2174500: system.cpu0 T0 : @__strtol_internal+24 : addq       r0,r1,r0        : IntAlu :  D=0x00000001200944f0
2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq        r20,0(r0)       : MemRead :  D=0x0000000000000000 A=0x1200944f0


and it doesn't look like address 0x1200944f0 gets used as an actual address
anywhere else, but here are all the other references to it:

 172000: system.cpu0 T0 : @__libc_setup_tls+304 : addq       r10,r13,r16     : IntAlu :  D=0x00000001200944f0
 172500: system.cpu0 T0 : @__libc_setup_tls+308 : stq        r16,16(r9)      : MemWrite :  D=0x00000001200944f0 A=0x120092050
 180000: system.cpu0 T0 : @memcpy+32 : bis        r31,r16,r12     : IntAlu :  D=0x00000001200944f0
 181000: system.cpu0 T0 : @memcpy+40 : bis        r31,r16,r9      : IntAlu :  D=0x00000001200944f0
 184000: system.cpu0 T0 : @memcpy+256 : bis        r31,r12,r0      : IntAlu :  D=0x00000001200944f0
2174500: system.cpu0 T0 : @__strtol_internal+24 : addq       r0,r1,r0        : IntAlu :  D=0x00000001200944f0
2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq        r20,0(r0)       : MemRead :  D=0x0000000000000000 A=0x1200944f0


thanks again for all the help and sorry for being such a pain in the ass.

Steve Reinhardt <[EMAIL PROTECTED]> wrote:
The instruction at tick 2175000 loads r20 from memory location 0x1200944f0, so
the earlier refs are irrelevant. The next questions are where r0 gets set
immediately prior to 2175000 (i.e., does 0x1200944f0 make sense as an address)
and where else 0x1200944f0 gets accessed...

Steve

On 9/9/07, Elliott Cooper-Balis <[EMAIL PROTECTED]> wrote:
here are all the instances of r20 in the specrand benchmark.  i'm sorry i can't
be of more help in debugging this issue:

   4500: system.cpu0 T0 : @_start+36 : ldq        r20,-32440(r29) : MemRead :  D=0x0000000120000eb8 A=0x1200907a0
  15000: system.cpu0 T0 : @__libc_start_main+60 : bis        r31,r20,r15     : IntAlu :  D=0x0000000120000eb8
 293000: system.cpu0 T0 : @__geteuid+20 : bis        r31,r20,r0      : IntAlu :  D=0x0000000000000064
 305500: system.cpu0 T0 : @__getegid+20 : bis        r31,r20,r0      : IntAlu :  D=0x0000000000000064
2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq        r20,0(r0)       : MemRead :  D=0x0000000000000000 A=0x1200944f0
2183500: system.cpu0 T0 : @____strtoll_l_internal+56 : bis        r31,r20,r11     : IntAlu :  D=0x0000000000000000
2184000: system.cpu0 T0 : @____strtoll_l_internal+60 : ldq        r3,8(r20)       : MemRead :  A=0x8


the last of which is the instruction causing the page fault.
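
To make the fault address concrete: a load of the form "ldq ra,disp(rb)" accesses the address rb + disp. The sketch below is only illustrative arithmetic (the helper name effective_address is made up for this example, not something from M5). It shows why a zero in r20 turns "ldq r3,8(r20)" into an access to virtual address 0x8.

```python
MASK64 = (1 << 64) - 1  # Alpha registers are 64 bits wide


def effective_address(base_value, displacement):
    """Address touched by 'ldq ra,disp(rb)': rb + disp, wrapped to 64 bits.

    Hypothetical helper for illustration only.
    """
    return (base_value + displacement) & MASK64


# ldq r20,0(r0) with r0 = 0x1200944f0 (tick 2175000 in the trace)
print(hex(effective_address(0x1200944F0, 0)))

# ldq r3,8(r20) with r20 = 0 (tick 2184000), the faulting access
print(hex(effective_address(0x0, 8)))
```

So the page fault at 0x8 is a symptom; the zero loaded into r20 at tick 2175000 is the thing to explain.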

elliott

Steve Reinhardt <[EMAIL PROTECTED]> wrote:
Interesting... my guess with perl then is that the Linux kernel is supposed to
be initializing some value in the thread-local storage that we're not
initializing.  Unfortunately the only way to track that down is usually to go
reading the kernel source... though if you find a spot where they define a base
TLS struct then that should give it to you.  Anyone else out there on the list
have any experience with this?

As far as specrand it's impossible to say what the problem is without going 
backward further in the trace to see where r20 is coming from.  If r20 also 
comes from reading something out of the TLS area then it could well be the same 
bug.  
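
Going backward in the trace to see where a register comes from can be partly automated. The following is a rough sketch, not an M5 tool: the function names are made up, it assumes each trace record sits on one line in the "tick: cpu T0 : @symbol : instruction : class : data" layout seen in this thread, and it guesses the destination register from the operation class (last operand for IntAlu, first operand for MemRead), which is only a heuristic.

```python
def parse(line):
    """Split one Exec trace line into (tick, mnemonic, operands, op_class)."""
    parts = [p.strip() for p in line.split(":", 5)]
    if len(parts) < 5 or not parts[0].isdigit():
        return None
    fields = parts[3].split()
    if len(fields) != 2:
        return None
    mnemonic, operands = fields
    return int(parts[0]), mnemonic, operands.split(","), parts[4]


def last_writer(lines, reg, before_tick):
    """Return the last record before before_tick that appears to write reg."""
    result = None
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        tick, mnemonic, ops, op_class = rec
        if tick >= before_tick:
            break  # assumes the trace is in time order
        if op_class == "IntAlu":
            dest = ops[-1]   # e.g. addq r0,r1,r0 writes the last operand
        elif op_class == "MemRead":
            dest = ops[0]    # e.g. ldq r20,0(r0) writes the first operand
        else:
            dest = None      # stores and anything else: no register result assumed
        if dest == reg:
            result = (tick, mnemonic, ",".join(ops))
    return result


# Lines copied from the specrand trace in this thread.
trace = [
    "   4500: system.cpu0 T0 : @_start+36 : ldq        r20,-32440(r29) : MemRead :  D=0x0000000120000eb8 A=0x1200907a0",
    "  15000: system.cpu0 T0 : @__libc_start_main+60 : bis        r31,r20,r15     : IntAlu :  D=0x0000000120000eb8",
    "2175000: system.cpu0 T0 : @__strtol_internal+28 : ldq        r20,0(r0)       : MemRead :  D=0x0000000000000000 A=0x1200944f0",
    "2183500: system.cpu0 T0 : @____strtoll_l_internal+56 : bis        r31,r20,r11     : IntAlu :  D=0x0000000000000000",
    "2184000: system.cpu0 T0 : @____strtoll_l_internal+60 : ldq        r3,8(r20)       : MemRead :  A=0x8",
]

print(last_writer(trace, "r20", 2184000))
```

On these lines it points at the load at tick 2175000; applying it again to r0 at that tick would be the next step back. Records that do not fit the assumed layout (for example call_pal) are simply skipped or ignored as writers.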

Steve

On 9/9/07, Elliott Cooper-Balis <[EMAIL PROTECTED]> wrote:
hey steve,
  i tried both of your suggestions, and i think the latter might give a good
clue, as the memory address which causes the fault is not referenced at any
other point in the program.
  
  here is the result of grep'ing for the address in the execution trace  :  

 >grep 12022e50 exec.out
5278458500: system.cpu0 T0 : @__printf_fp+128 : addq       r0,r1,r0        : IntAlu :  D=0x000000012022e508
5278459000: system.cpu0 T0 : @__printf_fp+132 : ldq        r1,0(r0)        : MemRead :  D=0x0000000000000000 A=0x12022e508

which are the 2 instructions right before the fault and the only 2 instances of 
it being referenced. 

i tried digging around a little more to see if this address in particular was
causing the problems.  unfortunately, that doesn't appear to be the case.  the
benchmark we have been discussing is the Perl benchmark in SPEC2006.  i ran the
random number generator benchmark as well (999.specrand) and here is the
execution output just before its page fault:

[EMAIL PROTECTED]:~/Development/M5/m5-2.0b3/build/ALPHA_SE$ ./m5.debug \
    --trace-flags=Exec,Syscall,SyscallVerbose --trace-start=2000000 \
    ../../configs/example/se.py \
    -c benchmarks/999.specrand/exe/specrand_base.amd64-m64-gcc41-nn -o "4 3943"

....

2183000: system.cpu0 T0 : @____strtoll_l_internal+52 : bis        r31,r18,r10     : IntAlu :  D=0x000000000000000a
2183500: system.cpu0 T0 : @____strtoll_l_internal+56 : bis        r31,r20,r11     : IntAlu :  D=0x0000000000000000
2184000: system.cpu0 T0 : @____strtoll_l_internal+60 : ldq        r3,8(r20)       : MemRead :  A=0x8
panic: Page table fault when accessing virtual address 0x8
 @ cycle 2184000
[invoke:build/ALPHA_SE/sim/faults.cc, line 65]
Program aborted at cycle 2184000
Aborted (core dumped)

unfortunately, there doesn't appear to be (at least to me) any similarity
between the two benchmarks' output.


elliott

Steve Reinhardt <[EMAIL PROTECTED]> wrote:
It's not obvious, but it does give some clues...

The null pointer is being read from memory address 0x12022e508, so either
that's a bogus address or the memory location doesn't have the right value (not
getting initialized or getting clobbered at some point).

The pointer address is computed by adding the uniq register (put into r0 by
"call_pal rduniq") and some value (0x28) read from -29160(r29)... I think
that's the global constant pool.  The uniq reg is used as a pointer to
thread-local storage.  So basically it's reading the null value out of
thread-local storage.  It could be that that's a value that the OS is supposed
to provide but we're not initializing it properly.
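
As a small worked example of that address computation: if the pointer address really is uniq + 0x28, the value the uniq register held follows by subtraction. This is only arithmetic on the two numbers quoted above; the resulting uniq value does not appear in the trace excerpts in this thread, so treat it as an inference rather than an observed value.

```python
# Values quoted in the message above.
pointer_addr = 0x12022E508  # address the null pointer was loaded from
offset = 0x28               # value read from -29160(r29), per the message

# If pointer_addr = uniq + offset, the uniq (TLS pointer) value is:
uniq = pointer_addr - offset
print(hex(uniq))

# Sanity check: adding the offset back gives the load address.
print(hex(uniq + offset))
```

Under that assumption the zero is being read 0x28 bytes past wherever uniq points, i.e. from a slot inside the thread-local storage area.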

I'd do two more things to try and get some more clues:

- run with just --trace-flags=Syscall (and no --trace-start) to get a complete
  syscall trace, then look at whatever the last few syscalls are, and see what
  they are and how closely they precede the crash
- run with just --trace-flags=Exec (and no --trace-start) and then pipe the
  trace through "egrep -i '12022e50[0-7]' " to look at all the other references
  to that memory location... is it ever written, if it's read before is it
  always zero, etc.  This will take a while...
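
The second step can also be done with a numeric range check instead of a text pattern. This is a sketch only: the function name is made up, and it assumes one trace record per line with the D= and A= fields shown elsewhere in this thread. Note that the quadword loaded from 0x12022e508 occupies bytes 0x12022e508 through 0x12022e50f, whereas the quoted egrep pattern covers 0x12022e500 through 0x12022e507, so the sketch takes the base address explicitly.

```python
import re

# Captures the field name (A or D) and its hex value, e.g. "A=0x12022e508".
HEX_FIELD = re.compile(r"\b([AD])=0x([0-9a-fA-F]+)")


def references(lines, base, size=8):
    """Return (field, line) for lines whose A= or D= value is in [base, base+size)."""
    hits = []
    for line in lines:
        for field, value in HEX_FIELD.findall(line):
            if base <= int(value, 16) < base + size:
                hits.append((field, line.rstrip()))
                break  # one hit per line is enough
    return hits


# Two lines from the perl trace in this thread plus one unrelated line.
sample = [
    "5278458500: system.cpu0 T0 : @__printf_fp+128 : addq       r0,r1,r0        : IntAlu :  D=0x000000012022e508",
    "5278459000: system.cpu0 T0 : @__printf_fp+132 : ldq        r1,0(r0)        : MemRead :  D=0x0000000000000000 A=0x12022e508",
    " 180000: system.cpu0 T0 : @memcpy+32 : bis        r31,r16,r12     : IntAlu :  D=0x00000001200944f0",
]

for field, line in references(sample, 0x12022E508):
    print(field, line)
```

For a full trace, the same function can be fed an open file, for example references(open("exec.out"), 0x12022E508). A hit on the A field of a MemWrite record would show the location being written; hits only on MemRead records suggest it is never initialized.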

Steve

On 9/7/07, Elliott  Cooper-Balis < [EMAIL PROTECTED]> wrote:  here is the 
output.  is there anything obvious that might be broken? 



_______________________________________________
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
        
