On Monday 22 January 2007 21:59, Jeff Dike wrote:
> On Sat, Jan 20, 2007 at 12:18:30AM +0100, Blaisorblade wrote:
> > Ok, I hope I remembered correctly how to debug such faults (I'm posting
> > the full procedure so you can give a look)
>
> Correct.
>
> > 0x00000000619a592f:     mov    %edx,%fs:(%rcx)  #faulting instruction.
>
> This and the registers involved are usually all you need.
>
> > RCX there is (long)regs->skas.regs[11] = -64, and for FS, since HOST_FS =
> > 25, I get:
> >
> > print/x regs->skas.regs[25]
> > $45 = 0x63
Since include/asm-x86_64/segment.h has:
#define FS_TLS_SEL ((GDT_ENTRY_TLS_MIN+FS_TLS)*8 + 3)
(which gives 0x63), this is the default value, ok.
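
As a cross-check of the 0x63 value, the selector can be split into its
architectural fields (index, table indicator, requested privilege level).
This is only a minimal sketch: the struct and the function name
decode_selector are made up for illustration and are not kernel or UML code.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (architectural layout).
 * Hypothetical helper type, not taken from the kernel sources. */
struct selector_fields {
    unsigned index; /* bits 15..3: index into the descriptor table */
    unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Split a 16-bit selector value into its three fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1;
    f.rpl = sel & 3;
    return f;
}
```

For sel = 0x63 this yields index 12, ti 0 (GDT) and rpl 3, which agrees with
the (GDT_ENTRY_TLS_MIN+FS_TLS)*8 + 3 form of the macro.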

> The presence of %fs in the instruction immediately suggests a TLS problem.
Yes, but the problem depends on miscompilation of UML (or rather, on using a
different compiler), and arch_prctl_skas does not do much work on x86_64
(as far as I can see). I'm astonished that so few users complain about TLS
on amd64 compared to x86 (there are far fewer users, ok, but the difference
is too large).

> Also, the trap number in cases like this should be 13, rather than the
> 14 you get with a normal page fault.

I remembered seeing 14 there, and indeed it is 14. See below:

(gdb) print regs->skas
$5 = {regs = {1, 547608288328, 547608288344, 4294967295, 547608288096, 
1077341640, 514, 0, 3399988123389603631, 18374403900871474943, 
18446744073709551614, 18446744073709551552, 2,
    0, 1076283584, 18446744073709551615, 1078130991, 51, 66067, 547608288008, 
43, 0, 0, 0, 0, 99, 0}, fp = {895, 0, 0, 281470681751424, 0 <repeats 60 
times>, 140733672859384},
  faultinfo = {error_code = 6, cr2 = 1613887512, trap_no = 14}, syscall = -1, 
is_user = 1}
(gdb) print/x regs->skas
$7 = {regs = {0x1, 0x7f7fff5c48, 0x7f7fff5c58, 0xffffffff, 0x7f7fff5b60, 
0x4036edc8, 0x202, 0x0, 0x2f2f2f2f2f2f2f2f, 0xfefefefefefefeff, 
0xfffffffffffffffe, 0xffffffffffffffc0,
    0x2, 0x0, 0x4026c8c0, 0xffffffffffffffff, 0x4042f92f, 0x33, 0x10213, 
0x7f7fff5b08, 0x2b, 0x0, 0x0, 0x0, 0x0, 0x63, 0x0}, fp = {0x37f, 0x0, 0x0, 
0xffff00001f80,
    0x0 <repeats 60 times>, 0x7fff1c9426f8}, faultinfo = {error_code = 0x6, 
cr2 = 0x6031f818, trap_no = 0xe}, syscall = 0xffffffffffffffff, is_user = 
0x1}
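
To read the faultinfo fields above: trap 14 is a page fault (13 would be a
general protection fault), and the low three bits of a page-fault error code
have an architecturally defined meaning. The sketch below decodes them; the
struct and the function name decode_pf_error are illustrative only and do
not come from the UML sources.

```c
#include <stdbool.h>

/* Low three bits of an x86 page-fault error code.
 * Hypothetical helper type for illustration. */
struct pf_error {
    bool present; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;   /* bit 1: 1 = write access, 0 = read access */
    bool user;    /* bit 2: 1 = fault raised in user mode */
};

/* Decode the low bits of a page-fault error code. */
struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;
    e.present = (code & 1) != 0;
    e.write = (code & 2) != 0;
    e.user = (code & 4) != 0;
    return e;
}
```

With error_code = 6 this gives present = false, write = true, user = true,
i.e. a user-mode write to a page that was not present.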

Now, some new information:
(gdb) print $rsp
$9 = (void *) 0x60a23da0

> %fs isn't 0, so that's one thing that's not wrong.  What's in the
> corresponding segment?

regs[FS_BASE] and regs[GS_BASE] are both 0.
However, from strace I can see that
brk(0)                                  = 0x6031f000
brk(0x6031f880)                         = 0x6031f880
arch_prctl(ARCH_SET_FS, 0x6031f858)     = 0
is called at UML boot, very early, by glibc.

All 4 runs show the same address in this call (no randomization here?
Strange!).
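
One piece of arithmetic may be worth checking here. Assuming the FS base in
the faulting context is the value glibc passed to arch_prctl (an assumption,
since regs[FS_BASE] reads 0), the address touched by mov %edx,%fs:(%rcx) is
the base plus the signed value of RCX. A minimal sketch, with a made-up
helper name:

```c
#include <stdint.h>

/* Effective address of a segment-relative access: base plus a signed
 * offset, wrapping modulo 2^64. Hypothetical helper for illustration. */
uint64_t effective_address(uint64_t seg_base, int64_t offset)
{
    return seg_base + (uint64_t)offset;
}
```

effective_address(0x6031f858, -64) is 0x6031f818, which is the same value
as cr2 in the faultinfo dump.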

On the working binary, instead, this is what I get:
brk(0)                                  = 0x6031b000
brk(0x6031b880)                         = 0x6031b880 # malloc?
arch_prctl(ARCH_SET_FS, 0x6031b858)     = 0 #use the result, with some offset.

I do not think the exact address makes a difference (it will likely depend on 
the binary size).

I'll take a closer look later. Bye!
-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade