On Tue, Aug 23, 2011 at 11:34:58PM +0200, Dimitry Andric wrote:
> On 2011-08-21 11:08, Test Rat wrote:
> >I often get corrupted traces with clang world, the cause seems to be in
> >rtld.
> ...
> > (gdb) bt
> > #0 0x00000008009455ac in ?? ()
> > #1 0x0000000800944fa7 in ?? ()
>
> After some digging, this turned out to be caused by the empty function
> r_debug_state() in libexec/rtld-elf/rtld.c. This function is just a
> necessary hook for gdb, but since it is completely empty, calls to it in
> the same compilation unit simply don't generate any code, even if the
> function is marked as __noinline.
>
> The attached patch fixes this, by marking the function __noinline, and
> inserting an empty asm statement, that pretends to clobber memory. It
> generates no extra code, and forces clang to emit calls to r_debug_state
> throughout rtld.c. It looks rather hackish, though.
>
> An alternative solution would be to move the r_debug_state() function to
> another .c file, which should work OK, until we eventually start using
> link time optimization... :)
>
>
> >And compiling rtld with clang + -O0 makes it crash.
>
> This is caused by yet another interesting problem, which is in the
> _rtld() function in rtld.c. It is run at the very beginning of rtld,
> when relocations have not yet been processed. This initial code must be
> very careful to *not* use any relocated symbols, or problems will arise.
>
> The early initialization goes like:
>
>     ...
>
>     /* Initialize and relocate ourselves. */
>     assert(aux_info[AT_BASE] != NULL);
>     init_rtld((caddr_t) aux_info[AT_BASE]->a_un.a_ptr, aux_info);
>
>     __progname = obj_rtld.path;
>     argv0 = argv[0] != NULL ? argv[0] : "(null)";
>     environ = env;
>
> The init_rtld() function takes care of the initial relocations, after
> which 'global' symbols like __progname and environ can be used.
>
> However, at -O0, clang still reorders the retrieval of the __progname
> offset to just *before* the init_rtld() call, and assigns it afterwards:
>
>     ...
> .LBB0_16:                           # %cond.end
>     movq    __progname@GOTPCREL(%rip), %rax  <-- gets the offset here
>     leaq    -224(%rbp), %rsi
>     .loc    1 329 5     # /usr/src/libexec/rtld-elf/rtld.c:329:5
>     movq    -168(%rbp), %rcx
>     movq    8(%rcx), %rdi
>     movq    %rax, -1504(%rbp)   # 8-byte Spill  <-- saves offset on stack
>     callq   init_rtld
>     .loc    1 331 5     # /usr/src/libexec/rtld-elf/rtld.c:331:5
>     movq    obj_rtld+24(%rip), %rax
>     movq    -1504(%rbp), %rcx   # 8-byte Reload <-- loads offset from stack
>     movq    %rax, (%rcx)        <-- stores value in __progname
>
> It's not clear to me why clang does this reordering even when
> optimization is off, but it is normally legal, and quite usual.
> However, in case of this early initialization, it is fatal, as
> __progname@GOTPCREL(%rip) will still be junk, or zero...
>
> With optimization, such reorderings are even more likely, but for some
> reason, we have always been lucky that it turned out OK. A possible
> solution would be to move the code after the init_rtld() call to another
> function, and call that, but this could also be defeated again by
> inlining. :(

I think you can try to insert another compiler memory barrier after the
init_rtld.

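
Roughly what I mean by a barrier after the init call, as a standalone
sketch rather than a patch against rtld.c. All of the sketch_* names and
the compiler_barrier() macro are made up for illustration; sketch_init()
stands in for init_rtld(), and the two static pointers stand in for
obj_rtld.path and __progname. I have not checked whether the barrier
really stops clang from fetching the GOT entry early, so the generated
assembly would need to be inspected.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compiler-only barrier (GCC/clang inline asm). It emits no
 * instructions; the "memory" clobber tells the compiler that memory
 * may have changed here, so it does not move loads or stores across it.
 */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-ins for obj_rtld.path and __progname. */
static const char *sketch_obj_path;
static const char *sketch_progname;

/* Stand-in for init_rtld(): sets up state that later code relies on. */
static void
sketch_init(const char *path)
{
	assert(path != NULL);
	sketch_obj_path = path;
}

/* Same shape as the early code in _rtld(): init first, then assign. */
const char *
sketch_early_init(const char *path)
{
	sketch_init(path);
	compiler_barrier();	/* keep the accesses below after the init */
	sketch_progname = sketch_obj_path;
	return (sketch_progname);
}
```

Assumption to verify: the barrier orders ordinary memory accesses, but
the compiler may still treat the address of a global as a constant it
can materialize at any point, which is exactly the load shown in the
assembly above.
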
> Index: libexec/rtld-elf/rtld.c
> ===================================================================
> --- libexec/rtld-elf/rtld.c (revision 225105)
> +++ libexec/rtld-elf/rtld.c (working copy)
> @@ -143,7 +143,7 @@ static void ld_utrace_log(int, void *, void *, siz
> static void rtld_fill_dl_phdr_info(const Obj_Entry *obj,
> struct dl_phdr_info *phdr_info);
>
> -void r_debug_state(struct r_debug *, struct link_map *);
> +void r_debug_state(struct r_debug *, struct link_map *) __noinline;
>
> /*
> * Data declarations.
> @@ -2780,6 +2780,14 @@ linkmap_delete(Obj_Entry *obj)
>  void
>  r_debug_state(struct r_debug* rd, struct link_map *m)
>  {
> +    /*
> +     * The following is a hack to force the compiler to emit calls to
> +     * this function, even when optimizing. If the function is empty,
> +     * the compiler is not obliged to emit any code for calls to it,
> +     * even when marked __noinline. However, gdb depends on those
> +     * calls being made.
> +     */
> +    __asm __volatile("" : : : "memory");
>  }

This is a reasonable change, IMO.
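
For anybody who wants to try the effect outside of rtld, a minimal
sketch of the same pattern follows. The names sketch_debug_hook(),
sketch_notify() and sketch_events are hypothetical, and I spell the
attribute as __attribute__((noinline)) (GCC/clang extension) instead of
the __noinline macro from sys/cdefs.h so that it builds elsewhere too.
Comparing the disassembly of sketch_notify() with and without the asm
line should show whether the call survives.

```c
/* Hypothetical counter, only so the sketch has an observable result. */
static int sketch_events;

/*
 * Debugger hook: intentionally does nothing. The attribute keeps the
 * function out of line, and the empty asm with a "memory" clobber
 * gives the body a side effect the optimizer must assume, so calls to
 * the hook are not dropped.
 */
__attribute__((noinline)) void
sketch_debug_hook(int state)
{
	(void)state;
	__asm__ __volatile__("" : : : "memory");
}

/* Caller in the same compilation unit, like the callers in rtld.c. */
int
sketch_notify(int state)
{
	sketch_events++;
	sketch_debug_hook(state);	/* a debugger can break on the hook */
	return (sketch_events);
}
```
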
Also, we still compile rtld and csu in the hosted environment, which is a
lie to the compiler.
