Source: glibc
Version: 2.19-7
Severity: important
User: debian-al...@lists.debian.org
Usertags: alpha
Justification: Fails to build from source but built in the past.

The test tst-eintr3 fails intermittently during the glibc build on
alpha, and has now failed twice in a row in attempts to build 2.19-7.

It's an intermittent fault that appears to occur only on a
multiprocessor (SMP) system, which the buildd imago is.  Running the
test manually 40 or so times under a UP kernel never produced a failure.

To speed up testing I built the upstream glibc source from the 2.19
branch, configured with --enable-hardcoded-path-in-tests, and ran
tst-eintr3 with the --direct option.  It occasionally segfaults.
Getting a core dump and analysing it with gdb gives the following:

Core was generated by `/home/mjc/toolchain/glibc-build/nptl/tst-eintr3 --direct'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  start_thread (arg=0x2000121f1f0) at pthread_create.c:243
243       __resp = &pd->res;

(gdb) bt full
#0  start_thread (arg=0x2000121f1f0) at pthread_create.c:243
        pd = 0x2000121f1f0
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0 <repeats 17 times>},
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x2000003da00 <start_thread>,
              0x2000121f1f0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 252416}}}
        not_first_call = <optimized out>
        robust = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#1  0x0000020000177d24 in thread_start ()
    at ../ports/sysdeps/unix/sysv/linux/alpha/clone.S:111
No locals.

(gdb) disass /m

Dump of assembler code for function start_thread:
232     {
   0x000002000003da00 <+0>:     ldah    gp,3(t12)
   0x000002000003da04 <+4>:     lda     gp,-14800(gp)
   0x000002000003da08 <+8>:     lda     sp,-240(sp)
   0x000002000003da14 <+20>:    stq     fp,40(sp)
   0x000002000003da18 <+24>:    mov     sp,fp
   0x000002000003da24 <+36>:    stq     s0,8(sp)
   0x000002000003da28 <+40>:    stq     ra,0(sp)
   0x000002000003da30 <+48>:    stq     s1,16(sp)
   0x000002000003da38 <+56>:    stq     s2,24(sp)
   0x000002000003da3c <+60>:    stq     s3,32(sp)
   0x000002000003da40 <+64>:    stq     a0,224(fp)

233       struct pthread *pd = (struct pthread *) arg;
234     
235     #if HP_TIMING_AVAIL
236       /* Remember the time when the thread was started.  */
237       hp_timing_t now;
238       HP_TIMING_NOW (now);
239       THREAD_SETMEM (pd, cpuclock_offset, now);
240     #endif
241     
242       /* Initialize resolver state pointer.  */
243       __resp = &pd->res;
   0x000002000003da0c <+12>:    rduniq
   0x000002000003da10 <+16>:    ldq     t0,-32656(gp)
   0x000002000003da20 <+32>:    addq    v0,t0,t0
   0x000002000003da2c <+44>:    lda     t1,1208(a0)
   0x000002000003da34 <+52>:    mov     v0,s0
=> 0x000002000003da44 <+68>:    stq     t1,0(t0)


The __resp variable appears to be a thread-local variable, accessed
(here, written) using the initial-exec TLS model: its address is the
thread pointer plus an offset loaded relative to gp (the ldq into t0,
followed by the addq with v0).  The rduniq PALcall should return the
thread pointer (from the PCB) in register v0.  Now let's check the
address being written to at the point of the segfault.

(gdb) print /x $t0
$1 = 0x18

That's definitely not a valid memory location since the first page of
memory starting at location 0 should be inaccessible.  Checking the
thread pointer:

(gdb) print /x $v0
$2 = 0x0

Ouch!  That looks like the thread pointer in the PCB has not been
initialised.
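
To make the arithmetic concrete, here is a small C sketch of what the
disassembled instructions compute, replayed with the register values
from the core dump.  It is only a model, not glibc code: the function
names are made up for illustration, the 0x18 offset is inferred from
$t0 - $v0 in the dump, and the 8 KiB page size is my assumption for
alpha.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed alpha page size (8 KiB); only used to model the
   inaccessible first page of memory. */
#define ASSUMED_PAGE_SIZE UINT64_C(8192)

/* Model of "addq v0,t0,t0": address of an initial-exec TLS variable
   is the thread pointer plus the offset loaded into t0. */
static uint64_t tls_var_address(uint64_t thread_pointer, uint64_t tls_offset)
{
    return thread_pointer + tls_offset;
}

/* Model of "lda t1,1208(a0)": the value &pd->res that is stored. */
static uint64_t resolver_state_address(uint64_t pd)
{
    return pd + 1208;
}

/* True if addr lies in the first page, which should be unmapped. */
static bool in_first_page(uint64_t addr)
{
    return addr < ASSUMED_PAGE_SIZE;
}

/* Replays the values seen in the core dump and returns the address
   that "stq t1,0(t0)" tried to write to. */
static uint64_t replay_core_dump(void)
{
    uint64_t v0 = 0x0;                        /* print /x $v0 */
    uint64_t t0 = tls_var_address(v0, 0x18);  /* offset inferred from dump */
    assert(t0 == 0x18);                       /* print /x $t0 */
    return t0;
}
```

With a zero thread pointer the store target falls inside the first
page, which matches the SIGSEGV; with any sane thread pointer the same
offset would land in the thread's TLS area instead.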

Running tst-eintr3 under gdb with a breakpoint on line 243 shows
that the rduniq PALcall normally returns a valid memory address
(presumably the correct thread pointer), but occasionally, on an SMP
system, it returns 0.

This is as far as I have got with debugging.  Presumably a wruniq
PALcall somewhere sets up the thread pointer in the PCB; that would
be the next place to investigate.

Cheers
Michael.


Archive: https://lists.debian.org/20140720102602.GB12779@omega