On Mon, 25 Oct 2010, Vladimir Kirillov wrote:
> I've been trying to test rthreads and have hit some weird races
> using simple tests:
...
> I get this segfault almost always:
> 
> #0  pthread_exit (retval=0x0) at /usr/src/lib/librthread/rthread.c:223
> 223             for (clfn = thread->cleanup_fns; clfn; ) {
...
> Looks like some stupid race with two threads exiting at almost same
> time? Any ideas on tracking it down?

The problem was actually introduced during the c2k10 hackathon, where I
changed getthrid() to always add THREAD_PID_OFFSET to the proc's real pid
(which closes a race for pthread_kill)...but failed to teach fork1() to do
that too, so the id returned to the creating thread no longer matched what
the new thread gets from getthrid().  The patch at the bottom fixes this in
my testing.


(I wasn't seeing this myself because I'm normally running a severely 
hacked librthread that uses the platform's per-thread register to 
implement pthread_self() instead of having to walk the thread list.  
Sorry folks.  Time to get this stuff committed...)
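
To make the mismatch concrete, here is a minimal userland sketch, not the
actual librthread or kernel code.  It assumes pthread_self() finds the
current thread by walking a list and comparing each recorded id against
getthrid(), and that the recorded id is whatever thread creation returned
to the creator.  The names model_getthrid(), model_fork_retval(),
find_self() and struct thread_entry are made up for illustration, and the
values given to THREAD_PID_OFFSET and FORK_THREAD are placeholders, not
taken from the real headers.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for illustration; the real ones live in kernel headers. */
#define THREAD_PID_OFFSET 1000000
#define FORK_THREAD       0x1

/* Hypothetical stand-in for an entry in the userland thread list. */
struct thread_entry {
	int tid;                   /* id the creator recorded at creation time */
	struct thread_entry *next;
};

/* Models getthrid() after the c2k10 change: always offset the real pid. */
int
model_getthrid(int real_pid)
{
	return real_pid + THREAD_PID_OFFSET;
}

/* Models the retval[0] assignment in fork1(), before and after the patch. */
int
model_fork_retval(int real_pid, int flags, int patched)
{
	if (patched)
		return real_pid + ((flags & FORK_THREAD) ? THREAD_PID_OFFSET : 0);
	return real_pid;
}

/* Assumed lookup: walk the list until an entry matches the caller's id. */
struct thread_entry *
find_self(struct thread_entry *list, int self_id)
{
	struct thread_entry *t;

	for (t = list; t != NULL; t = t->next)
		if (t->tid == self_id)
			return t;
	return NULL;               /* no match: caller gets a null thread pointer */
}

/* Walks through both cases for one thread whose real pid is 1234. */
void
demo(void)
{
	struct thread_entry t = { 0, NULL };

	/* Unpatched: creator records 1234, thread looks itself up as 1001234. */
	t.tid = model_fork_retval(1234, FORK_THREAD, 0);
	assert(find_self(&t, model_getthrid(1234)) == NULL);

	/* Patched: both sides agree on 1001234, so the lookup succeeds. */
	t.tid = model_fork_retval(1234, FORK_THREAD, 1);
	assert(find_self(&t, model_getthrid(1234)) == &t);
}
```

Under those assumptions, the unpatched case hands back a null thread
pointer, which would be consistent with a crash on the first dereference of
thread in pthread_exit().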


> By the way, is such gdb behaviour normal? Does it need any additional
> patching before being useful to debug software using rthreads?

Oh yes, it needs lots of work.  I don't know if anyone has really looked 
closely at this yet.

Philip Guenther


Index: sys/kern/kern_fork.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_fork.c,v
retrieving revision 1.122
diff -u -p -r1.122 kern_fork.c
--- sys/kern/kern_fork.c        26 Jul 2010 01:56:27 -0000      1.122
+++ sys/kern/kern_fork.c        30 Oct 2010 22:52:39 -0000
@@ -480,7 +480,8 @@ fork1(struct proc *p1, int exitsig, int 
         * marking us as parent via retval[1].
         */
        if (retval != NULL) {
-               retval[0] = p2->p_pid;
+               retval[0] = p2->p_pid +
+                   (flags & FORK_THREAD ? THREAD_PID_OFFSET : 0);
                retval[1] = 0;
        }
        return (0);
