> Greg Ames wrote:
> >
> > "Paul J. Reder" wrote:
> >
> > > By the way, when I start Apache then run ps -efH (with no server load) I get
> > > something like
> > > webadmin 21803 1 0 10:07 pts/3 00:00:00 httpd -d /home/webadmin/Apache (1 top level Apache)
> > > webadmin 21805 21803 0 10:07 pts/3 00:00:00 httpd -d /home/webadmin/Apac (Start_Server number of these)
> > > webadmin 21808 21805 0 10:07 pts/3 00:00:00 httpd -d /home/webadmin/Ap (1 per Start_server)
> > > webadmin 21809 21808 0 10:07 pts/3 00:00:00 httpd -d /home/webadmin/ (threads_per_Child number of these)
> > > webadmin 21812 21808 0 10:07 pts/3 00:00:00 httpd -d
n/ -
> > > webadmin 21815 21808 0 10:07 pts/3 00:00:00 httpd -d
/ -
> > > webadmin 21818 21808 0 10:07 pts/3 00:00:00 httpd -d
-
> > >
> > > I understand 21803 and I understand 21809 and its ilk. I also understand either
> > > 21805 or 21808 but not both. What am I missing in the way that processes and
> > > threads are handled in APR/threaded mpm?
> > >
> >
> > I hacked up apr/test/testthread.c so that the threads sleep for several
> > seconds before exiting. This program creates 4 threads via
> > apr_thread_create. When all 4 are sleeping, I see:
> >
> > [gregames@gandalf httpd-2.0]$ ps ax -HO ppid,wchan | grep testthread
> > 3306 1152 rt_sig S pts/0 00:00:00 ./testthread
> > 3307 3306 do_pol S pts/0 00:00:00 ./testthread
> > 3308 3307 nanosl S pts/0 00:00:00 ./testthread
> > 3309 3307 nanosl S pts/0 00:00:00 ./testthread
> > 3310 3307 nanosl S pts/0 00:00:00 ./testthread
> > 3311 3307 nanosl S pts/0 00:00:00 ./testthread
> >
> > ...so it looks like Linux is creating an extra thread/process for us
> > (3307), probably when we do the first pthread_create.
> >
> > Greg
>
> And according to my tests it is 3307 that is exiting / becoming defunct and leaving
> the worker threads orphaned. No core file, no log, no nothing...
>
I've not looked at the Linux docs, but I would expect that 3306 is the pid, 3307 is the
main thread, and 3308 - 3311 are the 4 threads created with apr_thread_create. Paul, did
you make the modification to child_main to
1. Eliminate the call to apr_create_signal_thread()
2. Have the main thread not call worker_main() and only exit when all the worker
   threads have exited?
If so, then perhaps the main thread is still exiting before the worker threads due to
a race condition between when worker_thread_count is decremented and when the thread
actually exits. In other words, a worker may decrement worker_thread_count and then be
suspended, the next worker decrements worker_thread_count and is suspended in turn, etc.
Then the main thread wakes up, sees that all the workers "have exited" and exits. The
problem is that the worker threads have NOT really exited yet; they've only decremented
worker_thread_count. Windows has a nice signalling mechanism that the main thread can
use to guarantee that the workers have exited. I suspect Unix has something similar.
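
On the Unix side, the closest thing I know of is pthread_join, which does not
return until the target thread has actually terminated. Here is a minimal sketch
of the idea, assuming plain POSIX threads rather than the APR wrappers. The names
worker, run_workers_and_wait and NUM_WORKERS are made up for illustration, and
worker_thread_count only stands in for the counter in child_main; this is not the
real mpm code.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NUM_WORKERS 4   /* hypothetical worker count for this sketch */

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int worker_thread_count = 0;   /* stand-in for the counter in child_main */
static int workers_finished = 0;      /* bumped as the very last step of a worker */

/* Hypothetical worker: it decrements the counter and then keeps running.
 * The gap between the decrement and the return is the window in which a
 * main thread that only polls the counter could exit too early. */
static void *worker(void *arg)
{
    (void)arg;

    pthread_mutex_lock(&count_lock);
    worker_thread_count--;
    pthread_mutex_unlock(&count_lock);

    sched_yield();   /* the worker may be suspended here, after the decrement */

    pthread_mutex_lock(&count_lock);
    workers_finished++;
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Start the workers, then block until every one of them has terminated.
 * Returns the number of workers that ran to completion. */
static int run_workers_and_wait(void)
{
    pthread_t tids[NUM_WORKERS];
    int i, rc;

    worker_thread_count = NUM_WORKERS;
    workers_finished = 0;

    for (i = 0; i < NUM_WORKERS; i++) {
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }

    /* pthread_join returns only after the thread has really exited, so the
     * main thread cannot get ahead of a worker that has merely decremented
     * the counter. */
    for (i = 0; i < NUM_WORKERS; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    return workers_finished;
}
```

The main thread would call run_workers_and_wait() instead of watching the counter.
If the real code has to keep the counter, the joins could still be done after the
counter reaches zero, just before the main thread exits.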
Bill