What did you use for ThreadsPerChild? The default? It's 25. The reason I ask is that I did some scalability measurements maybe a year ago and saw a lot of CPU usage in LinuxThreads with a high value (maybe 1000) for ThreadsPerChild. The time went to a pthread mutex unlock function that does a serial search for the highest-priority thread to wake up. That search is pointless in our case because all the worker threads have the same priority. I'm hoping that this problem is fixed in the new pthread library, but I haven't verified it.
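To make the cost being described concrete, here is a toy model in C of the two wake-up policies. It is only a sketch: `struct waiter`, `pick_by_priority` and `pick_fifo` are hypothetical names invented for illustration, not the LinuxThreads or NPTL code. It shows that when every waiter has the same priority, a serial scan visits all n waiters and still picks the same thread a FIFO pick would.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter record for illustration only; this is not the
 * data structure used by LinuxThreads or NPTL. */
struct waiter {
    int thread_id;
    int priority;
};

/* Serial search: return the index of the first waiter with the highest
 * priority. Every unlock pays O(n) in the number of waiters. */
size_t pick_by_priority(const struct waiter *w, size_t n)
{
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++) {
        /* Strict '>' keeps the earliest waiter among equal priorities. */
        if (w[i].priority > w[best].priority)
            best = i;
    }
    return best;
}

/* FIFO pick: wake whoever has waited longest, ignoring priority. O(1). */
size_t pick_fifo(const struct waiter *w, size_t n)
{
    (void)w;
    assert(n > 0);
    return 0;
}
```

With 1000 equal-priority workers queued on one mutex, the first function does roughly 1000 comparisons per unlock to reach the answer the second returns immediately.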
NPTL, for better or for worse, does not honor priorities. Quoting from <http://people.redhat.com/drepper/nptl-design.pdf>:
Realtime support is mostly missing from the library implementation. The system calls to select scheduling parameters are available but they have no effects. The reason for this is that large parts of the kernel do not follow the rules for realtime scheduling. Waking one of the threads waiting for a futex is not done by looking at the priorities of the waiters.
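If you want to check which of the two libraries a given box is actually running, glibc reports it through `confstr()` with `_CS_GNU_LIBPTHREAD_VERSION`. A minimal sketch, assuming glibc; `pthread_impl_name` is a helper name I made up, and the `#ifdef` fallback for other C libraries is my own assumption:

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: copy the threading library's name and version
 * into buf. Returns 1 if the C library reported one, 0 otherwise.
 * _CS_GNU_LIBPTHREAD_VERSION is a glibc extension, hence the #ifdef. */
int pthread_impl_name(char *buf, size_t len)
{
    if (len == 0)
        return 0;
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns 0 when the name is not recognized; otherwise it
     * fills buf (truncating if needed) and NUL-terminates it. */
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (n == 0) {
        buf[0] = '\0';
        return 0;
    }
    return 1;
#else
    return 0;
#endif
}
```

Call it with a small buffer, say `char buf[64]`, and print `buf` when it returns 1. On glibc the string names the implementation, so it should start with either "NPTL" or "linuxthreads". The same information is available from the shell with `getconf GNU_LIBPTHREAD_VERSION`.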
Scott Lamb
