--On Sunday, December 29, 2002 3:03 PM -0800 [EMAIL PROTECTED] wrote:
> This highlights another problem with our test suite. We can't have platform checks in our tests. Having these checks completely invalidates the usefulness of APR. These kinds of checks are the biggest reason that I haven't migrated testatomic to the new suite. I haven't had the time to figure out exactly what is going on yet.
Well, in this particular case, we have to go to some extremes to force certain schedulers (Solaris with LWPs) to actually run threads in parallel. Without that call, Solaris will execute all of the threads serially. When that happens, it invalidates the testatomic test, since we're explicitly trying to cause race conditions.
This is a side-effect of the Solaris scheduler rebalancing only on cancellation points (i.e. where the thread might block). The testatomic code never hits a cancellation point, whereas normal applications tend to hit them regularly. While I don't know what AIX's M:N thread library does, it wouldn't shock me if it has the same quirk. Solaris's N:N thread library (i.e. what Solaris 9 uses by default) shouldn't have this quirk.
I believe pthread_setconcurrency shouldn't be added to APR (in the past, Aaron and I have discussed this at length on this list), but its use should be allowed in the test suite: we're not a real program, so we have to give the scheduler hints. A real program would be extremely hard-pressed to create code that never hits a cancellation point.
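
To make the hint concrete, here is a rough sketch of the pattern I mean. It is not the actual testatomic code: the helper names are made up, it uses raw pthreads, and C11 atomics stand in for the APR atomics. The only point is where the pthread_setconcurrency call sits relative to thread creation. On a 1:1 thread library the call is harmless and has no effect.

```c
#define _XOPEN_SOURCE 600   /* exposes pthread_setconcurrency() on glibc */
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 64

static atomic_int counter;   /* stand-in for the value the atomics test hammers on */
static int iterations;       /* written before any worker starts, read-only after */

/* Worker: tight loop with no blocking calls, i.e. no cancellation points. */
static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < iterations; i++)
        atomic_fetch_add(&counter, 1);
    return NULL;
}

/* Hypothetical helper: returns the final counter value, or -1 on error. */
int run_contended_increments(int nthreads, int iters)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    atomic_store(&counter, 0);
    iterations = iters;

    /* Hint to an M:N scheduler that we want this many threads running at
     * once. It is only a hint, so the return value is deliberately ignored. */
    (void)pthread_setconcurrency(nthreads);

    for (; started < nthreads; started++)
        if (pthread_create(&tid[started], NULL, bump, NULL) != 0)
            break;

    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return started == nthreads ? atomic_load(&counter) : -1;
}
```

If the atomic increment is correct, run_contended_increments(4, 10000) comes back as 40000 whether or not the threads actually overlapped. That is exactly why the hint matters: without real overlap the test passes without having tested anything.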
I believe we might be able to rewrite testatomic to use condition variables (I don't believe we had them when we initially wrote testatomic), but I'm still not sure that would get around the concurrency problem: we'd do a wakeup_all and hope the scheduler allocates enough kernel threads for all the available userspace threads. Doing a sleep() or sched_yield() doesn't help either, as we have to get all of the user threads active at the same time. The only solution that consistently works is to use pthread_setconcurrency.
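
For reference, the condition-variable idea would look roughly like the sketch below. Again, this uses raw pthreads and C11 atomics rather than the APR equivalents, and every name in it is made up for illustration. It lines all the workers up behind a gate and releases them with a single broadcast, but nothing in it forces an M:N scheduler to back each woken thread with its own kernel thread, which is the part I'm unsure about.

```c
#include <pthread.h>
#include <stdatomic.h>

#define GATE_THREADS 4
#define GATE_ITERS   10000

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open;            /* protected by gate_lock */
static atomic_int gate_counter;

static void *gate_worker(void *arg)
{
    (void)arg;

    /* Park at the gate until the main thread opens it. The loop guards
     * against spurious wakeups. */
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);

    /* From here on there are no blocking calls. */
    for (int i = 0; i < GATE_ITERS; i++)
        atomic_fetch_add(&gate_counter, 1);
    return NULL;
}

/* Hypothetical helper: returns the final counter value, or -1 on error. */
int run_gated_increments(void)
{
    pthread_t tid[GATE_THREADS];
    int started = 0;

    gate_open = 0;               /* no workers exist yet, so no lock needed */
    atomic_store(&gate_counter, 0);

    for (; started < GATE_THREADS; started++)
        if (pthread_create(&tid[started], NULL, gate_worker, NULL) != 0)
            break;

    /* Open the gate and wake everyone at once (the "wakeup_all"). */
    pthread_mutex_lock(&gate_lock);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);

    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return started == GATE_THREADS ? atomic_load(&gate_counter) : -1;
}
```

The broadcast makes every worker runnable at the same moment; whether they then run at the same moment is still up to the scheduler.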
Try as you might, APR cannot escape the quirks of the native OS. This is one particular case. -- justin