Here I am again with a better fix, one that doesn't change the semantics of an existing function (what was I thinking!?).
Still using Dossy's patch, change the way the timeout is set from
timeout.sec = 10; timeout.usec = 0;
to
Ns_GetTime(&timeout); timeout.sec += 10;
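For anyone following along, here is a minimal sketch of the same idea in plain POSIX calls, as a stand-in for Ns_GetTime() and Ns_CondTimedWait(). It assumes a default condition variable, which measures the deadline against the wall clock. The helper names deadline_after() and wait_for_flag() are hypothetical and are not part of AOLserver.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: build an ABSOLUTE deadline `seconds` from now. */
static struct timespec deadline_after(time_t seconds)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);  /* current wall-clock time */
    ts.tv_sec += seconds;                /* now + relative timeout  */
    return ts;
}

/*
 * Hypothetical helper: wait until *flag becomes nonzero or the deadline
 * passes. Returns 0 if the flag was set, otherwise the error code from
 * pthread_cond_timedwait() (ETIMEDOUT when the deadline passed).
 */
static int wait_for_flag(pthread_cond_t *cond, pthread_mutex_t *mutex,
                         const int *flag, time_t seconds)
{
    struct timespec deadline = deadline_after(seconds);
    int err = 0;

    pthread_mutex_lock(mutex);
    while (!*flag && err == 0) {
        /* Loop to cope with spurious wakeups; the deadline stays fixed. */
        err = pthread_cond_timedwait(cond, mutex, &deadline);
    }
    pthread_mutex_unlock(mutex);
    return *flag ? 0 : err;
}
```

Because the deadline is computed once and is absolute, re-entering the wait after a spurious wakeup does not extend the total time waited.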
rob
Rob Crittenden wrote:
It's failing because the timeout isn't set up properly. The timeout passed to pthread_cond_timedwait() is an absolute time, not a relative one. So I guess the right thing to do is modify Ns_CondTimedWait() to fetch the current time and add the timeout to that.
I was able to duplicate the problem myself and this patch resolves it. I guess the next step is to add a configuration option for the time to wait.
Either that or use a simple CondWait() instead.
This along with Dossy's patch fixes it for me:
Index: pthread.c
===================================================================
RCS file: /cvsroot/aolserver/aolserver/nsthread/pthread.c,v
retrieving revision 1.4
diff -u -r1.4 pthread.c
--- pthread.c   7 Mar 2003 18:08:52 -0000       1.4
+++ pthread.c   23 Feb 2004 16:11:48 -0000
@@ -641,19 +641,22 @@
 {
     int              err, status = NS_ERROR;
     struct timespec  ts;
+    struct timeval   now;
 
     if (timePtr == NULL) {
         Ns_CondWait(cond, mutex);
         return NS_OK;
     }
 
+    gettimeofday(&now, NULL);
+
     /*
      * Convert the microsecond-based Ns_Time to a nanosecond-based
      * struct timespec.
      */
 
-    ts.tv_sec = timePtr->sec;
-    ts.tv_nsec = timePtr->usec * 1000;
+    ts.tv_sec = now.tv_sec + timePtr->sec;
+    ts.tv_nsec = now.tv_sec + timePtr->usec * 1000;
 
     /*
      * As documented on Linux, pthread_cond_timedwait may return
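A note on the arithmetic in the patch above: as posted, it adds now.tv_sec into ts.tv_nsec, where now.tv_usec * 1000 was presumably intended. The nanosecond field also has to stay below one billion, or pthread_cond_timedwait() may reject the value with EINVAL. The following is a minimal sketch of the conversion with the carry handled, assuming non-negative inputs. The helper name abs_deadline() is hypothetical and not from the AOLserver source.

```c
#include <sys/time.h>
#include <time.h>

/*
 * Hypothetical helper: turn "now" plus a relative (sec, usec) offset into
 * an absolute struct timespec, carrying whole seconds out of the
 * microsecond sum so that tv_nsec stays in [0, 999999999].
 * Assumes both offsets are non-negative.
 */
static struct timespec abs_deadline(const struct timeval *now,
                                    long rel_sec, long rel_usec)
{
    struct timespec ts;
    long usec = (long) now->tv_usec + rel_usec;

    ts.tv_sec  = now->tv_sec + rel_sec + usec / 1000000;  /* carry seconds  */
    ts.tv_nsec = (usec % 1000000) * 1000;                 /* usec -> nsec   */
    return ts;
}
```

For example, with now = 100 s + 900000 us and an offset of 10 s + 250000 us, the microseconds sum to 1150000, so one second is carried and the result is 111 s + 150000000 ns.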
Dossy wrote:
On 2004.02.23, Taguchi Takeshi <[EMAIL PROTECTED]> wrote:
New error occurs.
[23/Feb/2004:15:13:51][84562.134557696][-main-] Notice: driver: starting: nssock
nsthreads: pthread_cond_timedwait failed in Ns_CondTimedWait: Operation not permitted
Abort
I thought the only error codes for pthread_cond_timedwait(3) were EINVAL and ETIMEDOUT, but this seems to be EPERM. Does this function require root privileges?
No, I'm guessing it doesn't like it if you haven't grabbed the mutex first -- that's probably why it's giving EPERM.
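That guess matches POSIX, which lists EPERM for the case where the calling thread does not own the mutex. A minimal sketch that reproduces it follows, assuming an error-checking mutex (PTHREAD_MUTEX_ERRORCHECK) so that the ownership check is actually performed. With a default mutex the behaviour is undefined, and platforms differ in what they report. The function name timedwait_without_lock() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical demo: call pthread_cond_timedwait() WITHOUT locking the
 * mutex first and return whatever error code comes back.
 */
static int timedwait_without_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t     mutex;
    pthread_cond_t      cond = PTHREAD_COND_INITIALIZER;
    struct timespec     deadline;
    int                 err;

    /* An error-checking mutex verifies that the caller owns it. */
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 1;

    /* Deliberately wrong: the mutex is not held at this point. */
    err = pthread_cond_timedwait(&cond, &mutex, &deadline);

    pthread_mutex_destroy(&mutex);
    return err;
}
```

On implementations that perform the ownership check, the call should return EPERM immediately rather than waiting for the deadline. No root privileges are involved.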
Maybe I'm just too tired to think clearly, but this code doesn't make sense. Aren't mutexes in AOLserver, well, "mutually exclusive", so that only one thread can hold the mutex at a time? If so, consider the thread that waits for the other thread to set a boolean: it grabs the mutex *first* and then does a timed condwait. The other thread, the one that's supposed to set the boolean, also tries to grab the mutex before sending a condbroadcast. Well, it'll never get to set the boolean and condbroadcast until the first thread unlocks the mutex, which it won't do until the condwait times out ...
Obviously, I must be understanding mutex behavior wrong, because if I'm right, things like NsWaitDriversShutdown() will also hang until timeout in some cases -- which may explain why every few times I ns_shutdown my 4.1 nsd's, it hangs. Hmm.
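On the mutex question above: with POSIX condition variables, pthread_cond_wait() and pthread_cond_timedwait() atomically release the mutex while the caller is blocked and re-acquire it before returning, so the second thread is not locked out for the duration of the wait. Here is a minimal sketch in plain pthreads, not the AOLserver wrappers. The names starter(), wait_for_start() and started are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int             started = 0;

/* Second thread: takes the same mutex, sets the flag, broadcasts. */
static void *starter(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&lock);    /* blocks until the waiter is inside the wait */
    started = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if the flag was seen before the 10 second deadline, else 0. */
static int wait_for_start(void)
{
    struct timespec deadline;
    pthread_t       tid;
    int             err = 0, ok;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 10;        /* absolute deadline: now + 10 seconds */

    pthread_mutex_lock(&lock);
    /* The starter is created while this thread still holds the mutex. */
    pthread_create(&tid, NULL, starter, NULL);
    while (!started && err == 0) {
        /* The mutex is released here while blocked, re-acquired on return. */
        err = pthread_cond_timedwait(&cond, &lock, &deadline);
    }
    ok = started;
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return ok;
}
```

If the wait did not release the mutex, starter() could never set the flag and wait_for_start() would return 0 only after the full timeout. Whether this explains the ns_shutdown hangs mentioned above is a separate question that this sketch does not answer.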
Need to put more thought into this today. Maybe someone else will speak up and solve the problem in the meantime ...
-- Dossy
--
Dossy Shiobara                       mail: [EMAIL PROTECTED]
Panoptic Computer Network             web: http://www.panoptic.com/
  "He realized the fastest way to change is to laugh at your own
    folly -- then you can let go and quickly move on." (p. 70)
-- AOLserver - http://www.aolserver.com/
To Remove yourself from this list, simply send an email to <[EMAIL PROTECTED]> with the body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.
