can be a valuable information for some of you.
Would be good to fix it for real.

Stef

Begin forwarded message:

> From: Andreas Raab <[email protected]>
> Date: August 14, 2011 7:58:06 PM GMT+02:00
> To: [email protected]
> Subject: Re: [Vm-dev] Socket's readSemaphore is losing signals with Cog on Linux
> Reply-To: Squeak Virtual Machine Development Discussion <[email protected]>
>
> On 8/13/2011 13:42, Levente Uzonyi wrote:
>> Socket's readSemaphore is losing signals with CogVMs on linux. We found
>> several cases (RFB, PostgreSQL) when processes are stuck in the following
>> method:
>>
>> Socket >> waitForDataIfClosed: closedBlock
>>     "Wait indefinitely for data to arrive. This method will block until
>>     data is available or the socket is closed."
>>
>>     [
>>         (self primSocketReceiveDataAvailable: socketHandle)
>>             ifTrue: [^self].
>>         self isConnected
>>             ifFalse: [^closedBlock value].
>>         self readSemaphore wait ] repeat
>>
>> When we inspect the contexts, the process is waiting for the readSemaphore,
>> but evaluating (self primSocketReceiveDataAvailable: socketHandle) yields
>> true. Signaling the readSemaphore makes the process running again. As a
>> workaround we replaced #wait with #waitTimeoutMSecs: and all our problems
>> disappeared.
>>
>> The interpreter VM doesn't seem to have this bug, so I guess the bug was
>> introduced with the changes of aio.c.
>
> Oh, interesting. We know this problem fairly well and have always worked
> around by changing the wait in the above to a "waitTimeoutMSecs: 500" which
> turns it into a soft busy loop. It would be interesting to see if there's a
> bug in Cog which causes this. FWIW, here is the relevant portion:
>
>     "Soft 500ms busy loop - to protect against AIO probs;
>     occasionally, VM-level AIO fails to trip the semaphore"
>     self readSemaphore waitTimeoutMSecs: 500.
>
> Cheers,
>   - Andreas
>
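
For reference, here is a minimal sketch of the method quoted above with the workaround from Andreas's reply applied. It combines the method body from Levente's message with the 500 ms timeout from Andreas's snippet. The reworded method comment is an assumption, not text from either image. Note that this only masks a lost signal by re-polling the socket; it does not address whatever causes the signal to be lost in the VM.

```smalltalk
waitForDataIfClosed: closedBlock
	"Wait for data to arrive. Returns when data is available,
	or answers the value of closedBlock once the socket is closed.
	(Comment reworded for this sketch.)"

	[
		(self primSocketReceiveDataAvailable: socketHandle)
			ifTrue: [^self].
		self isConnected
			ifFalse: [^closedBlock value].
		"Soft 500ms busy loop - to protect against AIO probs;
		occasionally, VM-level AIO fails to trip the semaphore"
		self readSemaphore waitTimeoutMSecs: 500 ] repeat
```

With the timeout in place, a missed signal costs at most about half a second before the loop checks primSocketReceiveDataAvailable: again, instead of leaving the process blocked indefinitely.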
