can be valuable information for some of you.
Would be good to fix it for real.

Stef

Begin forwarded message:

> From: Andreas Raab <[email protected]>
> Date: August 14, 2011 7:58:06 PM GMT+02:00
> To: [email protected]
> Subject: Re: [Vm-dev] Socket's readSemaphore is losing signals with Cog on 
> Linux
> Reply-To: Squeak Virtual Machine Development Discussion 
> <[email protected]>
> 
> On 8/13/2011 13:42, Levente Uzonyi wrote:
>> Socket's readSemaphore is losing signals with CogVMs on linux. We found 
>> several cases (RFB, PostgreSQL) when processes are stuck in the following 
>> method:
>> 
>> Socket >> waitForDataIfClosed: closedBlock
>>   "Wait indefinitely for data to arrive.  This method will block until
>>   data is available or the socket is closed."
>> 
>>   [
>>       (self primSocketReceiveDataAvailable: socketHandle)
>>           ifTrue: [^self].
>>       self isConnected
>>           ifFalse: [^closedBlock value].
>>       self readSemaphore wait ] repeat
>> 
>> When we inspect the contexts, the process is waiting for the readSemaphore, 
>> but evaluating (self primSocketReceiveDataAvailable: socketHandle) yields 
>> true. Signaling the readSemaphore makes the process run again. As a 
>> workaround we replaced #wait with #waitTimeoutMSecs: and all our problems 
>> disappeared.
>> 
>> The interpreter VM doesn't seem to have this bug, so I guess the bug was 
>> introduced with the changes of aio.c.
> 
> Oh, interesting. We know this problem fairly well and have always worked 
> around it by changing the wait in the above to a "waitTimeoutMSecs: 500" which 
> turns it into a soft busy loop. It would be interesting to see if there's a 
> bug in Cog which causes this. FWIW, here is the relevant portion:
> 
>           "Soft 500ms busy loop - to protect against AIO probs;
>           occasionally, VM-level AIO fails to trip the semaphore"
>           self readSemaphore waitTimeoutMSecs: 500.
> 
> Cheers,
> - Andreas
> 
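
For reference, here is a sketch of how the whole method might look with the workaround applied. It only combines Levente's listing with Andreas's snippet and is not taken from any actual image or release. The 500 ms value is the one Andreas mentions; any other timeout is an assumption you would need to tune.

```smalltalk
Socket >> waitForDataIfClosed: closedBlock
    "Wait for data to arrive. Re-check the socket at least every 500 ms
    so that a lost readSemaphore signal cannot block the process forever."

    [
        (self primSocketReceiveDataAvailable: socketHandle)
            ifTrue: [^self].
        self isConnected
            ifFalse: [^closedBlock value].
        "Soft busy loop: the wait ends on a signal or after the timeout,
        and the loop polls the socket again in either case"
        self readSemaphore waitTimeoutMSecs: 500 ] repeat
```

This does not fix the lost signal itself. It only bounds how long a process can stay stuck, at the cost of waking up idle readers twice a second.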

