Hi Norbert,

I am a bit late to the discussion, but here are my thoughts.

First, I think you did a great job chasing down this problem. Both your 
findings and your solution are quite useful.

I am wondering about a few related issues, though.

Does the problem only occur when ConnectionTimedOut is signalled during 
#connectTo:port:, or in any situation where either ConnectionTimedOut or 
ConnectionClosed is signalled (so also during regular reading and/or writing)?

Resource cleanup in exceptional situations is notoriously difficult.

One view (even in your original case) is that the user of the Socket[Stream] 
still has the responsibility to manage the resource.
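
To make that view concrete, here is a minimal sketch of what caller-side cleanup could look like. It is a hypothetical variant of the SocketStream class method quoted below, not a proposed patch, and it assumes that #destroy is safe to call on a socket whose connection attempt failed.

```smalltalk
openConnectionToHost: hostIP port: portNumber timeout: timeout
	"Sketch only: hypothetical variant that releases the socket itself
	 when the connection attempt is abandoned."
	| socket |
	socket := Socket new.
	"ifCurtailed: evaluates its argument only when the receiver block is
	 left abnormally, e.g. when ConnectionTimedOut unwinds out of here."
	[ socket connectTo: hostIP port: portNumber waitForConnectionFor: timeout ]
		ifCurtailed: [ socket destroy ].
	^ self on: socket
```

With this shape the cleanup also covers any other exception that escapes the connect call, at the cost of every caller that creates a Socket having to remember the same guard.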

There is also a strange dichotomy in managing Sockets as an external resource. 
First, there is classic, manual resource management. Second, there is 
finalization (#finalize). In Pharo you can be a bit lazy and rely on 
finalization. But of course it is more correct and efficient to do explicit 
management.
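
As an illustration of explicit management, a workspace snippet along these lines closes the stream whether or not the code in between fails. The host name, port and request text are placeholders chosen for the example.

```smalltalk
| stream |
"Placeholder host and port; any reachable server would do."
stream := SocketStream openConnectionToHostNamed: 'example.org' port: 80.
[ stream
	nextPutAll: 'HEAD / HTTP/1.0' , String crlf , String crlf;
	flush.
  Transcript show: stream nextLine; cr ]
	ensure: [ stream close ]	"runs on normal return and on unwinding"
```

Note that the ensure: block only protects the stream once it has been opened; a failure inside the open call itself is the case discussed in this thread.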

I have also often wondered how and when finalization kicks in. I have been 
monitoring the number of non-nil slots in the semaphore table and have been 
triggering full GCs on long-running images. My hunch is that it does work, but 
in many cases the GC comes too late, i.e. memory fills up far more slowly than 
the table is exhausted.

My 2c,

Sven

On 06 Oct 2013, at 12:51, Norbert Hartl <[email protected]> wrote:

> I took some time to analyze my current problem with external semaphores. I 
> was just reluctant to raise the limit in my image because I want the problem 
> solved. I logged the management of external semaphores and discovered that 
> the table fills if a connection times out. 
> 
> The problem turns out to be in 
> 
> Socket>>#connectTo: hostAddress port: port waitForConnectionFor: timeout 
>       "Initiate a connection to the given port at the given host 
>       address. Waits until the connection is established or times out."
>       self connectNonBlockingTo: hostAddress port: port.
>       self
>               waitForConnectionFor: timeout
>               ifTimedOut: [ConnectionTimedOut signal: 'Cannot connect to '
>                                       , (NetNameResolver stringFromAddress: hostAddress) , ':' , port asString]
> 
> When a socket is created, three external semaphores are registered in the 
> ExternalSemaphoreTable. If a connection times out, the exception is signalled 
> but the Socket still has its resources attached. 
> 
> So e.g. in 
> 
> SocketStream class>>#openConnectionToHost: hostIP port: portNumber timeout: timeout
>       | socket |
>       socket := Socket new.
>       socket connectTo: hostIP port: portNumber waitForConnectionFor: timeout.
>       ^self on: socket
> 
> a socket (with semaphores registered) is held only in a local variable, so 
> when the exception is signalled the reference to the socket is lost and the 
> semaphores stay registered. The only way they get unregistered is at 
> finalization time, but I think it should work better. So I would add a 
> destroy before the exception is raised.
> 
> Socket>>#connectTo: hostAddress port: port waitForConnectionFor: timeout 
>       "Initiate a connection to the given port at the given host 
>       address. Waits until the connection is established or times out."
>       self connectNonBlockingTo: hostAddress port: port.
>       self
>               waitForConnectionFor: timeout
>               ifTimedOut: [
>                       self destroy.
>                       ConnectionTimedOut signal: 'Cannot connect to '
>                                       , (NetNameResolver stringFromAddress: hostAddress) , ':' , port asString]
> 
> I opened a ticket at https://pharo.fogbugz.com/f/cases/11797 but I'm not sure 
> how I am supposed to provide fixes made against a Pharo 2.0 image. Probably I 
> should fix this against 3.0, but then I'm still a 2.0 user :) 
> 
> Norbert