On 12 October 2013 11:08, Norbert Hartl <[email protected]> wrote:

> So, finally it turned out that the culprit is in my own code. I was
> logging exception objects that have a signaler context pointing to the
> socket. This way, on every connection timeout, I added the exception to
> a collection, preventing the unregistering of external resources.

Congratulations on finding the bug! :)
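[For illustration, here is a minimal sketch of the leak pattern Norbert
describes; socketStream, request, and errorLog are hypothetical names. An
exception object keeps its signaler context alive, and through that
context every object on the signaling stack, including the socket, so the
socket's external semaphores are never unregistered:

    "Leaky: the log retains the exceptions, and each exception retains
    the socket via its signaler context."
    [ socketStream nextPutAll: request; flush ]
        on: ConnectionTimedOut
        do: [ :ex | errorLog add: ex ].

    "Safer: copy out only the data you need, and let the exception, its
    context chain, and the socket be garbage collected."
    [ socketStream nextPutAll: request; flush ]
        on: ConnectionTimedOut
        do: [ :ex | errorLog add: DateAndTime now -> ex messageText ].]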
> Norbert
>
> On 11.10.2013 at 15:02, Norbert Hartl <[email protected]> wrote:
>
>> On 11.10.2013 at 10:53, Sven Van Caekenberghe <[email protected]> wrote:
>>
>>> On 11 Oct 2013, at 10:24, Norbert Hartl <[email protected]> wrote:
>>>
>>>> I can report that the behavior is different now. There were two new
>>>> VM releases in the PPA this week. The first one didn't work, but the
>>>> second changed something. My application has never run this long
>>>> before: it has been up for more than a day now, with an actual
>>>> external objects table size of 623, which was never reached before.
>>>> So I would say there is a chance that this particular problem is
>>>> gone. I will monitor this further; I think this wasn't the only
>>>> problem. But then it is another problem.
>>>
>>> Yeah, but not knowing your application load, 623, which would be
>>> about 200 sockets (3 semaphores per socket), is still a lot to be
>>> active at the same time. Can you in some way invoke a full GC
>>> externally, like using ZnReadEvalPrintDelegate, and see if it
>>> eventually drops due to finalization? It should, at least that is
>>> what I see.
>>
>> Yes, that's what I meant. There is always only one outgoing connection
>> at a time. Every 15 seconds one request is issued. So you see why I
>> expect to find more. I'm travelling right now and will have a deeper
>> look after I'm back.
>>
>> Norbert
>>
>>>> Thanks to all of you who've helped solve this. If it comes to the VM
>>>> being the source of problems, it is always extra annoying, because
>>>> it is way harder to change something there.
>>>>
>>>> Norbert
>>>>
>>>> On 08.10.2013 at 11:27, Igor Stasenko <[email protected]> wrote:
>>>>
>>>>> On 7 October 2013 18:36, Norbert Hartl <[email protected]> wrote:
>>>>>
>>>>> On 07.10.2013 at 16:36, Igor Stasenko <[email protected]> wrote:
>>>>>
>>>>>> 1 thing.
>>>>>>
>>>>>> Can you tell me what the given expression yields for your VM/image:
>>>>>>
>>>>>>   Smalltalk vm maxExternalSemaphores
>>>>>>
>>>>>> (if it gives you a number less than 10000000, then I think I know
>>>>>> what your problem is :)
>>>>>
>>>>> It is 10000000.
>>>>>
>>>>> What would the problem be if it were smaller?
>>>>>
>>>>> That just means your VM doesn't have an external object size cap.
>>>>> I changed the implementation to not have a hard limit (the
>>>>> arbitrarily large number is there just to be "compatible" with the
>>>>> previous implementation).
>>>>>
>>>>> This means that you can actually change the check in your image,
>>>>> completely ignore the limits, and just keep growing if necessary.
>>>>>
>>>>> Now, since you are using a VM which doesn't have a limit but the
>>>>> problem still persists, it seems like it is somewhere else.. :/
>>>>>
>>>>>> I just found that after one merge my changes got lost. We just
>>>>>> plugged them back in, and they should be back again with newer
>>>>>> VMs.. but the problem could be more than just semaphores.. if the
>>>>>> merge broke this, it may break many other things, so we need time
>>>>>> to check.
>>>>>
>>>>> I'll try to look at it some more. I'm using the pharo-vm from the
>>>>> Launchpad build. Are the changes supposed to be in this one?
>>>>>
>>>>> Norbert
>>>>>
>>>>> Launchpad? You mean the PPA? I can't say I remember all the details
>>>>> of how changes to the VM source get into the PPA distro, and how
>>>>> fast they get there. @Damien, can you enlighten us?
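[As a concrete form of Sven's suggestion earlier in the thread, one can
run Zinc's REPL delegate in the image and trigger a full GC from outside,
then watch the external objects table shrink as socket finalization
releases the semaphores. The port is arbitrary; treat this as a sketch:

    "In the image: serve a read-eval-print endpoint on port 1701."
    ZnReadEvalPrintDelegate startInServerOn: 1701.

    "From a shell, force a garbage collect in the running image:
       curl -X POST -d 'Smalltalk garbageCollect' http://localhost:1701/repl
    With 3 semaphores per socket, a table occupancy of 623 corresponds
    to roughly 200 sockets, so after finalization the count should drop
    well below that."]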
>>>>> Well, the VM which I downloaded recently using the zero-conf script
>>>>> has the limit back at 256. Just some merge mistake, which is now
>>>>> fixed.. It means that a couple of builds will use the limit-based
>>>>> implementation.. but then it will be back to my implementation.
>>>>>
>>>>>> On 7 October 2013 12:31, Norbert Hartl <[email protected]> wrote:
>>>>>>
>>>>>> On 07.10.2013 at 11:28, Henrik Johansen <[email protected]> wrote:
>>>>>>
>>>>>>> On Oct 7, 2013, at 11:16, Norbert Hartl <[email protected]> wrote:
>>>>>>>
>>>>>>>> As I need an image that runs longer than 24 hours, I'm looking at
>>>>>>>> some stuff and wondering. Can anybody explain to me the rationale
>>>>>>>> for code like this:
>>>>>>>>
>>>>>>>> maxExternalSemaphores: aSize
>>>>>>>>     "This method should never be called as the result of normal
>>>>>>>>     program execution. If it is, however, handle it differently:
>>>>>>>>     - In development, signal an error to prompt the user to set
>>>>>>>>       a bigger size at startup immediately.
>>>>>>>>     - In production, accept the cost of potentially unhandled
>>>>>>>>       interrupts, but log the action for later review.
>>>>>>>>     See the comment in maxExternalObjectsSilently: for why this
>>>>>>>>     behaviour is desirable."
>>>>>>>>     "Can't find a place where development/production is decided.
>>>>>>>>     Suggest Smalltalk image inProduction, but use an overridable
>>>>>>>>     temp meanwhile."
>>>>>>>>     | inProduction |
>>>>>>>>     self maxExternalSemaphores ifNil: [ ^ 0 ].
>>>>>>>>     inProduction := false.
>>>>>>>>     ^ inProduction
>>>>>>>>         ifTrue: [
>>>>>>>>             self maxExternalSemaphoresSilently: aSize.
>>>>>>>>             self crTrace: 'WARNING: Had to increase size of semaphore signal handling table due to many external objects concurrently in use';
>>>>>>>>                 crTrace: 'You should increase this size at startup using #maxExternalObjectsSilently:';
>>>>>>>>                 crTrace: 'Current table size: ', self maxExternalSemaphores printString ]
>>>>>>>>         ifFalse: [ "Smalltalk image"
>>>>>>>>             self error: 'Not enough space for external objects, set a larger size at startup!'
>>>>>>>>             "Smalltalk image" ]
>>>>>>>>
>>>>>>>> I have reported this once but got no feedback, so I would like to
>>>>>>>> hear a few opinions.
>>>>>>>>
>>>>>>>> The report is here: https://pharo.fogbugz.com/f/cases/10839/
>>>>>>>>
>>>>>>>> Norbert
>>>>>>>
>>>>>>> The rationale is that inProduction would be some global setting,
>>>>>>> not yet in place when the code was written…
>>>>>>> Excessive simultaneous semaphore usage is something that should be
>>>>>>> caught during development, in which case it's better to get an
>>>>>>> active notification than to have it logged somewhere.
>>>>>>
>>>>>> Agreed. But that didn't work in my case, because it needed roughly
>>>>>> 20 hours and an unstable remote backend to trigger the problem. And
>>>>>> somehow I forgot to install my logger as the Transcript, so there
>>>>>> was no warning message. I saw only dead images in the morning.
>>>>>> This is not satisfactory, but on the other hand this type of
>>>>>> problem is hard to solve anyway. My feeling tells me there is more
>>>>>> to discover. Socket resources get unregistered at finalization
>>>>>> time, but this didn't work either. I would have guessed the
>>>>>> unlikely case that no garbage collection ever ran, but that can't
>>>>>> be it, because in
>>>>>> ExternalSemaphoreTable>>#freedSlotsIn:ratherThanIncreaseSizeTo:
>>>>>> there is an explicit garbage collection.
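[A small diagnostic in the spirit of the discussion above: compare the
cap Igor asked about with the current table occupancy before and after an
explicit garbage collect, to see whether finalization frees slots at all.
Smalltalk externalObjects is an assumed accessor for the external objects
table; it varies between Pharo versions, so check your image:

    | table before after |
    Transcript show: 'cap: ',
        Smalltalk vm maxExternalSemaphores printString; cr.
    table := Smalltalk externalObjects.    "assumed accessor"
    before := table count: [ :each | each notNil ].
    Smalltalk garbageCollect.              "triggers weak finalization"
    after := table count: [ :each | each notNil ].
    Transcript show: 'occupied: ', before printString,
        ' -> ', after printString; cr.]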
>>>>>>> If I've understood correctly, it's moot on newer Pharo VMs, where
>>>>>>> there's no limit on the semaphore table size. But for legacy code,
>>>>>>> a startup item that sets the size using maxExternalObjectsSilently:
>>>>>>> (as suggested in the warning text) is still a more proper fix than
>>>>>>> setting inProduction to true and crossing your fingers hoping no
>>>>>>> signals will be lost during table growth.
>>>>>>
>>>>>> Ah, I didn't know about the risk of losing signals while resizing
>>>>>> the table. Thanks for that. Don't get me wrong, I wasn't actually
>>>>>> proposing to set inProduction. I don't think that automatically
>>>>>> growing resource management is a proper way to design a system.
>>>>>> There is always a range of resources you need for your use case.
>>>>>> Not setting an upper bound for this just covers up leaking
>>>>>> behavior.
>>>>>>
>>>>>> Norbert
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Igor Stasenko.
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Igor Stasenko.

--
Best regards,
Igor Stasenko.
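[A sketch of the startup fix Henrik recommends, for images on legacy VMs
that still enforce the limit: grow the table once at startup, before any
sockets exist, so it never has to resize (and risk dropping signals)
under load. The size 8192 is an arbitrary example, and the exact selector
differs between versions (the warning text above mentions
#maxExternalObjectsSilently:, while the accessors in the quoted method
are the #maxExternalSemaphores* family), so check your image:

    "E.g. in the image setup script, or in a class-side #startUp: method
    registered with Smalltalk addToStartUpList:."
    Smalltalk vm maxExternalSemaphoresSilently: 8192.]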
