On 11 Oct 2013, at 10:24, Norbert Hartl wrote:

> I can report that the behavior is different now. There were two new vm
> releases this week in ppa. The first one didn't work but the second changed
> something. My application was never running that long. It is more than a day
> now having an actual external objects table size of 623 which wasn't ever
> reached before. So I would say that there is chance that this particular
> problem is gone. I monitor this further and I think that this wasn't the only
> problem. But then it is another problem.

Yeah, but not knowing your application load, 623, which would be about 200 sockets (3 semaphores per socket), is still a lot to be active at the same time. Can you in some way invoke a full GC externally, for example using ZnReadEvalPrintDelegate, and see if it eventually drops due to finalization? It should, at least that is what I see.

> Thanks to all of you who've helped solving this. If it comes to the VM being
> the source of problems it is always extra annoying because it is way harder
> to change something there.
>
> Norbert
>
>
> On 08.10.2013 at 11:27, Igor Stasenko wrote:
>
>>
>> On 7 October 2013 18:36, Norbert Hartl wrote:
>>
>> On 07.10.2013 at 16:36, Igor Stasenko wrote:
>>
>>> 1 thing.
>>>
>>> can you tell me what the given expression yields for your VM/image:
>>>
>>> Smalltalk vm maxExternalSemaphores
>>>
>>> (if it gives you a number less than 10000000 then i think i know what
>>> your problem is :)
>>>
>> It is 10000000
>>
>> What would be the problem if it would be smaller?
>>
>>
>> that just means your VM doesn't have an external object size cap.
>> I changed the implementation to not have a hard limit (the arbitrary large
>> number is there just to be "compatible" with the previous implementation).
>>
>> This means that you can actually change the check in your image and
>> completely ignore limits and just keep growing if it is necessary.
>>
>> Now, since you are using a VM which doesn't have a limit, but the problem
>> still persists, it seems like it is somewhere else.. :/
>>> i just found that after one merge, my changes got lost
>>> we've just plugged them back in, and it should be back again with newer
>>> VMs..
>>> but the problem could be more than just semaphores.. if the merge broke
>>> this, it may break many other things, so we need time to check
>>>
>> I try to look at it some more time. I'm using the pharo-vm from the
>> launchpad build. Are the changes supposed to be in this one?
>>
>> Norbert
>>
>> Launchpad? You mean ppa? I can't say i remember all the details of how
>> changes to the VM source get into the ppa distro, and how fast they get
>> there. @Damien, can you enlighten us?
>>
>>
>> Well, the VM which i downloaded recently using the zero-conf script has the
>> limit back at 256. Just some merge mistake, which is now fixed.. it means
>> that a couple of builds will use the limit-based implementation.. but then
>> it will be back to my implementation.
>>>
>>>
>>> On 7 October 2013 12:31, Norbert Hartl wrote:
>>>
>>> On 07.10.2013 at 11:28, Henrik Johansen wrote:
>>>
>>>>
>>>> On Oct 7, 2013, at 11:16 , Norbert Hartl wrote:
>>>>
>>>>> As I need an image that runs longer than 24 hours I'm looking at some
>>>>> stuff and wonder. Can anybody explain to me the rationale for code like
>>>>> this
>>>>>
>>>>> maxExternalSemaphores: aSize
>>>>>     "This method should never be called as result of normal program
>>>>>     execution. If it is however, handle it differently:
>>>>>     - In development, signal an error to prompt user to set a bigger size
>>>>>       at startup immediately.
>>>>>     - In production, accept the cost of potentially unhandled interrupts,
>>>>>       but log the action for later review.
>>>>>
>>>>>     See comment in maxExternalObjectsSilently: why this behaviour is
>>>>>     desirable, "
>>>>>     "Can't find a place where development/production is decided.
>>>>>     Suggest Smalltalk image inProduction, but use an overridable temp
>>>>>     meanwhile. "
>>>>>     | inProduction |
>>>>>     self maxExternalSemaphores
>>>>>         ifNil: [^ 0].
>>>>>     inProduction := false.
>>>>>     ^ inProduction
>>>>>         ifTrue: [self maxExternalSemaphoresSilently: aSize.
>>>>>             self crTrace: 'WARNING: Had to increase size of semaphore signal handling table due to many external objects concurrently in use';
>>>>>                 crTrace: 'You should increase this size at startup using #maxExternalObjectsSilently:';
>>>>>                 crTrace: 'Current table size: ' , self maxExternalSemaphores printString]
>>>>>         ifFalse: ["Smalltalk image"
>>>>>             self error: 'Not enough space for external objects, set a larger size at startup!'
>>>>>             "Smalltalk image"]
>>>>>
>>>>> I have reported this once but got no feedback so I would like to have a
>>>>> few opinions.
>>>>>
>>>>> The report is here: https://pharo.fogbugz.com/f/cases/10839/
>>>>>
>>>>> Norbert
>>>>
>>>> The rationale is that inProduction would be some global setting, not yet
>>>> in place when the code was written…
>>>> Excessive simultaneous Semaphore usage is something that should be caught
>>>> during development, in which case it's better to get an active
>>>> notification than to have it logged somewhere.
>>>
>>> Agreed. But it didn't work in my case because it needed roughly 20 hours
>>> and an unstable remote backend to trigger the problem. And somehow I forgot
>>> to install my logger as Transcript so there is no warning message. I saw
>>> only dead images in the morning.
>>> This is not satisfactory but on the other hand this type of problem is hard
>>> to solve anyway. My feeling tells me there is more to discover. Socket
>>> resources get unregistered at finalization time but this didn't work
>>> either. I would have said that the unlikely situation that no garbage
>>> collection ran could be the case. But it can't because in
>>> ExternalSemaphoreTable>>#freedSlotsIn:ratherThanIncreaseSizeTo: there is
>>> explicit garbage collection.
>>>
>>>> If I've understood correctly, it's moot on newer Pharo VMs, where there's
>>>> no limit on the semtable size, but for legacy code a startup item setting
>>>> the size using maxExternalObjectsSilently: (as suggested in the Warning
>>>> text) is still a more proper fix than setting inProduction to true and
>>>> crossing your fingers hoping no signals will be lost during table growth.
>>>
>>> Ah, I didn't know about the risk of losing signals while resizing the
>>> table. Thanks for that. Don't get me wrong, I wasn't proposing to set
>>> inProduction in effect. I don't think that automatically growing resource
>>> management is a proper way to design a system. There is always a range of
>>> resources you need for your use case. Not setting an upper bound for this
>>> just covers leaking behavior.
>>>
>>> Norbert
>>>
>>>
>>> --
>>> Best regards,
>>> Igor Stasenko.
>>
>>
>> --
>> Best regards,
>> Igor Stasenko.
>
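
The check suggested at the top of this thread (force a full GC and see whether the count drops) could be tried from a workspace, or through a remote evaluator such as ZnReadEvalPrintDelegate, roughly as follows. This is only a sketch: it counts live Socket instances rather than reading the external objects table directly, and it takes the "3 semaphores per socket" figure from the reply above as an assumption. Finalization runs in its own process, so the numbers may take a moment to settle after the GC.

```smalltalk
"Sketch only. Assumes 3 external semaphores per open Socket, as stated
 earlier in the thread. Counts direct Socket instances, not subclasses."
| socketsBefore socketsAfter |
socketsBefore := Socket allInstances size.
Smalltalk garbageCollect.	"full GC; finalization may lag slightly behind"
socketsAfter := Socket allInstances size.
Transcript
	show: 'Sockets before full GC: ', socketsBefore printString; cr;
	show: 'Sockets after full GC: ', socketsAfter printString; cr;
	show: 'Implied semaphore slots: ', (socketsAfter * 3) printString; cr;
	show: 'VM limit: ', Smalltalk vm maxExternalSemaphores printString; cr.
```

If the count after the GC stays close to the count before it, the sockets are probably still referenced somewhere in the application, which would point at a leak rather than at the VM table.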
