Re: [Pharo-dev] external semaphores…again

Norbert Hartl Fri, 11 Oct 2013 06:06:27 -0700


On 11.10.2013 at 10:53, Sven Van Caekenberghe <[email protected]> wrote:


> 
> On 11 Oct 2013, at 10:24, Norbert Hartl <[email protected]> wrote:
> 
>> I can report that the behavior is different now. There were two new VM 
>> releases this week in the PPA. The first one didn't work but the second 
>> changed something. My application has never run this long before. It has 
>> been up for more than a day now, with an actual external objects table size 
>> of 623, which was never reached before. So I would say there is a chance 
>> that this particular problem is gone. I will keep monitoring this, and I 
>> think this wasn't the only problem. But then it is another problem.
> 
> Yeah, but not knowing your application load, 623, which would be about 200 
> sockets (3 semaphores per socket), is still a lot to be active at the same 
> time. Can you in some way invoke a full GC externally, for example using 
> ZnReadEvalPrintDelegate, and see if it eventually drops due to finalization? 
> It should, at least that is what I see.
> 
Yes, that's what I meant. There is always only one outgoing connection at a 
time. Every 15 seconds one request is issued. So you see why I expect there 
is more to find.
I'm travelling right now and will take a deeper look once I'm back.
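
For reference, the check suggested above could be evaluated in a workspace or
through a remote REPL roughly as follows. This is only a sketch: it assumes
that slot 39 of the special objects array holds the external objects table,
and the temporary name "table" is just for illustration. Finalization runs in
its own process, so the numbers may only drop a moment after the collection.

```smalltalk
| table |
"Run a full garbage collection so finalization can release unused sockets."
Smalltalk garbageCollect.

"Assumption: slot 39 of the special objects array is the external objects table."
table := Smalltalk specialObjectsArray at: 39.

"Answer an array: table size, slots currently in use, configured maximum."
{ table size.
  table count: [:each | each notNil].
  Smalltalk vm maxExternalSemaphores }
```

With one outgoing connection at a time and 3 semaphores per socket, the
second number would be expected to stay far below 623 if nothing is leaking.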

Norbert
>> Thanks to all of you who helped solve this. When the VM turns out to be 
>> the source of problems it is always extra annoying because it is much 
>> harder to change something there.
>> 
>> Norbert
>> 
>> 
>> On 08.10.2013 at 11:27, Igor Stasenko <[email protected]> wrote:
>> 
>>> 
>>> 
>>> 
>>> On 7 October 2013 18:36, Norbert Hartl <[email protected]> wrote:
>>> 
>>> On 07.10.2013 at 16:36, Igor Stasenko <[email protected]> wrote:
>>> 
>>>> One thing.
>>>> 
>>>> Can you tell me what the following expression yields for your VM/image:
>>>> 
>>>> Smalltalk vm maxExternalSemaphores
>>>> 
>>>> (if it gives you a number less than 10000000 then i think i know what 
>>>> your problem is :)
>>> It is 10000000
>>> 
>>> What would be the problem if it were smaller?
>>> 
>>> 
>>> that just means your VM doesn't have an external object size cap.
>>> I changed the implementation to not have a hard limit (the arbitrarily 
>>> large number
>>> is there just to be "compatible" with the previous implementation).
>>> 
>>> This means that you can actually change the check in your image to 
>>> completely ignore limits 
>>> and just keep growing if necessary. 
>>> 
>>> Now, since you are using a VM which doesn't have a limit, but the problem 
>>> still persists,
>>> it seems like it is somewhere else.. :/ 
>>>> i just found that after one merge, my changes got lost.
>>>> we've just plugged them back in, and it should be back again with newer 
>>>> VMs..
>>>> but the problem could be more than just semaphores.. if the merge broke 
>>>> this, it may have broken 
>>>> many other things, so we need time to check
>>> I'll try to look at it some more. I'm using the pharo-vm from the 
>>> launchpad build. Are the changes supposed to be in this one?
>>> 
>>> Norbert
>>> 
>>> Launchpad? You mean the PPA? I can't say i remember all the details of how 
>>> changes to the VM source
>>> get into the PPA distro, and how fast they get there. @Damien, can you 
>>> enlighten us?
>>> 
>>> 
>>> Well, the VM which i downloaded recently using the zero-conf script has the 
>>> limit back at 256. Just a merge mistake, which is now fixed.. it means that 
>>> a couple of builds will use the limit-based implementation.. but then 
>>> it will be back to my implementation.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 7 October 2013 12:31, Norbert Hartl <[email protected]> wrote:
>>>> 
>>>> On 07.10.2013 at 11:28, Henrik Johansen 
>>>> <[email protected]> wrote:
>>>> 
>>>>> 
>>>>> On Oct 7, 2013, at 11:16 , Norbert Hartl <[email protected]> wrote:
>>>>> 
>>>>>> As I need an image that runs longer than 24 hours I'm looking at some 
>>>>>> stuff and wondering. Can anybody explain to me the rationale for code 
>>>>>> like this
>>>>>> 
>>>>>> maxExternalSemaphores: aSize 
>>>>>>    "This method should never be called as result of normal program
>>>>>>    execution. If it is however, handle it differently:
>>>>>>    - In development, signal an error to promt user to set a bigger size
>>>>>>    at startup immediately.
>>>>>>    - In production, accept the cost of potentially unhandled interrupts,
>>>>>>    but log the action for later review.
>>>>>>    
>>>>>>    See comment in maxExternalObjectsSilently: why this behaviour is
>>>>>>    desirable, "
>>>>>>    "Can't find a place where development/production is decided.
>>>>>>    Suggest Smalltalk image inProduction, but use an overridable temp
>>>>>>    meanwhile. "
>>>>>>    | inProduction |
>>>>>>    self maxExternalSemaphores
>>>>>>        ifNil: [^ 0].
>>>>>>    inProduction := false.
>>>>>>    ^ inProduction
>>>>>>        ifTrue: [self maxExternalSemaphoresSilently: aSize.
>>>>>>            self crTrace: 'WARNING: Had to increase size of semaphore signal handling table due to many external objects concurrently in use';
>>>>>>                 crTrace: 'You should increase this size at startup using #maxExternalObjectsSilently:';
>>>>>>                 crTrace: 'Current table size: ' , self maxExternalSemaphores printString]
>>>>>>        ifFalse: ["Smalltalk image"
>>>>>>            self error: 'Not enough space for external objects, set a larger size at startup!'
>>>>>>            "Smalltalk image"]
>>>>>> 
>>>>>> I have reported this once but got no feedback, so I'd like to have a few 
>>>>>> opinions.
>>>>>> 
>>>>>> The report is here: https://pharo.fogbugz.com/f/cases/10839/
>>>>>> 
>>>>>> Norbert
>>>>> 
>>>>> The rationale is that inProduction would be some global setting, not yet 
>>>>> in place when the code was written…
>>>>> Excessive simultaneous Semaphore usage is something that should be caught 
>>>>> during development, in which case it's better to get an active 
>>>>> notification, than having it logged somewhere.
>>>> 
>>>> Agreed. But it didn't work in my case because it needed roughly 20 hours 
>>>> and an unstable remote backend to trigger the problem. And somehow I forgot 
>>>> to install my logger as Transcript so there is no warning message. I saw 
>>>> only dead images in the morning. 
>>>> This is not satisfactory but on the other hand this type of problem is hard 
>>>> to solve anyway. My feeling tells me there is more to discover. Socket 
>>>> resources get unregistered at finalization time but this didn't work 
>>>> either. I would have said that the unlikely situation that no garbage 
>>>> collection ran could be the case. But it can't be, because in 
>>>> ExternalSemaphoreTable>>#freedSlotsIn:ratherThanIncreaseSizeTo: there is 
>>>> an explicit garbage collection. 
>>>> 
>>>>> If I've understood correctly, it's moot on newer Pharo VMs, where 
>>>>> there's no limit on the semtable size, but for legacy code a startup item 
>>>>> setting size using maxExternalObjectsSilently: (as suggested in the 
>>>>> Warning text), is still a more proper fix than setting inProduction to 
>>>>> true and crossing your fingers hoping no signals will be lost during 
>>>>> table growth.
>>>> 
>>>> Ah, I didn't know about the risk of losing signals while resizing the 
>>>> table. Thanks for that. Don't get me wrong, I wasn't proposing to actually 
>>>> set inProduction. I don't think that automatically growing resource 
>>>> management is a proper way to design a system. There is always a range of 
>>>> resources you need for your use case. Not setting an upper bound for this 
>>>> just covers up leaking behavior.
>>>> 
>>>> Norbert
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Best regards,
>>>> Igor Stasenko.
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Best regards,
>>> Igor Stasenko.
> 
>
