On 11 Oct 2013, at 10:24, Norbert Hartl <[email protected]> wrote:

> I can report that the behavior is different now. There were two new VM
> releases this week in the ppa. The first one didn't work, but the second
> changed something. My application has never run this long before. It has been
> up for more than a day now, with an external objects table size of 623, which
> was never reached before. So I would say there is a chance that this particular
> problem is gone. I will keep monitoring, and I think this wasn't the only
> problem. But then that is a different problem.

Yeah, but without knowing your application load, 623, which would be about 200
sockets (3 semaphores per socket), is still a lot to be active at the same
time. Can you in some way invoke a full GC externally, for example using
ZnReadEvalPrintDelegate, and see if it eventually drops due to finalization? It
should; at least that is what I see.
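
A minimal sketch of such a check, to be evaluated inside the running image.
It assumes that the external objects table lives in slot 39 of the special
objects array (please verify this for your VM/image) and that freed slots are
set to nil rather than the array shrinking, so it counts occupied slots:

```smalltalk
"Sketch only: slot 39 and the nil-on-free behaviour are assumptions."
| inUse before after |
inUse := [ | table |
	table := Smalltalk specialObjectsArray at: 39.
	table count: [:each | each notNil] ].
before := inUse value.
Smalltalk garbageCollect.
"Give the finalization process some time to run."
(Delay forSeconds: 2) wait.
after := inUse value.
'in use before GC: ' , before printString , ', after: ' , after printString
```

The last expression is the result, so a read-eval-print endpoint should hand
back the two counts as a string.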

> Thanks to all of you who helped solve this. When the VM turns out to be
> the source of a problem it is always extra annoying, because it is way harder
> to change something there.
> 
> Norbert
>  
> 
> On 08.10.2013, at 11:27, Igor Stasenko <[email protected]> wrote:
> 
>> 
>> 
>> 
>> On 7 October 2013 18:36, Norbert Hartl <[email protected]> wrote:
>> 
>> On 07.10.2013, at 16:36, Igor Stasenko <[email protected]> wrote:
>> 
>>> 1 thing.
>>> 
>>> Can you tell me what the following expression yields for your VM/image:
>>> 
>>> Smalltalk vm maxExternalSemaphores
>>> 
>>> (If it gives you a number less than 10000000, then I think I know what your
>>> problem is :)
>>> 
>> It is 10000000
>> 
>> What would be the problem if it were smaller?
>> 
>> 
>> That just means your VM doesn't have an external object size cap.
>> I changed the implementation to not have a hard limit (the arbitrarily large
>> number
>> is there just to be "compatible" with the previous implementation).
>> 
>> This means that you can actually change the check in your image and
>> completely ignore limits
>> and just keep growing if necessary.
>> 
>> Now, since you are using a VM which doesn't have a limit but the problem still
>> persists, it seems the cause is somewhere else.. :/
>>> I just found that my changes got lost after one merge.
>>> We've just plugged them back in, and they should be back again with newer
>>> VMs..
>>> But the problem could be more than just semaphores.. if the merge broke this,
>>> it may have broken
>>> many other things, so we need time to check.
>>> 
>> I'll try to look at it some more. I'm using the pharo-vm from the
>> launchpad build. Are the changes supposed to be in this one?
>> 
>> Norbert
>> 
>> Launchpad? You mean the ppa? I can't say I remember all the details of how
>> changes to the VM source
>> get into the ppa distro, and how fast they get there. @Damien, can you
>> enlighten us?
>> 
>> 
>> Well, the VM which I downloaded recently using the zero-conf script has the
>> limit back at 256. Just a merge mistake, which is now fixed.. it means that a
>> couple of builds will use the limit-based implementation.. but then
>> it will be back to my implementation.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 7 October 2013 12:31, Norbert Hartl <[email protected]> wrote:
>>> 
>>> On 07.10.2013, at 11:28, Henrik Johansen
>>> <[email protected]> wrote:
>>> 
>>>> 
>>>> On Oct 7, 2013, at 11:16 , Norbert Hartl <[email protected]> wrote:
>>>> 
>>>>> As I need an image that runs longer than 24 hours, I'm looking at some
>>>>> code and wondering: can anybody explain to me the rationale for code like
>>>>> this?
>>>>> 
>>>>> maxExternalSemaphores: aSize 
>>>>>   "This method should never be called as result of normal program
>>>>>   execution. If it is however, handle it differently:
>>>>>   - In development, signal an error to prompt user to set a bigger size
>>>>>   at startup immediately.
>>>>>   - In production, accept the cost of potentially unhandled interrupts,
>>>>>   but log the action for later review.
>>>>>   
>>>>>   See comment in maxExternalObjectsSilently: why this behaviour is
>>>>>   desirable, "
>>>>>   "Can't find a place where development/production is decided.
>>>>>   Suggest Smalltalk image inProduction, but use an overridable temp
>>>>>   meanwhile. "
>>>>>   | inProduction |
>>>>>   self maxExternalSemaphores
>>>>>           ifNil: [^ 0].
>>>>>   inProduction := false.
>>>>>   ^ inProduction
>>>>>           ifTrue: [self maxExternalSemaphoresSilently: aSize.
>>>>>                   self crTrace: 'WARNING: Had to increase size of semaphore signal handling table due to many external objects concurrently in use';
>>>>>                            crTrace: 'You should increase this size at startup using #maxExternalObjectsSilently:';
>>>>>                            crTrace: 'Current table size: ' , self maxExternalSemaphores printString]
>>>>>           ifFalse: ["Smalltalk image"
>>>>>                   self error: 'Not enough space for external objects, set a larger size at startup!'
>>>>>                   "Smalltalk image"]
>>>>> 
>>>>> I have reported this once but got no feedback, so I'd like to have a few
>>>>> opinions.
>>>>> 
>>>>> The report is here: https://pharo.fogbugz.com/f/cases/10839/
>>>>> 
>>>>> Norbert
>>>> 
>>>> The rationale is that inProduction would be some global setting, not yet 
>>>> in place when the code was written…
>>>> Excessive simultaneous Semaphore usage is something that should be caught 
>>>> during development, in which case it's better to get an active 
>>>> notification, than having it logged somewhere.
>>> 
>>> Agreed. But it didn't work in my case, because it took roughly 20 hours and
>>> an unstable remote backend to trigger the problem. And somehow I forgot to
>>> install my logger as Transcript, so there is no warning message. I saw only
>>> dead images in the morning.
>>> This is not satisfactory, but on the other hand this type of problem is hard
>>> to solve anyway. My feeling is that there is more to discover. Socket
>>> resources get unregistered at finalization time, but this didn't work
>>> either. I would have said that the unlikely situation that no garbage
>>> collection ran could be the case. But it can't be, because in
>>> ExternalSemaphoreTable>>#freedSlotsIn:ratherThanIncreaseSizeTo: there is an
>>> explicit garbage collection.
>>> 
>>>> If I've understood correctly, it's moot on newer Pharo VMs, where there's
>>>> no limit on the semtable size, but for legacy code a startup item setting 
>>>> size using maxExternalObjectsSilently: (as suggested in the Warning text), 
>>>> is still a more proper fix than setting inProduction to true and crossing 
>>>> your fingers hoping no signals will be lost during table growth.
>>> 
>>> Ah, I didn't know about the risk of losing signals while resizing the
>>> table. Thanks for that. Don't get me wrong, I wasn't proposing to actually
>>> set inProduction. I don't think that automatically growing resource
>>> management is a proper way to design a system. There is always a range of
>>> resources you need for your use case. Not setting an upper bound for this
>>> just covers up leaking behavior.
>>> 
>>> Norbert
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Best regards,
>>> Igor Stasenko.
>> 
>> 
>> 
>> 
>> -- 
>> Best regards,
>> Igor Stasenko.
> 

