On 17.9.2015 at 14:16, Christian Mack wrote:
> Hello
> 
> On 14.09.2015 at 11:38, Josef Škurek wrote:
>>
>> One of the /usr/sbin/sogod workers is filling up the WHOLE memory.
>>
>> In /var/log/sogo/sogo.log the process with that PID logs:
>> Sep 14 11:07:34 sogod [8883]: 192.168.106.2 "GET /SOGo/ HTTP/1.1" 200
>> 4281/0 0.036 11290 62% -748K
>> Sep 14 11:07:39 sogod [8883]: <0x0x7ff663705cf8[NGImap4Client]> Note: no
>> key found for sorting, using 'DATE': (null)
>> Sep 14 11:08:38 sogod [5475]: [WARN] <0x0x7ff662ed87b8[WOWatchDogChild]>
>> pid 8883 has been hanging in the same request for 1 minutes
>> Sep 14 11:09:38 sogod [5475]: [WARN] <0x0x7ff662ed87b8[WOWatchDogChild]>
>> pid 8883 has been hanging in the same request for 2 minutes
>> Sep 14 11:10:38 sogod [5475]: [WARN]
>>
>> and the warning about hangup repeats on and on. I always end up killing
>> the process, but another pops up with the same problem, always with the
>> message
>>
>> Note: no key found for sorting, using 'DATE': (null)
>>
>> preceding the process freeze (insanity?), but so far for different users
>> connecting through the web interface.
>>
>> google-fu found
>> https://forum.zentyal.org/index.php?topic=23256.0
>> indicating it is somehow connected to user settings?
>>
>> Any ideas besides constant kill?
>>
> 
> How much RAM do you have?
> How many workers do you have?
> What did you set for SxVMemLimit?
> Do you have users zipping their mailboxes when your RAM is filling up?
> Do you have big /tmp/OGo* files that do not vanish after some minutes?
> 
> 
> Kind regards,
> Christian Mack
> 

Hello, Christian.

To answer your questions:
- Gen2 Hyper-V guest; I tried adding more RAM, it is at 12 GB now.
- There were 3 workers, there are 20 now. The problem seemed related to
the worker count: there were a lot of
[WOWatchDog]> No child available to handle incoming request!
messages after an en masse switch to webmail usage.
- SxVMemLimit = 384 from SOGoDefaults.plist, no entry in sogo.conf.
- I don't know about zipping; many of those requests were first-time
webmail logins, though (100+, approx. 60% of users).
- I don't know about big tmp files at the time; there are approx. 9 now,
51 KB to 387 KB.
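To judge whether the worker count is sufficient now, the two watchdog messages mentioned in this thread can be counted in the log. This is only a minimal sketch: it assumes the message texts are exactly as quoted here and that each log entry sits on a single line, and the function name count_watchdog_events is made up for illustration.

```shell
#!/bin/sh
# Count watchdog messages in a SOGo log file.
# Assumption: the message texts match those quoted in this thread.
count_watchdog_events() {
    log="$1"
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, hence the "|| true".
    no_child=$(grep -c 'No child available to handle incoming request' "$log" || true)
    hanging=$(grep -c 'has been hanging in the same request' "$log" || true)
    echo "no_child=$no_child hanging=$hanging"
}
```

Usage: count_watchdog_events /var/log/sogo/sogo.log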

The problem seems to be gone now: no more frozen processes today. Over the
last two days there were some occurrences, seemingly one for each user who
got "No child available..." before I increased the number of workers.

For lack of anything else to try (and out of desperation) I kept killing
those memory-greedy processes. There have been none so far (for 16 hours
now). The command used (for your amusement):

# Watch the SOGo log and kill every worker the watchdog reports as hanging.
tail -f /var/log/sogo/sogo.log |
grep --line-buffered "for 1 minutes" |
sed -u 's/^.*pid //' |
sed -u 's/ .*//' |
while read -r proces;
do kill -9 "$proces"; echo -n "Delikvent: $proces " ; date ;
done
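The loop above kills whatever PID it extracts from the log line. A slightly more careful variant could check that the PID still belongs to a sogod process before sending the signal. This is only a sketch: the helper names hung_pid and kill_if_sogod are made up for illustration, and it assumes the watchdog line has the format quoted earlier in this thread and that the worker's command name starts with "sogod".

```shell
#!/bin/sh
# Print the PID from a watchdog "hanging ... for 1 minutes" line,
# or nothing if the line does not match.
hung_pid() {
    printf '%s\n' "$1" |
        sed -n 's/^.*pid \([0-9][0-9]*\) has been hanging in the same request for 1 minutes.*$/\1/p'
}

# Kill a PID only if it still belongs to a process named sogod.
kill_if_sogod() {
    pid="$1"
    # "ps -o comm=" prints just the command name, without a header.
    name=$(ps -p "$pid" -o comm= 2>/dev/null || true)
    case "$name" in
        sogod*) kill -9 "$pid" && echo "killed $pid at $(date)" ;;
        *)      echo "skipped $pid (not sogod)" ;;
    esac
}

# Possible use, following the log as in the original command:
#   tail -f /var/log/sogo/sogo.log | while read -r line; do
#       pid=$(hung_pid "$line"); [ -n "$pid" ] && kill_if_sogod "$pid"
#   done
```

Checking the command name guards against killing an unrelated process if the PID has been reused between the log entry and the kill.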

Should I be worried it will happen again, or was it because of the rush
of new users?

Thank you


Josef
-- 
[email protected]
https://inverse.ca/sogo/lists
