On 17.09.2015 at 15:04, Josef Škurek wrote:
>
> On 17.9.2015 at 14:16, Christian Mack wrote:
>> Hello
>>
>> On 14.09.2015 at 11:38, Josef Škurek wrote:
>>>
>>> One of /usr/sbin/sogod workers filing up WHOLE memory.
>>>
>>> in /var/log/sogo/sogo.log that process PID logs
>>> Sep 14 11:07:34 sogod [8883]: 192.168.106.2 "GET /SOGo/ HTTP/1.1" 200 4281/0 0.036 11290 62% -748K
>>> Sep 14 11:07:39 sogod [8883]: <0x0x7ff663705cf8[NGImap4Client]> Note: no key found for sorting, using 'DATE': (null)
>>> Sep 14 11:08:38 sogod [5475]: [WARN] <0x0x7ff662ed87b8[WOWatchDogChild]> pid 8883 has been hanging in the same request for 1 minutes
>>> Sep 14 11:09:38 sogod [5475]: [WARN] <0x0x7ff662ed87b8[WOWatchDogChild]> pid 8883 has been hanging in the same request for 2 minutes
>>> Sep 14 11:10:38 sogod [5475]: [WARN]
>>>
>>> and the warning about hangup repeats on and on. I always end up killing
>>> the process, but another pops up with the same problem, always with the
>>> message
>>>
>>> Note: no key found for sorting, using 'DATE': (null)
>>>
>>> preceding the process freeze(insanity?), but so far for different users
>>> connecting through web interface
>>>
>>> google-fu found
>>> https://forum.zentyal.org/index.php?topic=23256.0
>>> indicating it is somehow connected to user settings?
>>>
>>> Any ideas besides constant kill?
>>>
>>
>> How many RAM do you have?
>> How many workers do you have?
>> What did you set for SxVMemLimit?
>> Do you have users zipping their mailboxes when your RAM is filling up?
>> Do you have big /tmp/OGo* files who do not vanish after some minutes?
>>
>>
>> Kind regards,
>> Christian Mack
>>
>
> Hello, Christian.
>
> To answer your questions:
> - Gen2 Hyper-V guest, tried more RAM, at 12GB now.
> - Workers were 3, at 20 now. Problem seemed related to workers, a lot of
>   [WOWatchDog]> No child available to handle incoming request!
>   after en masse switch to webmail usage.
> - SxVMemLimit = 384 from SOGoDefaults.plist, no entry in sogo.conf
> - Don't know about zipping, many of those requests were first time
>   webmail logins, though (100+, cca 60% of users).
> - Don't know about big tmp at the time, there are cca 9 now 51Kb - 387Kb
>
> Problem seems to be gone now, no more frozen processes today. For last
> two days there were some occurrences, seems like one for each user, who
> got "No child available..." before I increased the number of workers.
>
> For the lack of things to try (and out of desperation) I kept killing
> those memory-greedy processes. There were none so far (for 16 hours
> now). Command used (for your amusement):
>
> tail -f /var/log/sogo/sogo.log |
> grep --line-buffered "1 minute" |
> sed -u 's/^.*pid\ //' |
> sed -u 's/\ .*//' |
> while read proces;
> do kill -9 $proces; echo -n "Delikvent: "$proces" " ; date ;
> done
>
> Should I be worried it will happen again or was it because of the new
> users rush?
>

It happened because you didn't have enough workers for the rush.

For such rush traffic you could also increase WOListenQueueSize.
This tells sogod how many requests to queue up for its workers.
It will not decrease your workload, but clients are not rejected
immediately when no worker is available.

You do not need to kill those workers yourself.
Instead you can decrease WOWatchDogRequestTimeout (in minutes), which is
set to 10 minutes by default.
Then the SOGo WatchDog will kill those processes for you.

Your RAM should be approx.:
  worker count * SxVMemLimit (which is in MB)
In your case 20 * 384 MB = 7680 MB (~ 8 GB)

With 20 workers you should be able to handle all 200 users, as long as
they use the web interface and/or CalDAV/CardDAV.
If you have ActiveSync users, you have to reserve one worker per
ActiveSync connection!


Kind regards,
Christian Mack

-- 
Christian Mack
Universität Konstanz
Kommunikations-, Informations-, Medienzentrum (KIM)
Abteilung Basisdienste
78457 Konstanz
+49 7531 88-4416
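
PS: The sizing rule above (worker count * SxVMemLimit) can be checked with a few lines of shell. This is only a rough sketch using the numbers from this thread. The function name `estimate_ram_mb` and the variable names are made up for illustration and are not part of SOGo, and I am assuming the worker count is whatever you have configured for sogod (20 here).

```shell
#!/bin/sh
# Rough RAM estimate for sogod workers: worker count * SxVMemLimit (in MB).
# The values below are the ones mentioned in this thread; adjust as needed.
workers=20          # number of sogod workers (assumed from the thread)
vmem_limit_mb=384   # SxVMemLimit, in MB

# estimate_ram_mb is a hypothetical helper, not a SOGo tool.
# $1 = worker count, $2 = per-worker memory limit in MB
estimate_ram_mb() {
    echo $(( $1 * $2 ))
}

total_mb=$(estimate_ram_mb "$workers" "$vmem_limit_mb")
echo "Estimated worker RAM: ${total_mb} MB"
```

With 20 workers and a 384 MB limit this prints 7680 MB, which matches the figure above. It covers only the workers themselves, so leave headroom for the OS and other services on the same machine.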
