Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
On 17-01-26 07:26 PM, Ludovic Marcotte (lmarco...@inverse.ca) wrote:
> On 2017-01-26 6:33 PM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:
>> I followed the FAQ to install -dbg packages and run GDB. I only had to change -WOUseWatchDog to YES, otherwise only 1 worker runs instead of 600.
>> Unfortunately the stack trace does not look very useful. Any suggestions?
>
> Install all debugging symbols. (i.e., -dbg packages)

I think I did install the recommended debug symbols. Am I missing any?

root@mail01:~# dpkg -l | grep -e "-dbg"
ii  libc6-dbg:amd64          2.23-0ubuntu5           amd64  GNU C Library: detached debugging symbols
ii  libgcc1-dbg:amd64        1:6.0.1-0ubuntu1        amd64  GCC support library (debug symbols)
ii  libgnustep-base1.24-dbg  1.24.7-1build2          amd64  GNUstep Base library - debugging symbols
ii  libobjc4-dbg:amd64       5.4.0-6ubuntu1~16.04.4  amd64  Runtime library for GNU Objective-C applications (debug symbols)
ii  sogo-dbg                 3.2.5.20170123-1        amd64  a modern and scalable groupware - debugging symbols
ii  sope4.9-dbg              4.9.r1664.20170114      amd64  Debugging files

--
users@sogo.nu
https://inverse.ca/sogo/lists
Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
On 2017-01-26 6:33 PM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:
> I followed the FAQ to install -dbg packages and run GDB. I only had to change -WOUseWatchDog to YES, otherwise only 1 worker runs instead of 600.
> Unfortunately the stack trace does not look very useful. Any suggestions?

Install all debugging symbols. (i.e., -dbg packages)

--
Ludovic Marcotte
lmarco...@inverse.ca :: +1.514.755.3630 :: http://inverse.ca
Inverse inc. :: Leaders behind SOGo (http://sogo.nu), PacketFence (http://packetfence.org) and Fingerbank (http://fingerbank.org)
Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
On 17-01-24 07:14 AM, Ludovic Marcotte (lmarco...@inverse.ca) wrote:
> On 2017-01-24 1:42 AM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:
>> I am no closer to finding the issue, but I am getting this message pretty consistently:
>>
>> *** stack smashing detected ***: /usr/sbin/sogod terminated
>>
>> But no core file.
>
> Try starting SOGo from gdb - so don't reduce the number of workers, just start it from gdb (i.e., the parent process).

I followed the FAQ to install -dbg packages and run GDB. I only had to change -WOUseWatchDog to YES, otherwise only 1 worker runs instead of 600.
Unfortunately the stack trace does not look very useful. Any suggestions?

(gdb) r
Starting program: /usr/sbin/sogod -WOUseWatchDog YES -WONoDetach YES -WOPort 2 -WOWorkersCount 600 -WOLogFile /var/log/sogo/sogo-crash.log -WOPidFile /tmp/sogo.pid
^CQuit
(gdb) b [NSException raise]
Function "[NSException raise]" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 ([NSException raise]) pending.
(gdb) b abort
Breakpoint 2 at 0xa7e0
(gdb) c
Continuing.
*** stack smashing detected ***: /usr/sbin/sogod terminated

Program received signal SIGABRT, Aborted.
0x74448428 in ?? ()
(gdb) bt
#0 0x74448428 in ?? ()
#1 0x7444a02a in ?? ()
#2 0x0020 in ?? ()
#3 0x in ?? ()
(gdb) bt full
#0 0x74448428 in ?? ()
No symbol table info available.
#1 0x7444a02a in ?? ()
No symbol table info available.
#2 0x0020 in ?? ()
No symbol table info available.
#3 0x in ?? ()
No symbol table info available.
(gdb)
Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
On 2017-01-24 1:42 AM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:
> I am no closer to finding the issue, but I am getting this message pretty consistently:
>
> *** stack smashing detected ***: /usr/sbin/sogod terminated
>
> But no core file.

Try starting SOGo from gdb - so don't reduce the number of workers, just start it from gdb (i.e., the parent process).

--
Ludovic Marcotte
lmarco...@inverse.ca :: +1.514.755.3630 :: http://inverse.ca
Inverse inc. :: Leaders behind SOGo (http://sogo.nu), PacketFence (http://packetfence.org) and Fingerbank (http://fingerbank.org)
Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
I am no closer to finding the issue, but I am getting this message pretty consistently:

*** stack smashing detected ***: /usr/sbin/sogod terminated

But no core file. How do I get a core dump?
I tried setting ulimit and /proc/sys/kernel/core_pattern:

root@mail01:~# ulimit -c unlimited
root@mail01:~# echo /var/log/sogocore > /proc/sys/kernel/core_pattern
root@mail01:~# ulimit -Sc unlimited
root@mail01:~# su sogo
sogo@mail01:/root$ ulimit -Sc unlimited
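A few things commonly stand between those commands and an actual core file. A `ulimit -c` set in an interactive shell only applies to that shell and the processes it starts, not to a daemon launched elsewhere. A `core_pattern` that begins with `|` hands the dump to a helper program instead of writing a file. An absolute path such as `/var/log/sogocore` must be writable by the user the crashing process runs as, and `/var/log` is typically writable only by root. The sketch below only classifies a `core_pattern` value so the right follow-up check is obvious; `classify_core_pattern` is a hypothetical helper name, not part of SOGo or any distribution.

```shell
#!/bin/sh
# Hypothetical helper: classify a kernel core_pattern value.
#   pipe     -> the kernel pipes the dump to a program (pattern starts with '|')
#   absolute -> the dump is written to that path; the process user must be
#               able to write to the containing directory
#   relative -> the dump is written relative to the crashing process's
#               current working directory
classify_core_pattern() {
    case "$1" in
        '|'*) echo pipe ;;
        /*)   echo absolute ;;
        *)    echo relative ;;
    esac
}
```

On the affected host it could be called as `classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"`. If sogod changes user at startup (an assumption about this setup), the value of `/proc/sys/fs/suid_dumpable` may also be worth checking.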
Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
On 17-01-09 04:13 PM, Ludovic Marcotte (lmarco...@inverse.ca) wrote:
> Check if your "sogo" user isn't running out of file descriptors for the master sogod process. Do a "su - sogo" and run "ulimit -n".

I did have errors earlier that hinted at running out of file descriptors, but I thought I solved it.

root@mail01:~# ulimit -Sn
65000
root@mail01:~# su - sogo
sogo@mail01:~$ ulimit -n
823000
sogo@mail01:~$ ulimit -Sn
823000
sogo@mail01:~$ ulimit -Hn
823000

Btw, I am testing with version 3.2.4 (build @shiva.inverse 201612290757). I will try running with the latest.

Thanks,
Greg
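The limits shown in a `su - sogo` login shell are not necessarily the ones the running daemon inherited, since a service started by an init script or service manager may get different values. On Linux, the effective limits of a live process can be read from `/proc/<pid>/limits`. The sketch below extracts the soft "Max open files" value from such a file; `soft_nofile` and `report_sogod_limits` are hypothetical helper names, and the use of `pgrep -x sogod` assumes the process is named `sogod`, as in the logs in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the soft "Max open files" limit from a
# /proc/<pid>/limits style file.
# $1 = path to the limits file, for example /proc/1234/limits
soft_nofile() {
    # Fields on that line: Max open files <soft> <hard> files
    awk '/^Max open files/ { print $4 }' "$1"
}

# Hypothetical helper: report the soft limit for every running sogod.
# Defined here but not called; run it on the affected host.
report_sogod_limits() {
    for pid in $(pgrep -x sogod); do
        echo "$pid $(soft_nofile "/proc/$pid/limits")"
    done
}
```

Comparing the value for the master process with the `ulimit -n` output above would show whether the daemon actually runs with the raised limit.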
Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.
On 2017-01-09 3:45 PM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:
> We are hoping to support several thousand push email users using sogo. I am trying to push the worker count higher.
> Currently I am not testing any clients yet, just trying to get sogo to start with a large number of worker processes and sit idle.
>
> At 400 processes, they take 27MB (RES) each and consume no CPU (with no clients connected). That's 10GB RAM used out of 80GB total. Everything is great.
>
> Somewhere above 400, there is a problem. If I run 600, they keep growing and all keep consuming CPU.
> The log looks clean to me, all processes print "notified the watchdog that we are ready" and then no more messages.
> I set all debug options to yes in the conf file and still nothing. Zero messages in mysql error log.
> Load average goes above 300 and within 1min all memory and swap are consumed and the server becomes unresponsive.
>
> Is there any way to turn on more debugging? Any other suggestions?

Check if your "sogo" user isn't running out of file descriptors for the master sogod process. Do a "su - sogo" and run "ulimit -n".

> What average memory usage per worker should I budget for? (27MB < ? < 325MB)

It depends on your SxVMemLimit value. I've just pushed in commits 841fdb96cc7b30804d0f9917af4a58a19bf8091c and 5e775ea4ceec6c64c3f7e8dc345c1be91e2e22e4 a better output upon startup:

Jan 09 22:08:05 sogod [31158]: version 3.2.4 (build r...@sogo.example.com 201701091607) -- starting
Jan 09 22:08:05 sogod [31158]: vmem size check enabled: shutting down app when vmem > 384 MB. *Currently at 182 MB*
Jan 09 22:08:05 sogod [31158]: <0x0x55a153a0[SOGoProductLoader]> SOGo products loaded from '/usr/local/lib/GNUstep/SOGo':
Jan 09 22:08:05 sogod [31158]: <0x0x55a153a0[SOGoProductLoader]> MainUI.SOGo, Appointments.SOGo, Contacts.SOGo, Mailer.SOGo, ActiveSync.SOGo, MailPartViewers.SOGo, ContactsUI.SOGo, CommonUI.SOGo, PreferencesUI.SOGo, MailerUI.SOGo, SchedulerUI.SOGo, AdministrationUI.SOGo
*Jan 09 22:08:05 sogod [31158]: All products loaded - current memory usage at 238 MB*
Jan 09 22:08:05 sogod [31158]: |SOGo| WOHttpAdaptor listening on address *:2

So at 238 MB with a SxVMemLimit set at 384 MB, each sogod worker can consume about 146 MB of RAM before getting restarted. Of course, some can consume more than that for a few seconds (while generating huge responses, for example).

Thanks,

--
Ludovic Marcotte
lmarco...@inverse.ca :: +1.514.755.3630 :: http://inverse.ca
Inverse inc. :: Leaders behind SOGo (http://sogo.nu), PacketFence (http://packetfence.org) and Fingerbank (http://fingerbank.org)
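The headroom figure follows directly from the two numbers in the startup log: the 384 MB limit minus the 238 MB already in use once all products are loaded. A rough way to redo that arithmetic for other settings is sketched below. `headroom_mb` and `worst_case_mb` are hypothetical helper names, not SOGo tools, and because the check is on vmem (virtual memory), the worst-case total is a loose ceiling rather than a prediction of resident usage.

```shell
#!/bin/sh
# Hypothetical helpers for budgeting worker memory; values are in MB.

# Headroom a worker has before the vmem check restarts it.
# $1 = SxVMemLimit, $2 = usage reported after all products are loaded
headroom_mb() {
    echo $(( $1 - $2 ))
}

# Ceiling if every worker sat at the limit at the same time.
# $1 = worker count, $2 = SxVMemLimit
worst_case_mb() {
    echo $(( $1 * $2 ))
}

headroom_mb 384 238      # values from the startup log in this message
worst_case_mb 600 384    # 600 workers, the count discussed in this thread
```

With the numbers from this thread, the first call gives the 146 MB mentioned in the reply, and the second shows that 600 workers at the limit would far exceed the 80GB of RAM on the server described in the original question.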