Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-02-03 Thread Greg Kunyavsky

On 17-01-26 07:26 PM, Ludovic Marcotte (lmarco...@inverse.ca) wrote:


On 2017-01-26 6:33 PM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:

I followed the FAQ to install -dbg packages and run GDB.  I only had 
to change -WOUseWatchDog to YES, otherwise only 1 worker runs instead 
of 600.  Unfortunately the stack trace does not look very useful.  
Any suggestions? 

Install all debugging symbols (i.e., the -dbg packages).


I think I did install the recommended debug symbols.  Am I missing any?

root@mail01:~# dpkg -l | grep -e "-dbg"
ii  libc6-dbg:amd64          2.23-0ubuntu5           amd64  GNU C Library: detached debugging symbols
ii  libgcc1-dbg:amd64        1:6.0.1-0ubuntu1        amd64  GCC support library (debug symbols)
ii  libgnustep-base1.24-dbg  1.24.7-1build2          amd64  GNUstep Base library - debugging symbols
ii  libobjc4-dbg:amd64       5.4.0-6ubuntu1~16.04.4  amd64  Runtime library for GNU Objective-C applications (debug symbols)
ii  sogo-dbg                 3.2.5.20170123-1        amd64  a modern and scalable groupware - debugging symbols
ii  sope4.9-dbg              4.9.r1664.20170114      amd64  Debugging files
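To double-check a listing like the one above, here is a minimal shell sketch that reads `dpkg -l` output on stdin and prints every wanted package that is not in the `ii` (installed) state. The function name `missing_dbg` is hypothetical, and the package names in the usage comment are simply the six from this message, not an authoritative list from the SOGo FAQ.

```shell
#!/bin/sh
# missing_dbg (hypothetical helper): read `dpkg -l` output on stdin and
# print each wanted package (given as arguments) that is not listed
# with status "ii".
missing_dbg() {
    listing=$(cat)
    for pkg in "$@"; do
        # Column 2 of `dpkg -l` is the package name, optionally with an
        # ":arch" suffix (e.g. libc6-dbg:amd64); strip it before comparing.
        if ! printf '%s\n' "$listing" |
            awk -v want="$pkg" '
                $1 == "ii" { sub(/:.*/, "", $2); if ($2 == want) found = 1 }
                END { exit !found }'; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Usage (on the affected host):
#   dpkg -l | missing_dbg libc6-dbg libgcc1-dbg libgnustep-base1.24-dbg \
#       libobjc4-dbg sogo-dbg sope4.9-dbg
```

No output means every named package is installed. That only confirms the packages are present; it does not by itself show that gdb found the symbols.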


--
users@sogo.nu
https://inverse.ca/sogo/lists


Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-01-26 Thread Ludovic Marcotte

On 2017-01-26 6:33 PM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:

I followed the FAQ to install -dbg packages and run GDB.  I only had 
to change -WOUseWatchDog to YES, otherwise only 1 worker runs instead 
of 600.  Unfortunately the stack trace does not look very useful.  Any 
suggestions? 

Install all debugging symbols (i.e., the -dbg packages).

--
Ludovic Marcotte
lmarco...@inverse.ca  ::  +1.514.755.3630  ::  http://inverse.ca
Inverse inc. :: Leaders behind SOGo (http://sogo.nu), PacketFence 
(http://packetfence.org) and Fingerbank (http://fingerbank.org)



Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-01-26 Thread Greg Kunyavsky

On 17-01-24 07:14 AM, Ludovic Marcotte (lmarco...@inverse.ca) wrote:


On 2017-01-24 1:42 AM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:

I am no closer to finding the issue, but I am getting this message 
pretty consistently


*** stack smashing detected ***: /usr/sbin/sogod terminated

But no core file. 

Try starting SOGo from gdb - so don't reduce the number of workers, 
just start it from gdb (i.e., the parent process).


I followed the FAQ to install -dbg packages and run GDB.  I only had to 
change -WOUseWatchDog to YES, otherwise only 1 worker runs instead of 
600.  Unfortunately the stack trace does not look very useful.  Any 
suggestions?


(gdb) r
Starting program: /usr/sbin/sogod -WOUseWatchDog YES -WONoDetach YES -WOPort 2 -WOWorkersCount 600 -WOLogFile /var/log/sogo/sogo-crash.log -WOPidFile /tmp/sogo.pid

^CQuit
(gdb) b [NSException raise]
Function "[NSException raise]" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 ([NSException raise]) pending.
(gdb) b abort
Breakpoint 2 at 0xa7e0
(gdb) c
Continuing.
*** stack smashing detected ***: /usr/sbin/sogod terminated

Program received signal SIGABRT, Aborted.
0x74448428 in ?? ()
(gdb) bt
#0  0x74448428 in ?? ()
#1  0x7444a02a in ?? ()
#2  0x0020 in ?? ()
#3  0x in ?? ()
(gdb) bt full
#0  0x74448428 in ?? ()
No symbol table info available.
#1  0x7444a02a in ?? ()
No symbol table info available.
#2  0x0020 in ?? ()
No symbol table info available.
#3  0x in ?? ()
No symbol table info available.
(gdb)
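The `?? ()` frames mean gdb could not map those addresses to any symbol. One way to see why, sketched below, is to script the session so that it also prints `info sharedlibrary`, which lists each loaded library and whether its symbols were read. This is only a sketch using standard gdb commands. The helper name `write_gdb_script` and the path `/tmp/sogo.gdb` are hypothetical, and the sogod arguments should be the same ones used in the session above.

```shell
#!/bin/sh
# write_gdb_script (hypothetical helper): write a gdb command file to
# the path given as $1.
write_gdb_script() {
    cat > "$1" <<'EOF'
set pagination off
break abort
run
info sharedlibrary
bt full
thread apply all bt
EOF
}

# Usage (on the affected host; append the remaining sogod arguments
# from the session above):
#   write_gdb_script /tmp/sogo.gdb
#   gdb -batch -x /tmp/sogo.gdb --args /usr/sbin/sogod -WOUseWatchDog YES \
#       -WONoDetach YES -WOWorkersCount 600
```

Running in batch mode keeps the whole transcript in one place, which makes it easier to paste into a reply on the list.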



Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-01-24 Thread Ludovic Marcotte

On 2017-01-24 1:42 AM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:

I am no closer to finding the issue, but I am getting this message 
pretty consistently


*** stack smashing detected ***: /usr/sbin/sogod terminated

But no core file. 

Try starting SOGo from gdb - so don't reduce the number of workers, just 
start it from gdb (i.e., the parent process).




Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-01-24 Thread Greg Kunyavsky

I am no closer to finding the issue, but I am getting this message 
pretty consistently


*** stack smashing detected ***: /usr/sbin/sogod terminated

But no core file.  How do I get a core dump?

I tried setting ulimit and /proc/sys/kernel/core_pattern

root@mail01:~# ulimit -c unlimited
root@mail01:~# echo /var/log/sogocore > /proc/sys/kernel/core_pattern

root@mail01:~# ulimit -Sc
unlimited
root@mail01:~# su sogo
sogo@mail01:/root$ ulimit -Sc
unlimited
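Two things may be worth ruling out here, offered as guesses rather than a diagnosis: the kernel writes the core file with the privileges of the crashing process, so the target directory has to be writable by that user, and a `ulimit -c` set in an interactive shell does not apply to a daemon started elsewhere. The sketch below classifies a `core_pattern` value and checks whether the current user can write to its directory. The function name `check_core_target` is hypothetical.

```shell
#!/bin/sh
# check_core_target (hypothetical helper): report how the kernel will
# treat a core_pattern value and whether the current user can write to
# the target directory.
check_core_target() {
    case $1 in
        '|'*)
            # A leading "|" means the core is piped to a helper program.
            echo "piped to a helper program" ;;
        /*)
            dir=$(dirname "$1")
            if [ -d "$dir" ] && [ -w "$dir" ]; then
                echo "file in $dir (writable by $(id -un))"
            else
                echo "file in $dir (NOT writable by $(id -un))"
            fi ;;
        *)
            echo "file relative to the crashing process's working directory" ;;
    esac
}

# Usage, run as the user the daemon runs as:
#   check_core_target "$(cat /proc/sys/kernel/core_pattern)"
```

If the daemon changes credentials after startup, the `fs.suid_dumpable` sysctl can also suppress core dumps; that is a separate check not covered by this sketch.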




Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-01-09 Thread Greg Kunyavsky

On 17-01-09 04:13 PM, Ludovic Marcotte (lmarco...@inverse.ca) wrote:

Check if your "sogo" user isn't running out of file descriptors for 
the master sogod process. Do a "su - sogo" and run "ulimit -n".

I did have errors earlier that hinted at running out of file 
descriptors, but I thought I solved it.

root@mail01:~# ulimit -Sn 65000
root@mail01:~# su - sogo
sogo@mail01:~$ ulimit -n
823000
sogo@mail01:~$ ulimit -Sn
823000
sogo@mail01:~$ ulimit -Hn
823000
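The values above are the limits of a fresh login shell. The running sogod processes may have been started with different limits, so it can help to read them from `/proc` directly. A minimal sketch, assuming a Linux `/proc` layout; the function name `fd_usage` and its optional second argument are hypothetical.

```shell
#!/bin/sh
# fd_usage PID [PROCROOT] (hypothetical helper): print
# "<open fds> <soft limit>" for a process. PROCROOT defaults to /proc;
# the parameter exists only so the function can be tried against a
# fake directory tree.
fd_usage() {
    root=${2:-/proc}
    # Each entry in <root>/<pid>/fd is one open file descriptor.
    open=$(( $(ls "$root/$1/fd" | wc -l) ))
    # In /proc/<pid>/limits the "Max open files" row carries the soft
    # limit in the 4th whitespace-separated field.
    limit=$(awk '/^Max open files/ { print $4 }' "$root/$1/limits")
    echo "$open $limit"
}

# Usage (reading another user's /proc/<pid>/fd generally needs root):
#   for pid in $(pgrep -x sogod); do echo "$pid: $(fd_usage "$pid")"; done
```

A process whose first number approaches the second is close to exhausting its descriptors, regardless of what `ulimit -n` reports in a shell.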

Btw, I am testing with version 3.2.4 (build @shiva.inverse 201612290757)
I will try running with the latest.

Thanks,
Greg


Re: [SOGo] At WOWorkersCount 600 keep consuming CPU and RAM with no clients connected.

2017-01-09 Thread Ludovic Marcotte

On 2017-01-09 3:45 PM, Greg Kunyavsky (gr...@kgbconsulting.ca) wrote:

We are hoping to support several thousand push email users using 
sogo.  I am trying to push the worker count higher. Currently I am not 
testing any clients yet, just trying to get sogo to start with a large 
number of worker processes and sit idle.  At 400 processes, they take 
27MB (RES) each and consume no CPU (with no clients connected).  
That's 10GB RAM used out of 80GB total.  Everything is great.
Somewhere above 400, there is a problem. If I run 600, they keep 
growing and all keep consuming CPU.  The log looks clean to me,  all 
processes print "notified the watchdog that we are ready" and then no 
more messages. I set all debug options to yes in the conf file and 
still nothing.  Zero messages in mysql error log. Load average goes 
above 300 and within 1min all memory and swap are consumed and the 
server becomes unresponsive.


Is there any way to turn on more debugging?  Any other suggestions?

Check if your "sogo" user isn't running out of file descriptors for the 
master sogod process. Do a "su - sogo" and run "ulimit -n".

What average memory usage per worker should I budget for?  (27MB < ? 
< 325MB)

It depends on your SxVMemLimit value. I've just pushed commits 
841fdb96cc7b30804d0f9917af4a58a19bf8091c and 
5e775ea4ceec6c64c3f7e8dc345c1be91e2e22e4, which print more useful 
output upon startup:


Jan 09 22:08:05 sogod [31158]: version 3.2.4 (build r...@sogo.example.com 201701091607) -- starting
Jan 09 22:08:05 sogod [31158]: vmem size check enabled: shutting down app when vmem > 384 MB. Currently at 182 MB
Jan 09 22:08:05 sogod [31158]: <0x0x55a153a0[SOGoProductLoader]> SOGo products loaded from '/usr/local/lib/GNUstep/SOGo':
Jan 09 22:08:05 sogod [31158]: <0x0x55a153a0[SOGoProductLoader]>   MainUI.SOGo, Appointments.SOGo, Contacts.SOGo, Mailer.SOGo, ActiveSync.SOGo, MailPartViewers.SOGo, ContactsUI.SOGo, CommonUI.SOGo, PreferencesUI.SOGo, MailerUI.SOGo, SchedulerUI.SOGo, AdministrationUI.SOGo
Jan 09 22:08:05 sogod [31158]: All products loaded - current memory usage at 238 MB
Jan 09 22:08:05 sogod [31158]: |SOGo| WOHttpAdaptor listening on address *:2


So at 238 MB with SxVMemLimit set at 384 MB, each sogod worker can 
consume about 146 MB more before getting restarted. Of course, some can 
consume more than that for a few seconds (while generating huge 
responses, for example).
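The arithmetic above can be written out as a small shell sketch. The 384 MB and 238 MB figures come from the startup log in this message and the worker count of 600 from the original question. The variable names are made up for illustration, and the baseline on another build or host may differ.

```shell
#!/bin/sh
# All values in MB.
vmem_limit=384   # "shutting down app when vmem > 384 MB"
after_load=238   # "All products loaded - current memory usage at 238 MB"
workers=600      # WOWorkersCount from the original question

# Room each worker has before the vmem check restarts it.
headroom=$((vmem_limit - after_load))

# Upper bound if every worker sat right at the limit at the same time.
worst_case_total=$((workers * vmem_limit))

echo "headroom per worker: ${headroom} MB"
echo "worst case for ${workers} workers: ${worst_case_total} MB"
```

This prints a headroom of 146 MB and a worst case of 230400 MB (225 GB). The check is on virtual memory, so resident usage is normally lower, as the 27MB (RES) per idle worker reported in the question suggests. Treat the total as a ceiling rather than a budget.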


Thanks,

--
Ludovic Marcotte