Re: [SOGo] sogo segfaults, and then: systemd: "Found left-over process xxxx (sogod) in control group while starting unit. Ignoring" -> sogo down

2020-11-15 Thread mj

Hi,

Since the below happened at log rotation time, I looked at that a bit
more carefully on the server, and discovered a difference between the
stretch and buster logrotate configs.


Because I had manually edited our /etc/logrotate.d/rsyslog, the
dist-upgrade had not replaced our file. See
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900586
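
For anyone wanting to check for the same situation: when dpkg keeps a locally modified conffile, it normally saves the packaged version next to it with a suffix such as .dpkg-dist. A minimal sketch that looks for such siblings; the function name find_dpkg_leftovers is made up for illustration:

```shell
# Print any packaged or backup copies that dpkg left next to a conffile.
# find_dpkg_leftovers is a hypothetical helper name, not a dpkg tool.
find_dpkg_leftovers() {
    conffile=$1
    for suffix in dpkg-dist dpkg-new dpkg-old; do
        if [ -e "$conffile.$suffix" ]; then
            printf '%s\n' "$conffile.$suffix"
        fi
    done
}

# Example (path taken from this thread):
#   find_dpkg_leftovers /etc/logrotate.d/rsyslog
```

If a .dpkg-dist file turns up, running diff -u on it and the local file should show what the upgrade would have changed.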


I hope this has an effect on the observed SOGo behaviour.

I will follow up next Sunday to let the archives know whether it
helped. It is still strange that this results in segfaults in sogo.


Regards,

MJ

On 11/15/20 11:43 AM, mj (li...@merit.unu.edu) wrote:

Hi,

We are running sogo (5.0.1-1) on Debian x64, upgraded from stretch to
buster. We've noticed serious-looking issues on two Saturday nights in a
row, when cron restarts sogo (I guess after log rotation).


It starts off with some segfaults:

Nov 15 00:00:27 server kernel: [552373.802573] sogod[23629]: segfault 
at 7ffe84ab0583 ip 7f9e51069201 sp 7ffe04ab0488 error 4 in 
libc-2.28.so[7f9e50fe9000+148000]
Nov 15 00:00:27 server kernel: [552373.802587] Code: f0 0f 10 5c 16 e0 
0f 11 07 0f 11 4f 10 0f 11 54 17 f0 0f 11 5c 17 e0 c3 48 39 f7 0f 87 
8c 00 00 00 0f 84 28 ff ff ff 0f 10 26 <0f> 10 6c 16 f0 0f 10 74 16 e0 
0f 10 7c 16 d0 44 0f 10 44 16 c0 49
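
As a reading aid for the kernel line above: the faulting instruction's offset inside libc is the ip value minus the mapping base shown in brackets, and on x86 "error 4" denotes a user-mode read of a page that is not mapped. A small sketch of that subtraction; the helper name segv_offset is an assumption for illustration only:

```shell
# Offset of the faulting instruction inside the mapped library:
#   offset = ip - mapping base
# Both arguments are hex without the 0x prefix, as printed by the kernel.
# segv_offset is a hypothetical helper name.
segv_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

segv_offset 7f9e51069201 7f9e50fe9000   # prints 0x80201
```

With debug symbols for that exact libc build installed, such an offset can usually be resolved to a function name, which may help narrow down where sogod crashes.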


Next, sogo is stopped:


Nov 15 00:00:27 server sogo[2019]: Stopping SOGo: sogo.
Nov 15 00:00:27 server systemd[1]: sogo.service: Succeeded.
Nov 15 00:00:27 server systemd[1]: Stopped LSB: SOGo server.
Nov 15 00:00:27 server systemd[1]: sogo.service: Found left-over 
process 23446 (sogod) in control group while starting unit. Ignoring.
Nov 15 00:00:27 server systemd[1]: This usually indicates unclean 
termination of a previous run, or service implementation deficiencies.


And then sogo is restarted:

Nov 15 00:00:27 server systemd[1]: Starting LSB: SOGo server...


However, this last start fails, and SOGo is no longer running
afterwards.


The full syslog output is here:
https://pastebin.com/344eNHAx

I think the sogo.service in use is a generated one:


root@server:/proc/1# cat /run/systemd/generator.late/sogo.service
# Automatically generated by systemd-sysv-generator

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/init.d/sogo
Description=LSB: SOGo server
Before=multi-user.target
Before=multi-user.target
Before=multi-user.target
Before=graphical.target
After=remote-fs.target
After=network-online.target
Wants=network-online.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
SuccessExitStatus=5 6
ExecStart=/etc/init.d/sogo start
ExecStop=/etc/init.d/sogo stop
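
Two settings in this generated unit may be relevant to the "left-over process" message. According to systemd.kill(5), KillMode=process makes systemd kill only the main process on stop, and with GuessMainPID=no systemd does not try to guess which process that is. Any sogod worker that the init script fails to stop can therefore stay behind in the control group. A minimal sketch for pulling such settings out of a unit file; the helper name unit_setting is hypothetical:

```shell
# Print the value(s) of one setting from a unit file.
# unit_setting is a hypothetical helper name; usage: unit_setting KEY FILE
unit_setting() {
    sed -n "s/^$1=//p" "$2"
}

# Example (path taken from this thread):
#   unit_setting KillMode /run/systemd/generator.late/sogo.service
```

On a live system, systemctl show -p KillMode sogo.service reports the effective value, and systemd-cgls -u sogo.service lists the processes still in the unit's control group.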


Anyone with an idea?

MJ

--
users@sogo.nu
https://inverse.ca/sogo/lists


[SOGo] sogo segfaults, and then: systemd: "Found left-over process xxxx (sogod) in control group while starting unit. Ignoring" -> sogo down

2020-11-15 Thread mj

Hi,

We are running sogo (5.0.1-1) on Debian x64, upgraded from stretch to
buster. We've noticed serious-looking issues on two Saturday nights in a
row, when cron restarts sogo (I guess after log rotation).


It starts off with some segfaults:


Nov 15 00:00:27 server kernel: [552373.802573] sogod[23629]: segfault at 
7ffe84ab0583 ip 7f9e51069201 sp 7ffe04ab0488 error 4 in 
libc-2.28.so[7f9e50fe9000+148000]
Nov 15 00:00:27 server kernel: [552373.802587] Code: f0 0f 10 5c 16 e0 0f 11 07 0f 11 
4f 10 0f 11 54 17 f0 0f 11 5c 17 e0 c3 48 39 f7 0f 87 8c 00 00 00 0f 84 28 ff ff ff 
0f 10 26 <0f> 10 6c 16 f0 0f 10 74 16 e0 0f 10 7c 16 d0 44 0f 10 44 16 c0 49


Next, sogo is stopped:


Nov 15 00:00:27 server sogo[2019]: Stopping SOGo: sogo.
Nov 15 00:00:27 server systemd[1]: sogo.service: Succeeded.
Nov 15 00:00:27 server systemd[1]: Stopped LSB: SOGo server.
Nov 15 00:00:27 server systemd[1]: sogo.service: Found left-over process 23446 
(sogod) in control group while starting unit. Ignoring.
Nov 15 00:00:27 server systemd[1]: This usually indicates unclean termination 
of a previous run, or service implementation deficiencies.


And then sogo is restarted:

Nov 15 00:00:27 server systemd[1]: Starting LSB: SOGo server...


However, this last start fails, and SOGo is no longer running
afterwards.


The full syslog output is here:
https://pastebin.com/344eNHAx

I think the sogo.service in use is a generated one:


root@server:/proc/1# cat /run/systemd/generator.late/sogo.service
# Automatically generated by systemd-sysv-generator

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/init.d/sogo
Description=LSB: SOGo server
Before=multi-user.target
Before=multi-user.target
Before=multi-user.target
Before=graphical.target
After=remote-fs.target
After=network-online.target
Wants=network-online.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
SuccessExitStatus=5 6
ExecStart=/etc/init.d/sogo start
ExecStop=/etc/init.d/sogo stop


Anyone with an idea?

MJ