I recently upgrade from 1.3.12 to 1.3.15 on debian stable (with no problems during the upgrade). About 20 hours laser, SOGo crashed, leaving a dead process that I had to manually kill (-9) the sogod process and restart sogo. The logs show:

localhost - - [18/May/2012:08:50:41 GMT] "GET /SOGo/so/user/Mail/0/folderDrafts/newDraft1337352640-1//edit HTTP/1.1" 200 357811/0 0.728 - - 0 localhost - - [18/May/2012:08:50:49 GMT] "POST /SOGo/so/user/Mail//0/folderINBOX/uids HTTP/1.1" 200 3441/44 0.269 - - 0 localhost - - [18/May/2012:08:50:49 GMT] "POST /SOGo/so/user/Mail//0/folderINBOX/unseenCount HTTP/1.1" 200 13/0 0.246 - - 0 May 18 08:52:02 sogod [26185]: <<0x01CF07C0[SOGoContactGCSFolder]:personal>>D fetched 1243 records. 2012-05-18 08:52:02.647 sogod[26185] WARNING(-[NSNull(misc) length]): called NSNull -length (returns 0) !!! C: connect to <0x0x2809ba0[NGInternetSocketAddress]: host=localhost not-filled>
S: <SMTP-Reply: code=220 line='mail.example.com ESMTP Postfix (Debian/GNU)'>
C: EHLO localhost
S: <SMTP-Reply: code=250 line='DSN'>
S: pipelining extension supported.
S: size extension supported.
C: MAIL FROM:<[email protected]>
C: RCPT TO:<[email protected]>
C: DATA
S: <SMTP-Reply: code=354 line='End data with <CR><LF>.<CR><LF>'>
C: data(45503 bytes) ..
S: <SMTP-Reply: code=250 line='2.0.0 Ok: queued as B242F49B2E'>
C: QUIT
S: <SMTP-Reply: code=221 line='2.0.0 Bye'>
May 18 08:53:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 1 minutes May 18 08:54:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 2 minutes May 18 08:55:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 3 minutes May 18 08:56:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 4 minutes May 18 08:57:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 5 minutes May 18 08:58:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 6 minutes May 18 08:59:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 7 minutes May 18 09:00:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 8 minutes May 18 09:01:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26185 has been hanging in the same request for 9 minutes May 18 09:02:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> safety belt -- sending KILL signal to pid 26185 May 18 09:02:02 sogod [26167]: <0x0x16409b0[WOWatchDogChild]> child 26185 exited May 18 09:02:02 sogod [26167]: <0x0x16409b0[WOWatchDogChild]> (terminated due to signal 9) May 18 09:02:02 sogod [26167]: <0x0x16ecda0[WOWatchDog]> child spawned with pid 25729 2012-05-18 09:02:02.265 sogod[25729] Note: Using UTF-8 as URL encoding in NGExtensions. May 18 09:02:02 sogod [25729]: <0x0x1914070[SOGoCache]> Cache cleanup interval set every 300.000000 seconds May 18 09:02:02 sogod [25729]: <0x0x1914070[SOGoCache]> Using host(s) 'localhost' as server(s) 2012-05-18 09:02:02.291 sogod[25729] Note(SoObject): SoDebugKeyLookup is enabled! 2012-05-18 09:02:02.291 sogod[25729] Note(SoObject): SoDebugBaseURL is enabled! 2012-05-18 09:02:02.291 sogod[25729] Note(SoObject): relative base URLs are enabled. 2012-05-18 09:02:02.296 sogod[25729] WARNING: IMAP4 connection pooling is disabled! 2012-05-18 09:02:02.333 sogod[25729] Note(NGImap4Connection): using '/' as the IMAP4 folder separator. May 18 09:03:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 25729 has been hanging in the same request for 1 minutes May 18 09:04:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 25729 has been hanging in the same request for 2 minutes

that loop of a 10 minute timeout and then sending the kill signal happened a few times until I tried to restart sogo with "/etc/init.d/sogo restart" which then gave me:


May 18 09:27:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26562 has been hanging in the same request for 5 minutes May 18 09:28:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26562 has been hanging in the same request for 6 minutes May 18 09:29:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26562 has been hanging in the same request for 7 minutes May 18 09:30:02 sogod [26167]: [WARN] <0x0x16409b0[WOWatchDogChild]> pid 26562 has been hanging in the same request for 8 minutes May 18 09:30:27 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:27 sogod [26167]: <0x0x16409b0[WOWatchDogChild]> sending terminate signal to pid 26562 May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM May 18 09:30:28 sogod [26167]: <0x0x16ecda0[WOWatchDog]> Terminating with SIGINT or SIGTERM


and that repeated until I killed sogod.

Could this be related to the sending email attachments bug in 1.3.15? Anything I can do to help debug what is going on? Any ideas why the watchdog child restart doesn't clear the problem? After I restarted sogo, everything is working for now...

Thanks,

Ben
--
[email protected]
https://inverse.ca/sogo/lists

Reply via email to