Re: [SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
On 13-08-27 7:30 PM, Hannes Erven wrote:
> * i still got segfaults when starting from the init script, while it ran fine from the sogo-user's command prompt and via gdb. Root cause: the init script created /var/run/sogod.pid, but sogo expected it to be /var/run/sogo/sogod.pid .

Hi Hannes,

glad you got this sorted out. I just tested this and sogod does segfault if it cannot write its pidfile... I'll fix that.

But you should have seen an error message in sogo.log:

  2013-08-28 09:16:24.469 sogod[2359] File NSData.m: 1425. In -[NSData writeToFile:options:error:] Open (/nonexistent/test/sogo.pid) failed - No such file or directory

Did this somehow get lost on gentoo?

--
users@sogo.nu
https://inverse.ca/sogo/lists
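One way to look for the failure Jean mentions is to scan sogo.log for the NSData write error. The following is a minimal Python sketch, not part of SOGo: the regular expression is inferred from the single log line quoted above (an assumption, not a documented log format), and the function name find_write_failures is made up for illustration.

```python
import re

# Pattern inferred from the one log line quoted in this thread; it is an
# assumption, not an official description of the sogo.log format.
WRITE_FAILURE = re.compile(
    r"In -\[NSData writeToFile:options:error:\] "
    r"Open \((?P<path>[^)]*)\) failed - (?P<reason>.*)$"
)


def find_write_failures(lines):
    """Return (path, reason) pairs for each NSData write failure in lines."""
    failures = []
    for line in lines:
        match = WRITE_FAILURE.search(line.rstrip("\n"))
        if match:
            failures.append((match.group("path"), match.group("reason")))
    return failures


# The log line quoted in the message above, used as sample input.
sample = [
    "2013-08-28 09:16:24.469 sogod[2359] File NSData.m: 1425. "
    "In -[NSData writeToFile:options:error:] "
    "Open (/nonexistent/test/sogo.pid) failed - No such file or directory",
]
print(find_write_failures(sample))
```

To run it against a real log, pass an open file object instead of the sample list; the location of sogo.log depends on the installation and is not stated in this thread.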
Re: [SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
On 13-08-28 9:17 AM, Jean Raby wrote:
> On 13-08-27 7:30 PM, Hannes Erven wrote:
>> * i still got segfaults when starting from the init script, while it ran fine from the sogo-user's command prompt and via gdb. Root cause: the init script created /var/run/sogod.pid, but sogo expected it to be /var/run/sogo/sogod.pid .
>
> Hi Hannes,
>
> glad you got this sorted out. I just tested this and sogod does segfault if it cannot write its pidfile... I'll fix that.

Depressing fix ;-)
https://github.com/inverse-inc/sope/commit/3dcc9b1459bd8a3ab626f918dea5576defb86eea
Re: [SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
Hi Jean,

> I just tested this and sogod does segfault if it cannot write its pidfile... I'll fix that.
>
> But you should have seen an error message in sogo.log:
>
>   2013-08-28 09:16:24.469 sogod[2359] File NSData.m: 1425. In -[NSData writeToFile:options:error:] Open (/nonexistent/test/sogo.pid) failed - No such file or directory
>
> Did this somehow get lost on gentoo?

It did show up in the logfile, hence it was quite easy to fix once spotted. However, as you agree, it is quite surprising that this results in a segfault and it did confuse me -- having dealt with segfaults without any trace in a logfile for days, that wasn't one of the first places I looked...

Thanks again, best regards

-hannes
Re: [SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
On 13-08-26 6:34 PM, Hannes Erven wrote:
> Hi Jean,
>
> thank you for your help and sorry for not strictly following the debug guide in the first place :-/
>
> Here's the new GDB transcript - unfortunately it is quite the same:
>
> $ gdb --args /usr/sbin/sogod -WOUseWatchDog NO -WONoDetach YES -WOPort 2 -WOWorkersCount 1 -WOLogFile - -WOPidFile /tmp/sogo.pid
> GNU gdb (Gentoo 7.5.1 p2) 7.5.1
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type show copying
> and show warranty for details.
> This GDB was configured as i686-pc-linux-gnu.
> For bug reporting instructions, please see:
> http://bugs.gentoo.org/...
> Reading symbols from /usr/sbin/sogod...(no debugging symbols found)...done.
> (gdb) b [NSException raise]
> Function [NSException raise] not defined.
> Make breakpoint pending on future shared library load? (y or [n]) y
> Breakpoint 1 ([NSException raise]) pending.
> (gdb) b abort
> Breakpoint 2 at 0x2af0
> (gdb) r
> Starting program: /usr/sbin/sogod -WOUseWatchDog NO -WONoDetach YES -WOPort 2 -WOWorkersCount 1 -WOLogFile - -WOPidFile /tmp/sogo.pid
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need set solib-search-path or set sysroot?
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library /lib/libthread_db.so.1.
> recursion encountered handling uncaught exception
>
> Breakpoint 2, 0xb7286ad5 in abort () from /lib/libc.so.6
> (gdb) bt
> #0 0xb7286ad5 in abort () from /lib/libc.so.6
> #1 0xb759ba14 in ?? () from /usr/lib/libgnustep-base.so.1.24
> #2 0xb740bde8 in objc_msg_lookup () from /usr/lib/gcc/i686-pc-linux-gnu/4.5.4/libobjc.so.2
> #3 0x80002ccc in main ()
> (gdb)
>
> This time there was no entry in dmesg.
>
> These binaries were built using the gentoo ebuilds, so perhaps the debug wasn't correctly honoured?
> Shall I manually rebuild the sources from scratch to improve chances to a better stacktrace?

We may get a better trace if you had debugging symbols for gnustep-base, but it would be even better if you could build sope + sogo with debugging enabled[1], since we could see exactly where it breaks.

Since your trace shows that it abort() from main(), you could simply step through the instructions with gdb and see where it crashes:

  (gdb) b main
  (gdb) run
  (gdb) n    <--- do this for each instruction until it crashes

Then post the results here :-)

[1] http://www.sogo.nu/english/nc/support/faq/article/how-do-i-compile-sogo-2.html

> Thanks again, best regards
>
> -hannes
Re: [SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
Hi again,

first of all: my issue has now been solved, thanks again very much for your efforts!

What did I do:
* emerge --unmerge gnustep-make gnustep-base sope sogo
  (removed everything with the Gentoo package manager. Interestingly, there were leftovers in /usr/local/ that I now removed manually. Previously, I didn't unmerge gnustep-make and hence didn't notice the leftovers.)
* downloaded and make installed all the sources again
* edited the apache config to reflect the new paths
* i still got segfaults when starting from the init script, while it ran fine from the sogo-user's command prompt and via gdb. Root cause: the init script created /var/run/sogod.pid, but sogo expected it to be /var/run/sogo/sogod.pid . In the manual startup I pointed the pidfile to /tmp/xxx.pid which obviously worked around that.

What I learned:
* don't mix distro-provided packages with manual installs, or if you do and strange things happen, check again as you most probably did miss something.

I hope this will save some other people's time when they run into segfaults...

Best regards,

-hannes
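Since the root cause was a pidfile path whose directory the daemon could not write to, a check along these lines before starting sogod would likely have pointed at it. This is a minimal Python sketch under stated assumptions: pidfile_problem is a hypothetical helper, not something shipped with SOGo or the Gentoo init script, and it only tests the directory, not everything that can make a write fail.

```python
import os
import tempfile


def pidfile_problem(pidfile_path):
    """Return a short reason why the pidfile cannot be created, or None."""
    directory = os.path.dirname(os.path.abspath(pidfile_path))
    if not os.path.isdir(directory):
        return "directory %s does not exist" % directory
    # Creating a file needs write and search permission on the directory.
    if not os.access(directory, os.W_OK | os.X_OK):
        return "directory %s is not writable by the current user" % directory
    return None


# Demonstration in a throwaway directory rather than /var/run/sogo.
with tempfile.TemporaryDirectory() as tmp:
    existing_dir = os.path.join(tmp, "sogod.pid")
    missing_dir = os.path.join(tmp, "sogo", "sogod.pid")
    print(pidfile_problem(existing_dir))
    print(pidfile_problem(missing_dir))
```

The result only means something when the check runs as the same user that sogod runs as, because the permission test is evaluated for the calling user.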
Re: [SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
On 13-08-25 9:00 PM, Hannes Erven wrote:
> Program received signal SIGABRT, Aborted.
> 0xe424 in __kernel_vsyscall ()
> (gdb) bt
> #0 0xe424 in __kernel_vsyscall ()
> #1 0xb72856d9 in raise () from /lib/libc.so.6
> #2 0xb7286c53 in abort () from /lib/libc.so.6
> #3 0xb759ba14 in ?? () from /usr/lib/libgnustep-base.so.1.24
> #4 0xb740bde8 in objc_msg_lookup () from /usr/lib/gcc/i686-pc-linux-gnu/4.5.4/libobjc.so.2
> #5 0x80002ccc in main ()
>
> I then tried manually compiling the latest sogo/sope 2.0.7 from the tarballs and gnustep-base 1.24.5 , which resulted again in a different but still a segfault on startup:
>
> #1 0xb79f615f in GSToUnicode
> (no more details from gdb)
>
> And in dmesg:
> [kernel] sogod[1782]: segfault at bf2f0e9c ip b717e15f sp bf2f0ea0 error 6 in libgnustep-base.so.1.24.5[b6ec1000+3a2000]

This is strange. Can you try starting GDB/sogo with the following arguments:

  gdb --args /usr/sbin/sogod -WOUseWatchDog NO -WONoDetach YES -WOPort 2 -WOWorkersCount 1 -WOLogFile - -WOPidFile /tmp/sogo.pid

and then set breakpoints on [NSException raise] and 'abort' (see http://www.sogo.nu/english/nc/support/faq/article/how-do-i-debug-sogo.html)

Hopefully we'll get a better backtrace with this.
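The same gdb session can probably be run non-interactively, which makes it easier to paste a complete transcript into a reply. The Python sketch below only builds the command line; it assumes gdb's -batch and -ex options and the "set breakpoint pending on" setting, and it copies the sogod arguments exactly as they are quoted in this thread. The helper name gdb_command is made up for illustration.

```python
import shlex


def gdb_command(binary="/usr/sbin/sogod", pidfile="/tmp/sogo.pid"):
    """Build an argv list that runs sogod under gdb and prints a backtrace."""
    # sogod arguments copied from the message above.
    sogod_args = [
        "-WOUseWatchDog", "NO",
        "-WONoDetach", "YES",
        "-WOPort", "2",
        "-WOWorkersCount", "1",
        "-WOLogFile", "-",
        "-WOPidFile", pidfile,
    ]
    gdb_steps = [
        # [NSException raise] lives in a shared library that is not loaded
        # yet, so let gdb create the breakpoint as pending without asking.
        "set breakpoint pending on",
        "break [NSException raise]",
        "break abort",
        "run",
        "bt",
    ]
    argv = ["gdb", "-batch"]
    for step in gdb_steps:
        argv += ["-ex", step]
    return argv + ["--args", binary] + sogod_args


# Print a copy-and-paste shell command (shlex.join needs Python 3.8+).
print(shlex.join(gdb_command()))
```

Nothing is executed here; the printed line is meant to be reviewed and then run by hand as the sogo user.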
[SOGo] Segfault on Gentoo / 2.0.3 and 2.0.7
Hi,

I'm not really sure what exactly changed on my Gentoo (i686/32bit) system, but after a reboot the SOGOd wouldn't start anymore, just reporting a segmentation fault circa 10 seconds after starting. The previous reboot of that machine was just two weeks ago and no interesting emerges in between (just iotop and minidlna).

In dmesg there is:
sogod[4992]: segfault at 7261764f ip b7076b70 sp bfecc5f0 error 4 in libobjc.so.2.0.0[b7067000+17000]

I've then reinstalled gnustep-make (2.6.2), gnustep-base (1.24.0-r1), sope (2.0.3a) and sogo (2.0.3a) from the dragon overlay, all with LDFLAGS= and debug enabled.

Here's a gdb session with that:

$ gdb --args /usr/sbin/sogod -WONoDetach YES
GNU gdb (Gentoo 7.5.1 p2) 7.5.1
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type show copying
and show warranty for details.
This GDB was configured as i686-pc-linux-gnu.
For bug reporting instructions, please see:
http://bugs.gentoo.org/...
Reading symbols from /usr/sbin/sogod...(no debugging symbols found)...done.
(gdb) r
Starting program: /usr/sbin/sogod -WONoDetach YES
warning: Could not load shared library symbols for linux-gate.so.1.
Do you need set solib-search-path or set sysroot?
[Thread debugging using libthread_db enabled]
Using host libthread_db library /lib/libthread_db.so.1.
recursion encountered handling uncaught exception

Program received signal SIGABRT, Aborted.
0xe424 in __kernel_vsyscall ()
(gdb) bt
#0 0xe424 in __kernel_vsyscall ()
#1 0xb72856d9 in raise () from /lib/libc.so.6
#2 0xb7286c53 in abort () from /lib/libc.so.6
#3 0xb759ba14 in ?? () from /usr/lib/libgnustep-base.so.1.24
#4 0xb740bde8 in objc_msg_lookup () from /usr/lib/gcc/i686-pc-linux-gnu/4.5.4/libobjc.so.2
#5 0x80002ccc in main ()

I then tried manually compiling the latest sogo/sope 2.0.7 from the tarballs and gnustep-base 1.24.5 , which resulted again in a different but still a segfault on startup:

#1 0xb79f615f in GSToUnicode
(no more details from gdb)

And in dmesg:
[kernel] sogod[1782]: segfault at bf2f0e9c ip b717e15f sp bf2f0ea0 error 6 in libgnustep-base.so.1.24.5[b6ec1000+3a2000]

I'm really lost at this point and have no idea what to further analyze :-/
Any suggestions what to do next are very welcome, thank you!

Best regards,

-hannes
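The two dmesg lines in this message carry a little more information than is obvious at first glance: subtracting the mapping base in the square brackets from the instruction pointer gives the offset of the faulting instruction inside the library, and the low bits of the x86 page-fault error code say what kind of access failed. The Python sketch below decodes both; the regular expression is inferred from the two lines quoted here, and decode_segfault is a hypothetical helper name.

```python
import re

# Pattern inferred from the two dmesg lines quoted in this thread.
SEGFAULT = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+) "
    r"in (?P<object>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode a kernel segfault line into library offset and access type."""
    match = SEGFAULT.search(line)
    if match is None:
        return None
    error = int(match.group("error"), 16)
    return {
        "object": match.group("object"),
        # Offset of the faulting instruction relative to the mapping base.
        "offset": int(match.group("ip"), 16) - int(match.group("base"), 16),
        # x86 page-fault error code: bit 0 = page was present (protection
        # fault), bit 1 = write access, bit 2 = fault came from user mode.
        "page_present": bool(error & 1),
        "access": "write" if error & 2 else "read",
        "user_mode": bool(error & 4),
    }


lines = [
    "sogod[4992]: segfault at 7261764f ip b7076b70 sp bfecc5f0 error 4 "
    "in libobjc.so.2.0.0[b7067000+17000]",
    "[kernel] sogod[1782]: segfault at bf2f0e9c ip b717e15f sp bf2f0ea0 "
    "error 6 in libgnustep-base.so.1.24.5[b6ec1000+3a2000]",
]
for line in lines:
    info = decode_segfault(line)
    print(info["object"], hex(info["offset"]), info["access"])
```

For these two lines the offsets come out as 0xfb70 in libobjc.so.2.0.0 (a read of an unmapped address) and 0x2bd15f in libgnustep-base.so.1.24.5 (a write to an unmapped address). With debugging symbols installed, such an offset can usually be handed to a tool like addr2line to find the source location; whether that works depends on how the library was built.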