Hello Niko,

On 15:41 Mon 23 Jun, Nikos Mavrogiannopoulos wrote:
> On Mon, Jun 23, 2014 at 2:14 PM, Apollon Oikonomopoulos
> <[email protected]> wrote:
> > Hi,
> >
> > (Please Cc me on reply, I'm not subscribed to the list. Thanks.)
> > I'm trying to debug a (rather painful) issue that apparently lies either in
> > libcurl, or GnuTLS 3.2. It all started out with Debian bug #751886 [1],
> > where the software at hand (Ganeti) uses Haskell's FFI to call libcurl,
> > which in Debian is currently linked with GnuTLS 3.2.
> [...]
> > I managed to obtain a few meaningful backtraces from the segfaulting
> > instances, all of them consistently happening during the handshake and
> > all of them with the same call trace like the one at the bottom of this
> > message. Also, debug output from a corrupted and a crashed handshake follow.
>
> Hello Apollon,
> Could you get a valgrind trace of the failed instance (with crash or
> not)? I suspect there is a memory corruption somewhere and valgrind
> may be better than the debugger identifying it.

Right, here's the output on a run with segfault:

==7248== Memcheck, a memory error detector
==7248== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==7248== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==7248== Command: /usr/sbin/ganeti-luxid -f
==7248==
2014-06-23 17:26:32,671936000000 EEST: ganeti-luxid pid=7248 NOTICE ganeti-luxid daemon startup
2014-06-23 17:26:33,097049000000 EEST: ganeti-luxid pid=7248 INFO Loaded new config, serial 30
2014-06-23 17:26:33,115620000000 EEST: ganeti-luxid pid=7248 INFO Starting up in inotify mode
==7248== Conditional jump or move depends on uninitialised value(s)
==7248==    at 0x6B0FF70: gnutls_session_get_data (in /usr/lib/x86_64-linux-gnu/libgnutls-deb0.so.28.30.6)
==7248==    by 0x52FB939: gtls_connect_step3 (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==7248==    by 0x52FBE49: gtls_connect_common (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==7248==    by 0x52FC8DF: Curl_ssl_connect_nonblocking (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==7248==    by 0x52BB37D: https_connecting (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==7248==    by 0x52DDC6E: multi_runsingle (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==7248==    by 0x52DE7E0: curl_multi_perform (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==7248==    by 0x725700: ??? (in /usr/lib/ganeti/2.10/usr/sbin/ganeti-luxid)
==7248==
2014-06-23 17:26:38,584101000000 EEST: ganeti-luxid pid=7248 INFO Successfully handled Query
==7248== Invalid read of size 8
==7248==    at 0xAFCB7A: ??? (in /usr/lib/ganeti/2.10/usr/sbin/ganeti-luxid)
==7248==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==7248==
==7248==
==7248== Process terminating with default action of signal 11 (SIGSEGV)
==7248==  Access not within mapped region at address 0x0
==7248==    at 0xAFCB7A: ??? (in /usr/lib/ganeti/2.10/usr/sbin/ganeti-luxid)
==7248==  If you believe this happened as a result of a stack
==7248==  overflow in your program's main thread (unlikely but
==7248==  possible), you can try to increase the size of the
==7248==  main thread stack using the --main-stacksize= flag.
==7248==  The main thread stack size used in this run was 8388608.
==7248==
==7248== HEAP SUMMARY:
==7248==     in use at exit: 397,811 bytes in 1,275 blocks
==7248==   total heap usage: 8,921 allocs, 7,646 frees, 2,577,477 bytes allocated
==7248==
==7248== LEAK SUMMARY:
==7248==    definitely lost: 0 bytes in 0 blocks
==7248==    indirectly lost: 0 bytes in 0 blocks
==7248==      possibly lost: 0 bytes in 0 blocks
==7248==    still reachable: 397,811 bytes in 1,275 blocks
==7248==         suppressed: 0 bytes in 0 blocks
==7248== Rerun with --leak-check=full to see details of leaked memory
==7248==
==7248== For counts of detected and suppressed errors, rerun with: -v
==7248== Use --track-origins=yes to see where uninitialised values come from
==7248== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 2 from 2)

>
> > - With the same version of libcurl (7.37.0) linked against gnutls 2.12,
> >   luxid works reliably. Actually the bug appeared when Debian switched to
> >   a gnutls 3.2-linked version of libcurl. I tried out different
> >   combinations of libcurl and GnuTLS versions, the only failing
> >   combinations were with GnuTLS 3.2.
>
> Andreas mentioned a bug sometime ago that depended on having an
> application linked with both gnutls28 and gnutls26. Could that be the
> issue in that case too?

IIRC, this was due to symbol clashes. The binary indeed pulls in both as
indirect dependencies, but gnutls28 has symbol versioning, so I wouldn't
expect this to be an issue.

Thanks!
Apollon
