On 06/29/2011 10:53 PM, Michael Scheidell wrote:
>
>
> On 6/29/11 3:29 PM, Török Edwin wrote:
>> (gdb) backtrace full
> (gdb) backtrace full
> #0  0x00000008018baf4a in __error () from /lib/libthr.so.3
> No symbol table info available.
> #1  0x00000008018bac3b in __error () from /lib/libthr.so.3
> No symbol table info available.
> #2  0x00000008018b66c5 in pthread_mutex_getyieldloops_np () from /lib/libthr.so.3
> No symbol table info available.
> #3  0x0000000800790892 in cli_vm_execute_jit () from /usr/local/lib/libclamav.so.7
> No symbol table info available.

Thanks. Can you check whether this patch helps (apply it on top of unmodified 0.97.1):
http://git.clamav.net/gitweb?p=clamav-devel.git;a=commitdiff;h=bb5572cbe192471bfe7285f77661fff808c8a821

Your stacktraces showed several problems:
 - there are 14 threads all in __error() (errno), called from cli_vm_execute_jit; that is probably the return path from pthread_cond_timedwait
 - clamd's threads are missing, which is why it no longer responded to anything, not even PING
 - all those threads seem to be looping, trying to reacquire a mutex, but the thread that owns the mutex has probably already died

I couldn't get bytecode_watchdog to crash on Linux/amd64, but valgrind showed me this warning:

==10908== Invalid read of size 8
==10908==    at 0x333740B58E: pthread_cond_timedwait@@GLIBC_2.3.2 (pthread_cond_timedwait.S:147)
==10908==    by 0x4D55287: bytecode_watchdog(void*) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.10)
==10908==    by 0x3337406B3F: start_thread (pthread_create.c:304)
==10908==    by 0x33368D52FC: clone (clone.S:112)
==10908==  Address 0xed00578 is not stack'd, malloc'd or (recently) free'd
==10908==
==10908== Syscall param futex(timeout) points to unaddressable byte(s)
==10908==    at 0x333740B63B: pthread_cond_timedwait@@GLIBC_2.3.2 (pthread_cond_timedwait.S:216)
==10908==    by 0x4D55287: bytecode_watchdog(void*) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.10)
==10908==    by 0x3337406B3F: start_thread (pthread_create.c:304)
==10908==    by 0x33368D52FC: clone (clone.S:112)
==10908==  Address 0xed00578 is not stack'd, malloc'd or (recently) free'd

So the bug is that the timeout argument passed to pthread_cond_timedwait pointed to memory that was no longer valid while the call was still waiting.

The patch above does 2 things:
 - if pthread_cond_timedwait returns any error (other than ETIMEDOUT), it logs the error and breaks out of the loop
 - it makes sure that pthread_cond_timedwait's timeout parameter stays valid until pthread_cond_timedwait wakes up

Best regards,
--Edwin
