Re: [spamdyke-users] Multiple Zombie/Defunct qmail-smtpd Processes
I also have some zombies on my servers running Debian Sarge 32bit with TLS activated. Sorry that I am not able to provide more information at this time. Hopefully I can investigate this issue soon.

best,
-harti

On 01 Mar 08, Sam Clippinger wrote:
> Sorry for the late reply -- I'm currently on the road so email is a low priority this week.
>
> This is all very strange. If the debug version was running, I don't understand why the debug symbols didn't show up in the stack trace you sent. I also can't think of any other reasons why the OpenSSL library would hang on a read() call; I can only think to blame the 64 bit libraries but that's just a reflex, not an explanation. Without some way to reproduce this behavior, I'm not sure what else I can do to fix it.
>
> Are there any other commonalities you can find with these stalled processes? Do they seem to come from the same remote server(s)? Can you tell if the remote server(s) are running the same mail server software? Are the messages all small/large/junk/legitimate? Are the recipients the same each time? Anything along these lines?
>
> To answer your question: yes, if spamdyke is compiled without TLS support it will ignore the TLS options.
>
> -- Sam Clippinger
>
> Ken Schweigert wrote:
> > On Mon, Feb 25, 2008 at 4:49 PM, Sam Clippinger [EMAIL PROTECTED] wrote:
> > > Hmmm. Well, it looks like the spamdyke process you attached to isn't running the binary with the debugging symbols; it probably started before you copied the new binary into place, but it doesn't matter. There's enough information to show what's going on.
> >
> > I did compile with the debug symbols. The resulting binary was 200K larger than the original so I knew there was something extra in there. Also it wouldn't let me overwrite the existing spamdyke binary because it was busy so I had to kill any processes that were running to put it into place.
> >
> > > The process is stuck inside the OpenSSL library, trying to negotiate an initial TLS connection with the remote host (gdb shows ssl23_get_client_hello() from /lib64/libssl.so.4). The negotiation is attempting to read data from the network (__read_nocancel() from /lib64/tls/libc.so.6) but there's no data available, so it's blocked forever, waiting on data that (apparently) never comes. If you enable full logging and examine the log from one of these sleeping processes, I'm sure you'd see a STARTTLS command followed by a Proceed response and nothing else.
> > >
> > > Obviously, this isn't supposed to happen -- the OpenSSL library has a timeout feature that defaults to 5 minutes. Since that's not working, I suspect there's something else going on.
> > >
> > > First, I'd check to make sure you've got the latest version of the OpenSSL library installed and the installed header files came from that version. In other words, check the version numbers on your openssl and openssl-dev RPM packages.
> >
> > I use Redhat's up2date utility and all packages are up to date. I've had times where features in the newest source weren't in the RPMs and had to compile openssl from source. Something I'm not afraid of doing and may consider doing in the future.
> >
> > > Next, I notice your mail server is running a 64 bit RedHat release. Did you compile spamdyke on this machine? I recall once answering a report similar to yours where the spamdyke binary had been compiled on a 32 bit system and copied to the 64 bit system. If you're compiling elsewhere, both machines need to be 64 bit with the same version of the OpenSSL library.
> >
> > Yes, compiled on the server that it is running on. Not sure if there are any specific 64-bit flags that need to be set in 'configure' but I imagine they would have been picked up auto-magically.
> >
> > > Lastly, are you using spamdyke for SMTPS (port 465)? It doesn't work well with SMTPS connections at the moment unless you run it from something like stunnel and remove the tls-certificate-file option from the configuration file.
> >
> > Not running on SMTPS. My users send through authenticated SMTP on 587. I have spamdyke protecting incoming smtp 25 but not on the 587 instance.
> >
> > > My only other suggestion would be to compile spamdyke without TLS support (or remove the tls-certificate-file option) so that spamdyke doesn't attempt to accept the TLS connection itself. That would mean spamdyke wouldn't fully filter TLS connections but at least these sleeping processes would go away.
> >
> > I just recompiled spamdyke with './configure --disable-tls'. Is it safe to assume that since it's compiled this way that any configuration options about SSL will be ignored?
> >
> > Thanks for all your help!
> > -ken
> >
> > > I hope that helps!
> > >
> > > -- Sam Clippinger
> > >
> > > Ken Schweigert wrote:
> > > > I've recompiled with debug symbols and have attached the output of what 'gdb' has. I have to admit that I don't really understand what is
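
The hang Sam describes above is a read() that never returns because the remote host sends nothing after the Proceed response. As a rough illustration only (this is not spamdyke's code and not OpenSSL's own timeout mechanism), the Python sketch below shows how a receive timeout turns such a read into one that gives up. The names read_with_timeout and demo_silent_peer are hypothetical, and a local socket pair stands in for the real network connection.

```python
import socket


def read_with_timeout(sock, nbytes, seconds):
    """Return data read from sock, or None if nothing arrives in time."""
    sock.settimeout(seconds)  # without this, recv() can block indefinitely
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None  # the peer stayed silent for the whole interval


def demo_silent_peer(seconds=0.2):
    """Simulate a remote host that connects but never sends any bytes."""
    server_side, silent_client = socket.socketpair()
    try:
        return read_with_timeout(server_side, 1024, seconds)
    finally:
        server_side.close()
        silent_client.close()
```

Calling demo_silent_peer() returns None after the timeout instead of sleeping forever, which is the behavior the stalled processes in this thread are missing.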
Re: [spamdyke-users] Multiple Zombie/Defunct qmail-smtpd Processes
On Wed, Feb 20, 2008 at 10:59 PM, Sam Clippinger [EMAIL PROTECTED] wrote:
> Well, the zombie processes should be cleaned up by the operating system when spamdyke runs the waitpid() function inside its main loop. Obviously that function isn't being called, so spamdyke must be stuck somewhere. The S in the third column indicates spamdyke is sleeping (not looping), so it's most likely waiting on a read().
>
> You may be able to use strace to capture the pattern of system calls before the process becomes stuck but that would require some pretty careful timing. A full log from one of these processes would probably be more helpful. Your configuration file and any errors from your mail logs could provide some clues too.
>
> If you have gdb installed on your system, you could also recompile spamdyke with debugging symbols and use the debugger to tell you exactly what's going on. That would be the most helpful for fixing it. To do that, recompile spamdyke with these commands:
>     ./configure --with-debug
>     make
> The resulting binary will be a little larger but it should function the same, even in production. When a process becomes stuck, find its process ID and start gdb, attach to the PID and ask for a stack dump:
>     # gdb /usr/local/bin/spamdyke
>     attach PID
>     ^C
>     where
>
> If you have any full logs, feel free to send them to me privately if you don't want to post them on the list.

I've recompiled with debug symbols and have attached the output of what 'gdb' has. I have to admit that I don't really understand what exactly is represented, but it doesn't look like a lot.

[EMAIL PROTECTED] ~]# gdb /usr/local/bin/spamdyke
GNU gdb Red Hat Linux (6.3.0.0-1.153.el4_6.2rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as x86_64-redhat-linux-gnu...(no debugging symbols found)
Using host libthread_db library /lib64/tls/libthread_db.so.1.
(gdb) attach 4113
Attaching to program: /usr/local/bin/spamdyke, process 4113
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libcrypto.so.4...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcrypto.so.4
Reading symbols from /lib64/libssl.so.4...(no debugging symbols found)...done.
Loaded symbols for /lib64/libssl.so.4
Reading symbols from /lib64/tls/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /usr/lib64/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libgssapi_krb5.so.2
Reading symbols from /usr/lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /usr/lib64/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libk5crypto.so.3
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /usr/lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libz.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x0035647b9e82 in __read_nocancel () from /lib64/tls/libc.so.6
(gdb)
(gdb) Quit
(gdb) where
#0 0x0035647b9e82 in __read_nocancel () from /lib64/tls/libc.so.6
#1 0x003fba190fb1 in BIO_sock_should_retry () from /lib64/libcrypto.so.4
#2 0x003fba18f1d3 in BIO_read () from /lib64/libcrypto.so.4
#3 0x003fba41e9cf in ssl23_read_bytes () from /lib64/libssl.so.4
#4 0x003fba41d7fa in ssl23_get_client_hello () from /lib64/libssl.so.4
#5 0x003fba41dcad in ssl23_accept () from /lib64/libssl.so.4
#6 0x00415947 in ?? ()
#7 0x00403b64 in ?? ()
#8 0x0040649b in ?? ()
#9 0x00408ece in ?? ()
#10 0x0040ea67 in ?? ()
#11 0x00356471c3fb in __libc_start_main () from /lib64/tls/libc.so.6
#12 0x0040290a in ?? ()
#13 0x007fbc78 in ?? ()
#14 0x001c in ?? ()
#15 0x0006 in ?? ()
#16 0x007fbe23 in ?? ()
#17 0x007fbe3b in ?? ()
#18 0x007fbe3e in ?? ()
#19 0x007fbe51 in ?? ()
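
Sam's point at the top of this message is that an exited child remains a zombie until its parent calls waitpid(), so a parent stuck in read() never gets the chance to reap it. The Python sketch below illustrates that mechanism on a POSIX system; it is not spamdyke's main loop, and the function names reap_children and spawn_and_reap are hypothetical.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def spawn_and_reap(timeout=5.0):
    """Fork a child that exits at once, then poll until it is reaped."""
    child = os.fork()
    if child == 0:
        os._exit(0)  # child: exit immediately with status 0
    # Parent: the child is a zombie from its exit until waitpid() collects it.
    seen = []
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        seen.extend(reap_children())
        if any(pid == child for pid, _ in seen):
            break
        time.sleep(0.01)
    return child, seen
```

If the parent blocked in a read between the fork and the polling loop, the child would show up as defunct in ps for as long as the read lasted, which matches the symptom reported in this thread.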