Looks like an SSL connection problem between your secondary and primary mail servers. When I've seen this in the past, it has often turned out to be 32/64-bit issue. Once it was because the spamdyke binary was compiled on a 32 bit server but run on a 64 bit server. Another time the 32 bit OpenSSL libraries were installed on a 64 bit server. Are you compiling spamdyke on the same server that it's running on? Are the primary and secondary mail servers running the same version of OpenSSL?
Another thought -- are you sure spamdyke has enough memory? If you're using the "softlimit" program to constrain its memory usage, you might try increasing the limit to see if it fixes this. FYI: Removing the "tls-certificate-file" option from spamdyke's configuration will have the same effect as recompiling without TLS. :) -- Sam Clippinger Ken Schweigert wrote: > On Fri, May 16, 2008 at 5:21 PM, Ken Schweigert <[EMAIL PROTECTED]> wrote: > >> On Fri, May 16, 2008 at 5:08 PM, Sam Clippinger <[EMAIL PROTECTED]> wrote: >> >>> It looks like something is taking a long time to respond to the end of >>> the message -- do you have anything installed to run SpamAssassin or >>> ClamAV before the message is accepted? In particular, sometimes ClamAV >>> needs to be restarted after it's been running for a long time; it starts >>> taking longer and longer to process messages. Because qmail is taking >>> so long to respond, spamdyke is giving up and timing out (even though >>> it's not the remote client's fault). >>> >> Thanks for the quick reply. I do use ClamAV, but not SpamAssassin. >> ClamAV has been running for quite a while now so I restarted it. >> Hopefully that is all that it is. >> >> Thanks for the tip! >> -ken >> >> >>> I've fixed this particular situation in version 4.0.0, so that delays >>> from qmail won't count against the idle timeout. >>> >>> -- Sam Clippinger >>> > > The plot thickens. :) > > Since restarting ClamAV my users have complained that they are getting > even more duplicates; sometimes 3 or 4 copies of the same message. > > So I was grep'ing through the logs and saw some timeouts coming from > one of my secondary MX servers (also set up with spamdyke, but no clam > or spamassassin). So I logged in to that secondary and went through > the logs and saw this group of entries: > > 2008-05-19 18:44:48.055049500 delivery 271993: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:44:54.401082500 delivery 271994: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:09.470684500 delivery 271996: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:29.397628500 delivery 271998: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:29.435189500 delivery 271997: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:29.489066500 delivery 271999: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:30.476119500 delivery 272000: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:31.057934500 delivery 272001: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > 2008-05-19 18:45:39.449765500 delivery 272002: deferral: > Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ > > I looked and it looks like I compiled both server's spamdykes with the > TLS option. I just recompiled with the '--disable-tls' configure > option and then stopped qmail-smtpd and installed the new binary. I > did this on my primary and both secondaries. So, keeping my fingers > crossed that this fixes it. If you can think of anything else I > should do or try please let me know. > > -ken > > >>> Ken Schweigert wrote: >>> >>>> So this is a problem that has been intermittent for a month or so and >>>> I haven't really been able to find a way to debug it until I found the >>>> option 'full-log-dir' in spamdyke.conf. >>>> >>>> My problem has been my users will occasionally complain that they have >>>> gotten duplicates of the same message. It happens sporadically and >>>> seemingly without pattern. If I looked at the headers on both >>>> messages everything is the same exact for the top/last Received: line >>>> which is usually a few minutes different; the message-ids were the >>>> same in both messages. It was my suspicion that the sending >>>> mailserver started the conversation, sent all of the message, but for >>>> some reason didn't receive the close/end/quit command and never >>>> removed the message for it's queue, however the receiving server had >>>> enough of a message to actually deliver it to my user. >>>> >>>> So I enabled the full-log-dir feature and waited for a user to have a >>>> duplicate. I finally got one and it looks like the problem is with a >>>> timeout. Below is the edited contents of file: >>>> >>>> ############################################# >>>> [EMAIL PROTECTED] ~]# more >>>> 20080515_152028_64.18.2.171_exprod7og109.obsmtp.com >>>> 05/15/2008 15:20:28 STARTED: VERSION = 3.1.6+TLS, PID = 23327 >>>> 05/15/2008 15:20:28 LEGEND: To remote host = <<< ; to child process = >>>> >>>> >>>>>>> ; blocked by filter = <XX >>>>>>> >>>>>>> >>>> 05/15/2008 15:20:28 LEGEND: From filter to remote host = <FF ; from >>>> filter to child process = FF> >>>> >>>> <<< 05/15/2008 15:20:28 >>>> 220 rsmail.mymailserver.tld ESMTP >>>> >>>> >>>> >>>>>>> 05/15/2008 15:20:28 >>>>>>> >>>>>>> >>>> HELO exprod7og109.obsmtp.com >>>> >>>> <<< 05/15/2008 15:20:28 >>>> 250 rsmail.mymailserver.tld >>>> >>>> >>>> >>>>>>> 05/15/2008 15:20:28 >>>>>>> >>>>>>> >>>> MAIL FROM:<[EMAIL PROTECTED]> >>>> >>>> <<< 05/15/2008 15:20:28 >>>> 250 ok >>>> >>>> >>>> >>>>>>> 05/15/2008 15:20:28 >>>>>>> >>>>>>> >>>> RCPT TO:<[EMAIL PROTECTED]> >>>> >>>> <<< 05/15/2008 15:20:28 >>>> 250 ok >>>> >>>> >>>> >>>>>>> 05/15/2008 15:20:28 >>>>>>> >>>>>>> >>>> RCPT TO:<[EMAIL PROTECTED]> >>>> >>>> <<< 05/15/2008 15:20:28 >>>> 250 ok >>>> >>>> >>>> >>>>>>> 05/15/2008 15:20:29 >>>>>>> >>>>>>> >>>> DATA >>>> >>>> <<< 05/15/2008 15:20:29 >>>> 354 go ahead >>>> >>>> >>>> >>>>>>> 05/15/2008 15:20:29 >>>>>>> >>>>>>> >>>> Received: from source ([63.90.125.11]) by exprod7ob109.postini.com >>>> ([64.18.6.12]) with SMTP; >>>> Thu, 15 May 2008 12:20:18 PDT >>>> Received: from mail.shrr.com ([63.90.125.10]) by mail.shrr.com with >>>> Microsoft SMTPSVC(6.0.3790.3959); >>>> Thu, 15 May 2008 15:17:43 -0400 >>>> X-MimeOLE: Produced By Microsoft Exchange V6.5 >>>> Content-class: urn:content-classes:message >>>> MIME-Version: 1.0 >>>> Content-Type: multipart/alternative; >>>> >>>> ... [ snipped message content ]... >>>> >>>> ------_=_NextPart_001_01C8B6C0.59C6D54D-- >>>> . >>>> >>>> FF> 05/15/2008 15:23:30 >>>> . >>>> QUIT >>>> >>>> <FF 05/15/2008 15:23:30 >>>> 421 Timeout. Talk faster next time. >>>> >>>> <XX 05/15/2008 15:34:10 >>>> 250 ok 1210880050 qp 23364 >>>> 502 unimplemented (#5.5.1) >>>> 221 rsmail.mymailserver.tld >>>> >>>> 05/15/2008 15:34:10 CLOSED >>>> [EMAIL PROTECTED] ~]# >>>> ############################################# >>>> >>>> In a previous thread, a subscriber recommended adjusting the qmail >>>> timeout to be lower than spamdyke's. Here is what the timeout >>>> settings are on on my server: >>>> >>>> [EMAIL PROTECTED] ~]# more /var/qmail/control/timeoutsmtpd >>>> 120 >>>> [EMAIL PROTECTED] ~]# grep -i timeout /etc/spamdyke.conf >>>> # connection-timeout-secs=0 is disabling the feature >>>> connection-timeout-secs=0 >>>> idle-timeout-secs=180 >>>> [EMAIL PROTECTED] ~]# >>>> >>>> So it looks like the actual message data stopped sending at 05/15/2008 >>>> 15:23:30 and got the "talk faster" error but didn't actually close the >>>> connection until 05/15/2008 15:34:10. >>>> >>>> So any ideas what I can try next to stop these duplicates? >>>> >>>> -ken >>>> >>>> >>> _______________________________________________ >>> spamdyke-users mailing list >>> [email protected] >>> http://www.spamdyke.org/mailman/listinfo/spamdyke-users >>> >>> >> >> -- >> Have a nice day ... unless you've made other plans. >> >> > > > > _______________________________________________ spamdyke-users mailing list [email protected] http://www.spamdyke.org/mailman/listinfo/spamdyke-users
