On Tue, May 20, 2008 at 12:15 PM, Sam Clippinger <[EMAIL PROTECTED]> wrote: > Looks like an SSL connection problem between your secondary and primary > mail servers. When I've seen this in the past, it has often turned out > to be 32/64-bit issue. Once it was because the spamdyke binary was > compiled on a 32 bit server but run on a 64 bit server. Another time > the 32 bit OpenSSL libraries were installed on a 64 bit server. Are you > compiling spamdyke on the same server that it's running on? Are the > primary and secondary mail servers running the same version of OpenSSL?
Well recompiling without TLS had no effect; just had a user report a duplicate. It was worth a shot, right? :) > > Another thought -- are you sure spamdyke has enough memory? If you're > using the "softlimit" program to constrain its memory usage, you might > try increasing the limit to see if it fixes this. The softlimit setting in my run file was set to 25MB. I remember having to play with this value when we switched in to a 64-bit box. Maybe, like you said, it is set to a "just enough" value and needs to be raised. I just raised it to 50MB on my primary mailserver (64-bit box) and up to 20MB on my secondaries (32-bit boxes). Keeping my fingers crossed. -ken > > FYI: Removing the "tls-certificate-file" option from spamdyke's > configuration will have the same effect as recompiling without TLS. :) > > -- Sam Clippinger > > Ken Schweigert wrote: >> On Fri, May 16, 2008 at 5:21 PM, Ken Schweigert <[EMAIL PROTECTED]> wrote: >> >>> On Fri, May 16, 2008 at 5:08 PM, Sam Clippinger <[EMAIL PROTECTED]> wrote: >>> >>>> It looks like something is taking a long time to respond to the end of >>>> the message -- do you have anything installed to run SpamAssassin or >>>> ClamAV before the message is accepted? In particular, sometimes ClamAV >>>> needs to be restarted after it's been running for a long time; it starts >>>> taking longer and longer to process messages. Because qmail is taking >>>> so long to respond, spamdyke is giving up and timing out (even though >>>> it's not the remote client's fault). >>>> >>> Thanks for the quick reply. I do use ClamAV, but not SpamAssassin. >>> ClamAV has been running for quite a while now so I restarted it. >>> Hopefully that is all that it is. >>> >>> Thanks for the tip! >>> -ken >>> >>> >>>> I've fixed this particular situation in version 4.0.0, so that delays >>>> from qmail won't count against the idle timeout. >>>> >>>> -- Sam Clippinger >>>> >> >> The plot thickens. :) >> >> Since restarting ClamAV my users have complained that they are getting >> even more duplicates; sometimes 3 or 4 copies of the same message. >> >> So I was grep'ing through the logs and saw some timeouts coming from >> one of my secondary MX servers (also set up with spamdyke, but no clam >> or spamassassin). So I logged in to that secondary and went through >> the logs and saw this group of entries: >> >> 2008-05-19 18:44:48.055049500 delivery 271993: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:44:54.401082500 delivery 271994: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:09.470684500 delivery 271996: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:29.397628500 delivery 271998: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:29.435189500 delivery 271997: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:29.489066500 delivery 271999: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:30.476119500 delivery 272000: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:31.057934500 delivery 272001: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> 2008-05-19 18:45:39.449765500 delivery 272002: deferral: >> Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/ >> >> I looked and it looks like I compiled both server's spamdykes with the >> TLS option. I just recompiled with the '--disable-tls' configure >> option and then stopped qmail-smtpd and installed the new binary. I >> did this on my primary and both secondaries. So, keeping my fingers >> crossed that this fixes it. If you can think of anything else I >> should do or try please let me know. >> >> -ken >> >> >>>> Ken Schweigert wrote: >>>> >>>>> So this is a problem that has been intermittent for a month or so and >>>>> I haven't really been able to find a way to debug it until I found the >>>>> option 'full-log-dir' in spamdyke.conf. >>>>> >>>>> My problem has been my users will occasionally complain that they have >>>>> gotten duplicates of the same message. It happens sporadically and >>>>> seemingly without pattern. If I looked at the headers on both >>>>> messages everything is the same exact for the top/last Received: line >>>>> which is usually a few minutes different; the message-ids were the >>>>> same in both messages. It was my suspicion that the sending >>>>> mailserver started the conversation, sent all of the message, but for >>>>> some reason didn't receive the close/end/quit command and never >>>>> removed the message for it's queue, however the receiving server had >>>>> enough of a message to actually deliver it to my user. >>>>> >>>>> So I enabled the full-log-dir feature and waited for a user to have a >>>>> duplicate. I finally got one and it looks like the problem is with a >>>>> timeout. Below is the edited contents of file: >>>>> >>>>> ############################################# >>>>> [EMAIL PROTECTED] ~]# more >>>>> 20080515_152028_64.18.2.171_exprod7og109.obsmtp.com >>>>> 05/15/2008 15:20:28 STARTED: VERSION = 3.1.6+TLS, PID = 23327 >>>>> 05/15/2008 15:20:28 LEGEND: To remote host = <<< ; to child process = >>>>> >>>>> >>>>>>>> ; blocked by filter = <XX >>>>>>>> >>>>>>>> >>>>> 05/15/2008 15:20:28 LEGEND: From filter to remote host = <FF ; from >>>>> filter to child process = FF> >>>>> >>>>> <<< 05/15/2008 15:20:28 >>>>> 220 rsmail.mymailserver.tld ESMTP >>>>> >>>>> >>>>> >>>>>>>> 05/15/2008 15:20:28 >>>>>>>> >>>>>>>> >>>>> HELO exprod7og109.obsmtp.com >>>>> >>>>> <<< 05/15/2008 15:20:28 >>>>> 250 rsmail.mymailserver.tld >>>>> >>>>> >>>>> >>>>>>>> 05/15/2008 15:20:28 >>>>>>>> >>>>>>>> >>>>> MAIL FROM:<[EMAIL PROTECTED]> >>>>> >>>>> <<< 05/15/2008 15:20:28 >>>>> 250 ok >>>>> >>>>> >>>>> >>>>>>>> 05/15/2008 15:20:28 >>>>>>>> >>>>>>>> >>>>> RCPT TO:<[EMAIL PROTECTED]> >>>>> >>>>> <<< 05/15/2008 15:20:28 >>>>> 250 ok >>>>> >>>>> >>>>> >>>>>>>> 05/15/2008 15:20:28 >>>>>>>> >>>>>>>> >>>>> RCPT TO:<[EMAIL PROTECTED]> >>>>> >>>>> <<< 05/15/2008 15:20:28 >>>>> 250 ok >>>>> >>>>> >>>>> >>>>>>>> 05/15/2008 15:20:29 >>>>>>>> >>>>>>>> >>>>> DATA >>>>> >>>>> <<< 05/15/2008 15:20:29 >>>>> 354 go ahead >>>>> >>>>> >>>>> >>>>>>>> 05/15/2008 15:20:29 >>>>>>>> >>>>>>>> >>>>> Received: from source ([63.90.125.11]) by exprod7ob109.postini.com >>>>> ([64.18.6.12]) with SMTP; >>>>> Thu, 15 May 2008 12:20:18 PDT >>>>> Received: from mail.shrr.com ([63.90.125.10]) by mail.shrr.com with >>>>> Microsoft SMTPSVC(6.0.3790.3959); >>>>> Thu, 15 May 2008 15:17:43 -0400 >>>>> X-MimeOLE: Produced By Microsoft Exchange V6.5 >>>>> Content-class: urn:content-classes:message >>>>> MIME-Version: 1.0 >>>>> Content-Type: multipart/alternative; >>>>> >>>>> ... [ snipped message content ]... >>>>> >>>>> ------_=_NextPart_001_01C8B6C0.59C6D54D-- >>>>> . >>>>> >>>>> FF> 05/15/2008 15:23:30 >>>>> . >>>>> QUIT >>>>> >>>>> <FF 05/15/2008 15:23:30 >>>>> 421 Timeout. Talk faster next time. >>>>> >>>>> <XX 05/15/2008 15:34:10 >>>>> 250 ok 1210880050 qp 23364 >>>>> 502 unimplemented (#5.5.1) >>>>> 221 rsmail.mymailserver.tld >>>>> >>>>> 05/15/2008 15:34:10 CLOSED >>>>> [EMAIL PROTECTED] ~]# >>>>> ############################################# >>>>> >>>>> In a previous thread, a subscriber recommended adjusting the qmail >>>>> timeout to be lower than spamdyke's. Here is what the timeout >>>>> settings are on on my server: >>>>> >>>>> [EMAIL PROTECTED] ~]# more /var/qmail/control/timeoutsmtpd >>>>> 120 >>>>> [EMAIL PROTECTED] ~]# grep -i timeout /etc/spamdyke.conf >>>>> # connection-timeout-secs=0 is disabling the feature >>>>> connection-timeout-secs=0 >>>>> idle-timeout-secs=180 >>>>> [EMAIL PROTECTED] ~]# >>>>> >>>>> So it looks like the actual message data stopped sending at 05/15/2008 >>>>> 15:23:30 and got the "talk faster" error but didn't actually close the >>>>> connection until 05/15/2008 15:34:10. >>>>> >>>>> So any ideas what I can try next to stop these duplicates? >>>>> >>>>> -ken >>>>> >>>>> >>>> _______________________________________________ >>>> spamdyke-users mailing list >>>> [email protected] >>>> http://www.spamdyke.org/mailman/listinfo/spamdyke-users >>>> >>>> >>> >>> -- >>> Have a nice day ... unless you've made other plans. >>> >>> >> >> >> >> > _______________________________________________ > spamdyke-users mailing list > [email protected] > http://www.spamdyke.org/mailman/listinfo/spamdyke-users > -- Have a nice day ... unless you've made other plans. _______________________________________________ spamdyke-users mailing list [email protected] http://www.spamdyke.org/mailman/listinfo/spamdyke-users
