On Fri, May 16, 2008 at 5:21 PM, Ken Schweigert <[EMAIL PROTECTED]> wrote:
> On Fri, May 16, 2008 at 5:08 PM, Sam Clippinger <[EMAIL PROTECTED]> wrote:
>> It looks like something is taking a long time to respond to the end of
>> the message -- do you have anything installed to run SpamAssassin or
>> ClamAV before the message is accepted?  In particular, sometimes ClamAV
>> needs to be restarted after it's been running for a long time; it starts
>> taking longer and longer to process messages.  Because qmail is taking
>> so long to respond, spamdyke is giving up and timing out (even though
>> it's not the remote client's fault).
>
> Thanks for the quick reply.  I do use ClamAV, but not SpamAssassin.
> ClamAV has been running for quite a while now so I restarted it.
> Hopefully that is all that it is.
>
> Thanks for the tip!
> -ken
>
>>
>> I've fixed this particular situation in version 4.0.0, so that delays
>> from qmail won't count against the idle timeout.
>>
>> -- Sam Clippinger

The plot thickens.  :)

Since restarting ClamAV my users have complained that they are getting
even more duplicates; sometimes 3 or 4 copies of the same message.

So I was grep'ing through the logs and saw some timeouts coming from
one of my secondary MX servers (also set up with spamdyke, but no clam
or spamassassin).  So I logged in to that secondary and went through
the logs and saw this group of entries:

2008-05-19 18:44:48.055049500 delivery 271993: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:44:54.401082500 delivery 271994: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:09.470684500 delivery 271996: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:29.397628500 delivery 271998: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:29.435189500 delivery 271997: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:29.489066500 delivery 271999: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:30.476119500 delivery 272000: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:31.057934500 delivery 272001: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/
2008-05-19 18:45:39.449765500 delivery 272002: deferral:
Connected_to_xx.yy.zz.149_but_connection_died._Possible_duplicate!_error:1408F10B:SSL_routines:SSL3_GET_RECORD:wrong_version_number_(#4.4.2)/

I looked and it looks like I compiled both server's spamdykes with the
TLS option.  I just recompiled with the '--disable-tls' configure
option and then stopped qmail-smtpd and installed the new binary.  I
did this on my primary and both secondaries.  So, keeping my fingers
crossed that this fixes it.  If you can think of anything else I
should do or try please let me know.

-ken

>>
>> Ken Schweigert wrote:
>>> So this is a problem that has been intermittent for a month or so and
>>> I haven't really been able to find a way to debug it until I found the
>>> option 'full-log-dir' in spamdyke.conf.
>>>
>>> My problem has been my users will occasionally complain that they have
>>> gotten duplicates of the same message.  It happens sporadically and
>>> seemingly without pattern.  If I looked at the headers on both
>>> messages everything is the same exact for the top/last Received: line
>>> which is usually a few minutes different; the message-ids were the
>>> same in both messages.  It was my suspicion that the sending
>>> mailserver started the conversation, sent all of the message, but for
>>> some reason didn't receive the close/end/quit command and never
>>> removed the message for it's queue, however the receiving server had
>>> enough of a message to actually deliver it to my user.
>>>
>>> So I enabled the full-log-dir feature and waited for a user to have a
>>> duplicate.  I finally got one and it looks like the problem is with a
>>> timeout.  Below is the edited contents of file:
>>>
>>> #############################################
>>> [EMAIL PROTECTED] ~]# more 
>>> 20080515_152028_64.18.2.171_exprod7og109.obsmtp.com
>>> 05/15/2008 15:20:28 STARTED: VERSION = 3.1.6+TLS, PID = 23327
>>> 05/15/2008 15:20:28 LEGEND: To remote host = <<< ; to child process =
>>>
>>>>>> ; blocked by filter = <XX
>>>>>>
>>> 05/15/2008 15:20:28 LEGEND: From filter to remote host = <FF ; from
>>> filter to child process = FF>
>>>
>>> <<<  05/15/2008 15:20:28
>>> 220 rsmail.mymailserver.tld ESMTP
>>>
>>>
>>>>>>  05/15/2008 15:20:28
>>>>>>
>>> HELO exprod7og109.obsmtp.com
>>>
>>> <<<  05/15/2008 15:20:28
>>> 250 rsmail.mymailserver.tld
>>>
>>>
>>>>>>  05/15/2008 15:20:28
>>>>>>
>>> MAIL FROM:<[EMAIL PROTECTED]>
>>>
>>> <<<  05/15/2008 15:20:28
>>> 250 ok
>>>
>>>
>>>>>>  05/15/2008 15:20:28
>>>>>>
>>> RCPT TO:<[EMAIL PROTECTED]>
>>>
>>> <<<  05/15/2008 15:20:28
>>> 250 ok
>>>
>>>
>>>>>>  05/15/2008 15:20:28
>>>>>>
>>> RCPT TO:<[EMAIL PROTECTED]>
>>>
>>> <<<  05/15/2008 15:20:28
>>> 250 ok
>>>
>>>
>>>>>>  05/15/2008 15:20:29
>>>>>>
>>> DATA
>>>
>>> <<<  05/15/2008 15:20:29
>>> 354 go ahead
>>>
>>>
>>>>>>  05/15/2008 15:20:29
>>>>>>
>>> Received: from source ([63.90.125.11]) by exprod7ob109.postini.com
>>> ([64.18.6.12]) with SMTP;
>>>         Thu, 15 May 2008 12:20:18 PDT
>>> Received: from mail.shrr.com ([63.90.125.10]) by mail.shrr.com with
>>> Microsoft SMTPSVC(6.0.3790.3959);
>>>          Thu, 15 May 2008 15:17:43 -0400
>>> X-MimeOLE: Produced By Microsoft Exchange V6.5
>>> Content-class: urn:content-classes:message
>>> MIME-Version: 1.0
>>> Content-Type: multipart/alternative;
>>>
>>> ... [ snipped message content ]...
>>>
>>> ------_=_NextPart_001_01C8B6C0.59C6D54D--
>>> .
>>>
>>> FF> 05/15/2008 15:23:30
>>> .
>>> QUIT
>>>
>>> <FF  05/15/2008 15:23:30
>>> 421 Timeout. Talk faster next time.
>>>
>>> <XX  05/15/2008 15:34:10
>>> 250 ok 1210880050 qp 23364
>>> 502 unimplemented (#5.5.1)
>>> 221 rsmail.mymailserver.tld
>>>
>>> 05/15/2008 15:34:10 CLOSED
>>> [EMAIL PROTECTED] ~]#
>>> #############################################
>>>
>>> In a previous thread, a subscriber recommended adjusting the qmail
>>> timeout to be lower than spamdyke's.  Here is what the timeout
>>> settings are on on my server:
>>>
>>> [EMAIL PROTECTED] ~]# more /var/qmail/control/timeoutsmtpd
>>> 120
>>> [EMAIL PROTECTED] ~]# grep -i timeout /etc/spamdyke.conf
>>> # connection-timeout-secs=0 is disabling the feature
>>> connection-timeout-secs=0
>>> idle-timeout-secs=180
>>> [EMAIL PROTECTED] ~]#
>>>
>>> So it looks like the actual message data stopped sending at 05/15/2008
>>> 15:23:30 and got the "talk faster" error but didn't actually close the
>>> connection until 05/15/2008 15:34:10.
>>>
>>> So any ideas what I can try next to stop these duplicates?
>>>
>>> -ken
>>>
>> _______________________________________________
>> spamdyke-users mailing list
>> [email protected]
>> http://www.spamdyke.org/mailman/listinfo/spamdyke-users
>>
>
>
>
> --
> Have a nice day ... unless you've made other plans.
>



-- 
Have a nice day ... unless you've made other plans.
_______________________________________________
spamdyke-users mailing list
[email protected]
http://www.spamdyke.org/mailman/listinfo/spamdyke-users

Reply via email to