Sam writes:
 > Well, that's still 90% better than what Qmail does.  And, with mailing
 > lists being managed in one place, that goes up to 100%.  There is no
 > concept of a "workgroup" versus "enterprise" server.

Think "university politics", and translate "workgroup" into "department".

 > Eliminate-dups is a solution in search of a problem.  Duplicates due to
 > SMTP window failures are more theoretical than anything else.

Nope.  See RFC 1047 ("Duplicate messages and SMTP").  Also see the
quoted message from this very list over a year ago, and my response to
it.

 > I'm not comfortable with the notion that the way to eliminate duplicates
 > with 100% certainty is, first, to generate a whole bunch of them, and then
 > to eliminate them on the delivery end.  Seems to be a bit wasteful to me.

You're trying to argue against the end-to-end principle.  You're
wasting your time.

From: Russell Nelson <[EMAIL PROTECTED]>
To: Qmail <[EMAIL PROTECTED]>
Subject: Re: close() bug in qmail-remote.
Date: Sun, 22 Mar 98 21:32:08 EST

Sam writes:
 > There's a minor bug in qmail-remote.  After the receiving server
 > acknowledges a successful DATA transaction, if there's a TCP/IP problem
 > that prevents a successful QUIT and then a close, qmail believes that the
 > message has not been sent, and it will try again.
 > 
 > This often shows up here when I'm sending mail internationally, over flaky
 > links.  Quite often the acknowledgement to my CLOSE packet gets lost
 > after a QUIT, and the socket remains in a CLOSE_WAIT state for an hour
 > or so.  Then qmail-remote aborts and tries to redeliver the same message.

ARRGGGGHHHHH!!!  That's the same SMTP protocol bug that I spoke about
in the message quoted below.  There is NO WAY to fix it on the
sender's side.  It can only be fixed on the recipient's side.  I use
the less paranoid version and as far as I can tell, have never lost
any real mail due to it.  I've lost repeated test emails of the form:
(echo test|mailsubj test nelson), but never any real mail.  Test mail
I can look up in the log file -- yes, their deletion is logged.

Code is at <http://www.qmail.org/eliminate-dups>.  I highly recommend
its use.

Date: Wed, 16 Jul 97 17:36:39 EDT
From: Russell Nelson <[EMAIL PROTECTED]>
To: Andi Gutmans <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]
Subject: SMTP protocol flaw
In-Reply-To: <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>
        <Pine.SOL.3.95.970716142310.22894H-100000@big>
        <[EMAIL PROTECTED]>

Andi Gutmans writes:
 > Well as good as qmail is (and I really like it) the thing which bothers me
 > most about qmail is if I mail blah@host and help@host then a person who
 > is on both of these lists gets the message twice. This just doesn't happen
 > with sendmail's multiple recipients delivery.

Please don't praise sendmail's multiple recipients delivery.  It's a
gross hack that serves only to disguise the problem.  It's like
painting rotten wood.  Sendmail doesn't deal with:
  o The same message posted to two lists expanded by different machines.
  o Replies sent to the list and the author, such as this one.
  o Messages duplicated by a flaw in the SMTP protocol, explained below.

The SMTP protocol has a problem in it.  Any reliable protocol has some
kind of serial number to prevent retransmissions from becoming
duplications if the final ack is lost.  "Duplications", eh?  Sounds
familiar?  Well, RFC821 is silent on the issue.  And there's a finite
chance for any piece of email to be duplicated.  The "ack" is the 250
Ok response to terminating the DATA portion of the mail with
crlf.crlf.  If a sender terminates the mail but doesn't receive the
ack, it has no choice but to retransmit the mail.
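
The lost-ack case can be shown with a toy model.  This is only a
sketch, not real SMTP code: Receiver, deliver and the
ack_lost_on_attempts parameter are hypothetical names invented for
the illustration, and the network failure is simulated rather than
observed.

```python
class Receiver:
    """Toy receiving MTA: every accepted message lands in the mailbox."""

    def __init__(self):
        self.mailbox = []

    def accept(self, message):
        # The message is committed before the reply travels back.
        self.mailbox.append(message)
        return "250 Ok"


def deliver(receiver, message, ack_lost_on_attempts=()):
    """Toy sender: retry until a 250 reply is actually seen.

    ack_lost_on_attempts lists the attempt numbers whose reply is
    (hypothetically) lost in transit.  Returns the number of attempts.
    """
    attempt = 0
    while True:
        attempt += 1
        reply = receiver.accept(message)
        if attempt in ack_lost_on_attempts:
            # The sender never saw the reply, so it cannot tell this
            # apart from a failed delivery and has to send again.
            continue
        if reply.startswith("250"):
            return attempt
```

With the first ack lost, deliver(receiver, "hello", {1}) makes two
attempts and the mailbox ends up holding two copies, even though
neither side misbehaved.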

The solution is unfortunately complex.  Because mail can arrive out of
order, and can be removed from the mailbox before the duplicate
arrives, the MTA needs to keep track of and delete duplicate messages.
The MTA needs to keep a small database of messages that have been
received recently.

Dan has refused to provide a fix for this problem, saying that
deleting duplicates is the job of the MUA.  I disagree, because the
protocol failure is at the MTA level.

How might such a protocol fix be implemented?  The simplest solution I
can see is to keep two files of message hashes, in addition to the
mailbox.  If the incoming message's hash appears in either file,
delete the message.  Otherwise add the hash to the newer file and
deliver the mail.  Periodically, when the older file is "too old",
replace it with the newer file and start a fresh, empty newer file.
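
A minimal sketch of the two-file scheme, under stated assumptions.
This is not the eliminate-dups code: the class name DupFilter, the
file names, the one-week default age and the use of SHA-256 over the
raw message bytes are all choices made for the illustration.  There
is no locking, which a real delivery agent handling concurrent
deliveries would need.

```python
import hashlib
import os
import time


class DupFilter:
    """Keeps hashes of recently seen messages in two files (hypothetical)."""

    def __init__(self, directory, max_age=7 * 24 * 3600):
        self.new_path = os.path.join(directory, "hashes.new")
        self.old_path = os.path.join(directory, "hashes.old")
        self.max_age = max_age  # seconds before the older file is dropped
        for path in (self.new_path, self.old_path):
            if not os.path.exists(path):
                open(path, "a").close()

    @staticmethod
    def _read(path):
        with open(path) as f:
            return set(f.read().split())

    def _rotate_if_due(self, now):
        # The older file's mtime records when the last rotation happened.
        if now - os.path.getmtime(self.old_path) < self.max_age:
            return
        os.replace(self.new_path, self.old_path)  # newer becomes older
        open(self.new_path, "a").close()          # start an empty newer file
        os.utime(self.old_path, (now, now))

    def is_duplicate(self, message, now=None):
        """Return True if the message should be deleted, else record it."""
        now = time.time() if now is None else now
        self._rotate_if_due(now)
        digest = hashlib.sha256(message).hexdigest()
        if digest in self._read(self.new_path) or digest in self._read(self.old_path):
            return True
        with open(self.new_path, "a") as f:
            f.write(digest + "\n")
        return False
```

With this rotation a hash is remembered for at least max_age and at
most twice that, so neither file grows without bound.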

The message hashes could be constructed two ways.  One, by hashing the
whole message while ignoring only the most recently generated
Received: lines.  Or two, by considering only the Message-ID: line and
the body of the message.  The first is more paranoid, and deletes only
duplicates caused by actual SMTP protocol failures.  The second is
less paranoid and also deletes the other two types of duplicates
discussed above.  There are reasons to select either type.
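
The two constructions might look roughly like this.  It is a sketch
only: the function names, the skip_received parameter (how many of
the topmost Received: fields count as "most recently generated") and
the choice of SHA-256 are assumptions made for the illustration, not
taken from the eliminate-dups code.

```python
import hashlib


def _split_message(raw):
    """Split raw message bytes into a list of header fields and the body."""
    normalized = raw.replace(b"\r\n", b"\n")
    head, _, body = normalized.partition(b"\n\n")
    fields = []
    for line in head.split(b"\n"):
        if line[:1] in (b" ", b"\t") and fields:
            fields[-1] += b"\n" + line  # folded continuation line
        else:
            fields.append(line)
    return fields, body


def _name(field):
    return field.split(b":", 1)[0].strip().lower()


def paranoid_hash(raw, skip_received=1):
    """Hash everything except the topmost skip_received Received: fields."""
    fields, body = _split_message(raw)
    kept, skipped = [], 0
    for field in fields:
        if skipped < skip_received and _name(field) == b"received":
            skipped += 1
            continue
        kept.append(field)
    return hashlib.sha256(b"\n".join(kept) + b"\n\n" + body).hexdigest()


def relaxed_hash(raw):
    """Hash only the Message-ID: value and the body."""
    fields, body = _split_message(raw)
    message_id = b""
    for field in fields:
        if _name(field) == b"message-id":
            message_id = field.split(b":", 1)[1].strip()
            break
    return hashlib.sha256(message_id + b"\n\n" + body).hexdigest()
```

Two copies that differ only in the Received: line added by the final
hop get the same paranoid_hash.  Copies that reached the mailbox by
different routes, or with a list-munged Subject:, match only under
relaxed_hash.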

-- 
-russ nelson <[EMAIL PROTECTED]>  http://russnelson.com
Crynwr sells support for free software  | PGPok | Government schools are so
521 Pleasant Valley Rd. | +1 315 268 1925 voice | bad that any rank amateur
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   | can outdo them. Homeschool!
