[Mailman-Users] failing qrunner

2007-09-15 Thread Jaco Kroon
Hi guys,

We've got a problem with a half-completed delivery run. Somehow an
address with a ? at the end of the domain managed to get into the list
addresses, i.e., something like [EMAIL PROTECTED] instead of just
[EMAIL PROTECTED]. Exim drops the connection when it sees this
address, which means that none of the recipients in that run receive
the message.

Firstly, mailman should not have accepted that address, but this may
have been fixed (this is a rather old version, no, I can't upgrade it,
nor am I allowed to fix the exim config ... don't even bother asking).

What I want to know is how mailman handles the message delivery runs.
Afaik each message that needs to go out is stored in some location,
along with a list of recipients, so periodically mailman checks which
messages need to go out, and to which recipients, and it then tries to
make those deliveries, removing the recipients that it successfully
delivers.  Is there a manual way to remove the problem-causing email
addy from this list for the particular message?  We've already removed
it from the main list so it won't cause issues in future but it's now
holding up the delivery of an already sent message.

Jaco

--
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: 
http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp


Re: [Mailman-Users] failing qrunner

2007-09-15 Thread Mark Sapiro
Jaco Kroon wrote:

What I want to know is how mailman handles the message delivery runs.
Afaik each message that needs to go out is stored in some location,
along with a list of recipients, so periodically mailman checks which
messages need to go out, and to which recipients, and it then tries to
make those deliveries, removing the recipients that it successfully
delivers.


That is correct.

Assuming this is at least Mailman 2.1.x, the messages to be sent are
placed in Mailman's 'out' queue (normally Mailman's qfiles/out/
directory) and picked up and delivered by OutgoingRunner. If the MTA
returns a non-retryable failure for one or more recipients, that is
logged in Mailman's smtp-failure log and treated as a bounce for the
failed recipients.

If the MTA returns a retryable failure for one or more recipients, that
is also logged in Mailman's smtp-failure log and the message is queued
in the 'retry' queue for delivery to the failed recipients. Every 15
minutes, RetryRunner moves the message from the retry queue back to
the out queue.

This continues for DELIVERY_RETRY_PERIOD (default 5 days), after which
Mailman gives up on this message.
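The retry rules described above can be sketched roughly as follows. This is an illustration only, not Mailman's code: the function names and the exact 5xx/other split are assumptions, and only the 15 minute interval and the 5 day DELIVERY_RETRY_PERIOD default come from this message.

```python
from datetime import datetime, timedelta

# Defaults quoted in this thread; a real site may override them.
DELIVERY_RETRY_PERIOD = timedelta(days=5)
RETRY_INTERVAL = timedelta(minutes=15)


def split_failures(refused):
    """Split {address: (smtp_code, text)} into permanent and retryable.

    Assumption: 5xx replies are non-retryable (treated as bounces) and
    everything else is retryable.
    """
    permanent, retryable = {}, {}
    for addr, (code, text) in refused.items():
        target = permanent if 500 <= code <= 599 else retryable
        target[addr] = (code, text)
    return permanent, retryable


def next_attempt(last_attempt):
    """Time at which the message would go back to the out queue."""
    return last_attempt + RETRY_INTERVAL


def should_keep_retrying(first_attempt, now):
    """True while the message is still inside DELIVERY_RETRY_PERIOD."""
    return now - first_attempt < DELIVERY_RETRY_PERIOD
```

Only the retryable addresses would stay on the message's recipient list for the next pass; the permanent ones are handled as bounces.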


Is there a manual way to remove the problem-causing email
addy from this list for the particular message?  We've already removed
it from the main list so it won't cause issues in future but it's now
holding up the delivery of an already sent message.

First find the entry (a long, mostly numeric, name ending in .pck) in
qfiles/retry, and move that file aside. Then use Mailman's bin/dumpdb
to dump the file. This will output the raw message and the message
metadata. The metadata contains a 'recips' list, which holds the
addresses still awaiting delivery.
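For readers who want to inspect the entry from Python rather than with bin/dumpdb, here is a minimal sketch. It assumes the .pck file holds two pickles written back to back (the message first, then the metadata dictionary); the function names are made up for this example.

```python
import pickle


def read_queue_entry(path):
    """Return (message, metadata) from a queue file.

    Assumption: the file holds two consecutive pickles, the message
    first and the metadata dictionary second.
    """
    with open(path, "rb") as fp:
        msg = pickle.load(fp)
        metadata = pickle.load(fp)
    return msg, metadata


def remaining_recipients(path):
    """List the addresses still waiting for delivery."""
    _msg, metadata = read_queue_entry(path)
    return list(metadata.get("recips", []))
```

If the message was pickled as a Mailman message object rather than as plain text, unpickling it may need Mailman's own modules on the import path, so run this with the Python interpreter that Mailman uses.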

If you are proficient in Python, you could write a short script to
unpickle the message and metadata from the file, remove the bad
recipient from recips and repickle the message and metadata. Then you
could put the file in qfiles/out for delivery. (I'm currently
debugging one I just wrote - I'll post a link soon).
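The unpickle, edit and repickle steps might look something like the sketch below. This is not the script referred to in this thread; it is an independent illustration that assumes the two-pickle file layout described above, and the function name and the temporary-file naming are hypothetical.

```python
import os
import pickle


def remove_recipient(src_path, dst_path, bad_address):
    """Copy a queue entry to dst_path without bad_address in 'recips'.

    Returns the number of recipients removed.
    """
    with open(src_path, "rb") as fp:
        msg = pickle.load(fp)
        metadata = pickle.load(fp)

    recips = metadata.get("recips", [])
    kept = [r for r in recips if r.lower() != bad_address.lower()]
    metadata["recips"] = kept

    # Write under a temporary name first so nothing picks up a
    # half-written file, then rename it into place.
    tmp_path = dst_path + ".tmp"
    with open(tmp_path, "wb") as fp:
        pickle.dump(msg, fp, 2)
        pickle.dump(metadata, fp, 2)
    os.rename(tmp_path, dst_path)
    return len(recips) - len(kept)
```

The source file would be the entry that was moved aside, and the destination a file in qfiles/out. Mailman 2.1 runs under Python 2, so in practice the same steps should be run with that interpreter so that the rewritten pickles match what Mailman expects.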

Alternatively, you could just remail the message outside of mailman to
the remaining recipients.
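Remailing outside of Mailman could be done with Python's smtplib, for example. The sketch below is one possible approach under stated assumptions: the function name, the localhost:25 defaults and the `smtp_factory` parameter (there only so the code can be exercised without a real server) are all hypothetical.

```python
import smtplib


def remail(raw_message, envelope_sender, recips, host="localhost", port=25,
           smtp_factory=smtplib.SMTP):
    """Send raw_message to each recipient in its own SMTP transaction.

    Returns {address: exception} for recipients that could not be sent to.
    """
    failed = {}
    for rcpt in recips:
        try:
            conn = smtp_factory(host, port)
            try:
                conn.sendmail(envelope_sender, [rcpt], raw_message)
                conn.quit()
            finally:
                # Safe to call even after quit() or a dropped connection.
                conn.close()
        except (smtplib.SMTPException, OSError) as exc:
            failed[rcpt] = exc
    return failed
```

Using one connection per recipient is slow, but it means an address that makes the MTA drop the connection only affects itself.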

-- 
Mark Sapiro [EMAIL PROTECTED]   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] failing qrunner

2007-09-15 Thread Mark Sapiro
Mark Sapiro wrote:

If you are proficient in Python, you could write a short script to
unpickle the message and metadata from the file, remove the bad
recipient from recips and repickle the message and metadata. Then you
could put the file in qfiles/out for delivery. (I'm currently
debugging one I just wrote - I'll post a link soon).


The minimally tested script is at
http://veenet.value.net/~msapiro/scripts/remove_recips and mirrored
at http://fog.ccsf.edu/~msapiro/scripts/remove_recips.

-- 
Mark Sapiro [EMAIL PROTECTED]   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] failing qrunner

2007-09-15 Thread Mark Sapiro
Jaco Kroon wrote:

Mark Sapiro wrote:
 Jaco Kroon wrote:

Ok.  That covers the 4xx and 5xx responses to rcpt to:, what happens if
the MTA simply closes the connection?  What I gathered the smtp
conversation had to look like was something like:

S: 220 servername ESMTP Exim 
C: helo servername
S: 250 servername Hello localhost [127.0.0.1]
C: mail from: [EMAIL PROTECTED]
S: 250 OK
C: rcpt to: [EMAIL PROTECTED]
S: 250 OK
C: rcpt to: [EMAIL PROTECTED]
S: --- force close connection ---



It will be logged in the 'smtp-failure' log as a 'Low level smtp error'
and in the 'post' log with the number refused. It shouldn't be retried.

What's in Mailman's 'smtp', 'smtp-failure' and 'post' logs?


Now, the problem here is that you don't really know whether it's a 5xx
or a 4xx error code, and it actually looks like the entire run for that
message gets interrupted and put to sleep in its entirety.  This may
have been a bug that got fixed at some point (I don't even know which
exact version of mailman I'm working with, but it's at the latest
something released around Feb 2007).

So at this point it simply wouldn't continue any further, and
smtp-failures actually logs the address after the faulty one as the one
causing a problem.


It depends on what exception is returned by Python's smtplib. If Exim
really just closes the connection, it will be logged in 'post' with a
number of failures, in 'smtp-failure' as a 'Low level' error, and in
'smtp'. In addition, each attempted recipient from that transaction
will be logged in 'smtp-failure' as 'code -1: error'. That means all
the recipients that were going to be delivered in the transaction, up
to SMTP_MAX_RCPTS (default 500), not just the ones whose rcpt to was
not sent. Then the message will be put in the retry queue with the
same recips list minus any that were successfully delivered in a prior
smtp transaction.
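To illustrate why one dropped connection affects every recipient in the transaction, here is a rough sketch of the pattern described above. It is not Mailman's source; the function name is hypothetical and the exact set of exceptions caught is an assumption. The smtplib exception classes and the return value of `sendmail()` are standard library behaviour.

```python
import smtplib
import socket


def deliver_chunk(conn, envelope_sender, recips, raw_message):
    """Try one SMTP transaction; return {address: (code, text)} failures."""
    refused = {}
    try:
        # sendmail() returns the per-recipient refusals when at least
        # one recipient was accepted.
        refused = conn.sendmail(envelope_sender, recips, raw_message)
    except smtplib.SMTPRecipientsRefused as exc:
        # Every recipient was rejected with a real SMTP reply code.
        refused = exc.recipients
    except (socket.error, smtplib.SMTPException, IOError) as exc:
        # Low level error, e.g. the server closed the connection. There
        # is no reply code to go on, so every recipient in this
        # transaction is marked with -1, including those whose RCPT TO
        # was never sent.
        for rcpt in recips:
            refused[rcpt] = (-1, str(exc))
    return refused
```

Because the client cannot tell which address triggered the disconnect, the whole chunk comes back as failed with code -1.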

What is in the Mailman logs?


 This continues for DELIVERY_RETRY_PERIOD (default 5 days), after which
 Mailman gives up on this message.
 
 
 Is there a manual way to remove the problem-causing email
 addy from this list for the particular message?  We've already removed
 it from the main list so it won't cause issues in future but it's now
 holding up the delivery of an already sent message.
 
 First find the entry (a long, mostly numeric, name ending in .pck) in
 qfiles/retry, and move that file aside. Then use Mailman's bin/dumpdb
 to dump the file. This will output the raw message and the message
 metadata. The metadata contains a 'recips' list, which holds the
 addresses still awaiting delivery.

I saw the dumpdb program, had no idea what it does though.  Now I do,
and it'll make my life a lot easier next time.  Any way to repack the file?

 If you are proficient in Python, you could write a short script to
 unpickle the message and metadata from the file, remove the bad
 recipient from recips and repickle the message and metadata. Then you
 could put the file in qfiles/out for delivery. (I'm currently
 debugging one I just wrote - I'll post a link soon).

... or issue mailmanctl stop, use vim on the file, find the invalid
address and without changing the size of the file change the address to
an RFC legal address that is bogus, ie, [EMAIL PROTECTED] can be changed
to [EMAIL PROTECTED] which causes the pickle to not break, and will
cause exim to not close the connection ... instead it will bounce back
to mailman, harmlessly since this server isn't using VERP.


Yes, you could do that, but as I posted later in this thread, there is
now a script to just delete the bad address at
http://veenet.value.net/~msapiro/scripts/remove_recips and mirrored
at http://fog.ccsf.edu/~msapiro/scripts/remove_recips.

And, if you really wanted to use vim, there's no need to stop Mailman.
Just move the file out of the queue directory, edit it, dump it with
bin/dumpdb to verify it can still be unpickled and move it back.
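The same-size constraint and the "can it still be unpickled" check can be sketched in Python as well. The reason the size matters is that binary pickles store strings with a length prefix, so changing the length of an address by hand leaves the prefix wrong. The helper names below are hypothetical, and the two-pickle layout is an assumption about the queue file.

```python
import io
import pickle


def patch_same_length(data, old, new):
    """Replace old with new in raw pickle bytes, refusing size changes."""
    if len(old) != len(new):
        raise ValueError("replacement must be exactly as long as the original")
    if old not in data:
        raise ValueError("original bytes not found")
    return data.replace(old, new)


def can_unpickle_twice(data):
    """True if data still holds two loadable pickles back to back."""
    fp = io.BytesIO(data)
    try:
        pickle.load(fp)
        pickle.load(fp)
    except Exception:
        return False
    return True
```

A successful load here is only a sanity check; dumping the file with bin/dumpdb, as described above, is still the way to confirm that the recipient list looks right.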

-- 
Mark Sapiro [EMAIL PROTECTED]   The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan



Re: [Mailman-Users] failing qrunner

2007-09-15 Thread Brad Knowles
On 9/15/07, Jaco Kroon wrote:

  So at this point it simply wouldn't continue any further, and
  smtp-failures actually logs the address after the faulty one as the one
  causing a problem.

To avoid this problem in the future, try enabling personalization on 
the list, and using VERP.  Then Mailman will make separate delivery 
attempts for each user, and only the invalid one would fail in the 
manner you described.  The rest should go through normally.


This would be a bigger performance hit on the server, but would help 
make your day-to-day operations more robust.
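On a Mailman 2.1 installation, the site-wide side of this suggestion would typically go in mm_cfg.py. The fragment below is a sketch: the setting names are the ones I believe Mailman 2.1's Defaults.py uses, and the values are assumptions, so check them against the Defaults.py shipped with your version before relying on them.

```python
# mm_cfg.py (site configuration): a sketch with assumed values.

# Allow list owners to turn on personalization for their lists.
OWNERS_CAN_ENABLE_PERSONALIZATION = True

# Use VERP for personalized deliveries.
VERP_PERSONALIZED_DELIVERIES = True

# VERP every delivery (assumed meaning: 0 = never, 1 = every delivery).
VERP_DELIVERY_INTERVAL = 1
```

Personalization itself is then switched on per list in the list's admin pages, and Mailman has to be restarted for mm_cfg.py changes to take effect.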

This is especially important since you've said you can't upgrade any 
of the software, and we know that more recent versions of Mailman 
have significantly improved their ability to handle failures of 
various different types and continue trying to deliver everything 
else.

-- 
Brad Knowles [EMAIL PROTECTED]
LinkedIn Profile: http://tinyurl.com/y8kpxu