There is a bug report at
<https://sourceforge.net/tracker/index.php?func=detail&aid=1656289&group_id=103&atid=100103>.

The basic issue is that there is an extremely large message in the 'in'
queue. In follow-up email, the submitter said "The file was
nearly half a terabyte in size".

Not surprisingly, IncomingRunner threw a MemoryError when SpamDetect
tried to flatten() the message. Runner then tried to shunt the
message, and Switchboard.enqueue() threw a second MemoryError while
trying to pickle the new shunt queue entry.

That second MemoryError is not caught in Runner._oneloop, so it causes
the main loop in Runner.run to exit.

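The failure mode above can be reproduced in miniature. This is only a sketch, not Mailman code: FakeShuntQueue, process(), oneloop() and run() are hypothetical stand-ins for the shunt Switchboard, the handler pipeline, Runner._oneloop and Runner.run, and it is written in current Python syntax rather than the Python 2 of the patch. Unlike the real main loop, run() catches the escaping exception, purely so the outcome can be reported.

```python
class FakeShuntQueue:
    """Hypothetical stand-in for the shunt Switchboard; enqueue always fails."""

    def enqueue(self, msg, msgdata):
        raise MemoryError("cannot pickle huge message")


def process(msg):
    """Hypothetical stand-in for the handlers; fails like flatten() did."""
    raise MemoryError("cannot flatten huge message")


def oneloop(queue, shunt, log):
    """Unprotected loop body: the shunt call sits inside the handler."""
    for msg in queue:
        try:
            process(msg)
        except Exception as e:
            log.append("first failure: %s" % e)
            # An exception raised here is NOT caught by this except
            # clause; it propagates to whoever called oneloop().
            shunt.enqueue(msg, {})


def run(queue, shunt, log):
    """Stand-in for the main loop; reports whether it survived."""
    try:
        oneloop(queue, shunt, log)
    except MemoryError as e:
        log.append("main loop exited: %s" % e)
        return False
    return True


log = []
survived = run(["huge message"], FakeShuntQueue(), log)
```

The point of the sketch is only that an except clause does not protect code that runs inside its own handler.
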
Before Mailman 2.1.9, the master simply restarted IncomingRunner: the
message was lost, but everything else was OK.

Because of the changes in 2.1.9 to prevent message loss in case of
disaster, a .bak file is now left in the 'in' queue. When
IncomingRunner restarts, it recovers the .bak file and the whole
scenario repeats until the master reaches MAX_RESTARTS for
IncomingRunner. We are then left with no IncomingRunner and the .bak
file still in the 'in' queue.

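The restart cycle can be sketched as follows. This is a simplified model under stated assumptions, not the master's actual logic: run_once() and master() are hypothetical names, the MAX_RESTARTS value is illustrative, and the "recovery" step is reduced to noticing that a .bak file exists and failing on it again.

```python
import os
import tempfile

MAX_RESTARTS = 10  # illustrative value for this sketch


def run_once(qdir):
    """One runner lifetime: find leftover .bak entries and fail on them."""
    for name in os.listdir(qdir):
        if name.endswith(".bak"):
            # The recovered entry is as large as before, so processing
            # fails the same way and the .bak file is left behind.
            raise MemoryError("entry too large: %s" % name)


def master(qdir):
    """Restart the runner until it exits cleanly or the limit is hit."""
    restarts = 0
    while restarts < MAX_RESTARTS:
        try:
            run_once(qdir)
            return restarts, True
        except MemoryError:
            restarts += 1
    return restarts, False  # no runner left


qdir = tempfile.mkdtemp()
open(os.path.join(qdir, "entry.bak"), "w").close()
restarts, alive = master(qdir)
leftover = os.listdir(qdir)
```

In this model the runner is never left alive and the .bak file is still present at the end, which matches the state described above.
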
To fix this, I suggest we protect the shunt enqueue with a try/except.
I have worked up a patch for this, which is attached. The patch also
adds a 'preserve' argument to Switchboard.finish: when called with
preserve=True, it renames the .bak file with a .psv extension instead
of removing it. These changes ensure that IncomingRunner doesn't exit
and that no .bak file is left to cause a subsequent problem, while
still preserving the original queue entry for further analysis if
possible.

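As a standalone illustration of the proposed behaviour (a sketch only; the attached patch is the actual proposal), the two pieces can be modelled with plain functions. Here finish() mirrors the patched Switchboard.finish but takes the queue directory as an argument, and shunt_safely(), failing_enqueue() and the filebase "abc123" are hypothetical names for this example. It uses current Python syntax rather than the Python 2 of the patch.

```python
import os
import tempfile


def finish(qdir, filebase, preserve=False):
    """Remove the .bak file, or rename it to .psv when preserve is true."""
    bakfile = os.path.join(qdir, filebase + ".bak")
    try:
        if preserve:
            os.rename(bakfile, os.path.join(qdir, filebase + ".psv"))
        else:
            os.unlink(bakfile)
    except OSError as e:
        print("Failed to unlink/preserve backup file: %s (%s)" % (bakfile, e))


def shunt_safely(shunt_enqueue, qdir, filebase, msg, msgdata):
    """Try to shunt; on any failure, preserve the original entry."""
    try:
        shunt_enqueue(msg, msgdata)
        finish(qdir, filebase)
        return True
    except Exception:
        # Shunting failed; keep the original entry out of the
        # recovery path but available for later analysis.
        finish(qdir, filebase, preserve=True)
        return False


def failing_enqueue(msg, msgdata):
    """Hypothetical enqueue that fails like the oversized message did."""
    raise MemoryError("message too large to pickle")


qdir = tempfile.mkdtemp()
open(os.path.join(qdir, "abc123.bak"), "w").close()
shunted = shunt_safely(failing_enqueue, qdir, "abc123", "msg", {})
remaining = sorted(os.listdir(qdir))
```

After the failed shunt, the directory holds only the .psv file, so a restarted runner has no .bak file to recover.
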
I would like some feedback on whether or not this is the right approach.

--
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

Index: Runner.py
===================================================================
--- Runner.py (revision 8147)
+++ Runner.py (working copy)
@@ -1,4 +1,4 @@
-# Copyright (C) 1998-2006 by the Free Software Foundation, Inc.
+# Copyright (C) 1998-2007 by the Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or
 # modify it under the terms of the GNU General Public License
@@ -122,9 +122,22 @@
                 self._log(e)
                 # Put a marker in the metadata for unshunting
                 msgdata['whichq'] = self._switchboard.whichq()
-                new_filebase = self._shunt.enqueue(msg, msgdata)
-                syslog('error', 'SHUNTING: %s', new_filebase)
-                self._switchboard.finish(filebase)
+                # It is possible that shunting can throw an exception, e.g. a
+                # permissions problem or a MemoryError due to a really large
+                # message. Try to be graceful.
+                try:
+                    new_filebase = self._shunt.enqueue(msg, msgdata)
+                    syslog('error', 'SHUNTING: %s', new_filebase)
+                    self._switchboard.finish(filebase)
+                except Exception, e:
+                    # The message wasn't successfully shunted. Log the
+                    # exception and try to preserve the original queue entry
+                    # for possible analysis.
+                    self._log(e)
+                    syslog('error',
+                           'SHUNTING FAILED, preserving original entry: %s',
+                           filebase)
+                    self._switchboard.finish(filebase, preserve=True)
             # Other work we want to do each time through the loop
             Utils.reap(self._kids, once=True)
             self._doperiodic()
Index: Switchboard.py
===================================================================
--- Switchboard.py (revision 8147)
+++ Switchboard.py (working copy)
@@ -1,4 +1,4 @@
-# Copyright (C) 2001-2006 by the Free Software Foundation, Inc.
+# Copyright (C) 2001-2007 by the Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or
 # modify it under the terms of the GNU General Public License
@@ -164,12 +164,17 @@
             msg = email.message_from_string(msg, Message.Message)
         return msg, data
 
-    def finish(self, filebase):
+    def finish(self, filebase, preserve=False):
         bakfile = os.path.join(self.__whichq, filebase + '.bak')
         try:
-            os.unlink(bakfile)
+            if preserve:
+                psvfile = os.path.join(self.__whichq, filebase + '.psv')
+                os.rename(bakfile, psvfile)
+            else:
+                os.unlink(bakfile)
         except EnvironmentError, e:
-            syslog('error', 'Failed to unlink backup file: %s', bakfile)
+            syslog('error', 'Failed to unlink/preserve backup file: %s',
+                   bakfile)
 
     def files(self, extension='.pck'):
         times = {}