Re: [Mailman-Users] Kernel update breaks Mailman!!
On 02/20/2014 02:04 PM, Lindsay Haisley wrote:

> Here's a sampling of the qrunner log from the wee hours, before I started poking at the problem to try to fix it:
>
> Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2445) BounceRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2445) BounceRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2446) CommandRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2446) CommandRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2451) RetryRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2443) Master watcher caught SIGINT. Restarting.
> Feb 20 03:22:02 2014 (2444) ArchRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2444) ArchRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2448) NewsRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2450) VirginRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2449) OutgoingRunner qrunner caught SIGINT. Stopping.
> Feb 20 03:22:02 2014 (2451) RetryRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2448) NewsRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2450) VirginRunner qrunner exiting.
> Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2445, sig: None, sts: 2, class: BounceRunner, slice: 1/1) [restarting]
> Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2446, sig: None, sts: 2, class: CommandRunner, slice: 1/1) [restarting]
> Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2451, sig: None, sts: 2, class: RetryRunner, slice: 1/1) [restarting]
> Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2448, sig: None, sts: 2, class: NewsRunner, slice: 1/1) [restarting]
> Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2444, sig: None, sts: 2, class: ArchRunner, slice: 1/1) [restarting]
> Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2447, sig: None, sts: 2, class: IncomingRunner, slice: 1/1) [restarting]

OK, from what you report here and elsewhere, it appears the issue was with OutgoingRunner not processing Mailman's 'out' queue.

If the above log excerpt (which appears to be from a mailmanctl restart) is complete, you will note that there are three entries for most runners, e.g.

    Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner caught SIGINT. Stopping.
    Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner exiting.
    Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2447, sig: None, sts: 2, class: IncomingRunner, slice: 1/1) [restarting]

But there is only one for OutgoingRunner:

    Feb 20 03:22:02 2014 (2449) OutgoingRunner qrunner caught SIGINT. Stopping.

suggesting that it was hung and never terminated.

Had it been me at that point, I would have stopped Mailman and made sure it was completely stopped per the FAQ at http://wiki.list.org/x/_4A9, and then started it to see if that fixed the problem. If the out queue were still not being processed, I would try to trace the OutgoingRunner process to see where it was hung.

--
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

--
Mailman-Users mailing list Mailman-Users@python.org
https://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: https://mail.python.org/mailman/options/mailman-users/archive%40jab.org
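
The check Mark describes in the message above (a healthy runner logs 'caught SIGINT' and then 'exiting') can be done mechanically on a larger log. What follows is only a minimal Python 3 sketch, assuming the line layout seen in the excerpt; the function name is made up for illustration and is not part of Mailman.

```python
import re

# Matches runner lines such as:
#   Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner caught SIGINT. Stopping.
#   Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner exiting.
# "Master ..." lines do not match because the name must end in "Runner".
RUNNER_LINE = re.compile(r"\((\d+)\) (\w+Runner) qrunner (caught SIGINT|exiting)")


def runners_that_never_exited(log_text):
    """Return {pid: runner_name} for runners that logged 'caught SIGINT'
    but never logged 'exiting' (hypothetical helper, not a Mailman API)."""
    caught = {}
    exited = set()
    for line in log_text.splitlines():
        match = RUNNER_LINE.search(line)
        if match is None:
            continue
        pid, name, event = match.groups()
        if event == "caught SIGINT":
            caught[pid] = name
        else:
            exited.add(pid)
    return {pid: name for pid, name in caught.items() if pid not in exited}


# Four lines taken from the excerpt in this thread.
sample = "\n".join([
    "Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner caught SIGINT. Stopping.",
    "Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner exiting.",
    "Feb 20 03:22:02 2014 (2443) Master watcher caught SIGINT. Restarting.",
    "Feb 20 03:22:02 2014 (2449) OutgoingRunner qrunner caught SIGINT. Stopping.",
])
print(runners_that_never_exited(sample))  # {'2449': 'OutgoingRunner'}
```

A pid reported here is only a candidate for a hung process; it still has to be confirmed with ps before tracing it.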
Re: [Mailman-Users] Kernel update breaks Mailman!!
On Fri, 2014-02-21 at 05:56 -0800, Mark Sapiro wrote:

> OK, from what you report here and elsewhere, it appears the issue was with OutgoingRunner not processing Mailman's 'out' queue. If the above log excerpt (appears to be from a mailmanctl restart) is complete, you will note that there are three entries for most runners, e.g.

Well, I can't reproduce the Mailman list problem today. I (re)updated the kernel version to 3.2.0-59-generic x86_64 and everything seems to be working OK. The only obvious change is the aforementioned issue with the bind9 name server daemon, which now requires an explicit IPv4 network interface ACL instead of defaulting to listening on ALL interfaces when the IPv4 ACL is absent from the config. I didn't update bind, so this is kernel dependent. It's possible that some features of v4 and v6 address handling have been merged (there's an IPv6 interface ACL in the bind9 config).

It's also possible that attempts at name resolution hung up Mailman's qrunner.

If it won't break, I can't fix it :-S

Thanks for your time and attention.

--
Lindsay Haisley       | Everything works if you let it
FMP Computer Services |
512-259-1190          |          --- The Roadie
http://www.fmp.com    |
Re: [Mailman-Users] Kernel update breaks Mailman!!
On 02/21/2014 01:03 PM, Lindsay Haisley wrote:

> It's also possible that attempts at name resolution hung up Mailman's qrunner.

That's a possibility, but the only name OutgoingRunner looks up is the value of SMTPHOST, which in a more or less default Mailman is 'localhost', and that should be resolved through /etc/hosts.

It is also possible that OutgoingRunner got wedged for some transient reason unrelated to the kernel upgrade.

> If it won't break, I can't fix it :-S

True ...

--
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
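
One way to test the name-resolution theory from the message above is to time the lookup of the SMTPHOST value directly. This is a rough Python 3 sketch, not anything Mailman ships: `time_lookup` is a hypothetical helper, and port 25 is an assumption about the local MTA.

```python
import socket
import time


def time_lookup(host, port=25):
    """Resolve `host` with getaddrinfo and time the call.

    Returns (elapsed_seconds, sorted list of distinct addresses).
    Hypothetical helper for diagnosis only; port 25 is an assumption.
    """
    start = time.monotonic()
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    elapsed = time.monotonic() - start
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses
```

Calling `time_lookup('localhost')` on the list server should return almost instantly if the name comes from /etc/hosts. A delay of several seconds, or an exception, would point at the resolver rather than at Mailman.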
[Mailman-Users] Kernel update breaks Mailman!!
I'm running Mailman 2.1.15 on an Ubuntu server, feeding into Courier MTA, running Python 2.7.3. I track security updates and install them promptly when they're issued by Ubuntu.

Yesterday I updated the Linux kernel from 3.2.0-58-generic (x86_64) to 3.2.0-59-generic and Mailman quit working. List posts made it through to the archives, and were apparently queued within Mailman, but wouldn't go out. The mail server was working OK for non-list email.

Today I backed out the kernel update and posts to lists sent yesterday and today are going out without problems. I can find nothing in the Mailman logs, or in the mail server logs, indicating a problem. This is the first time I've ever had a problem with a server kernel update breaking something related to mail.

Has anyone else seen this problem? Does anyone have any insight into how to address it?

--
Lindsay Haisley       | Everything works if you let it
FMP Computer Services |
512-259-1190          |          --- The Roadie
http://www.fmp.com    |
Re: [Mailman-Users] Kernel update breaks Mailman!!
On 02/20/2014 10:07 AM, Lindsay Haisley wrote:

> I'm running Mailman 2.1.15 on an Ubuntu server, feeding into Courier MTA, running Python 2.7.3. I track security updates and install them promptly when they're issued by Ubuntu.
>
> Yesterday I updated the Linux kernel from 3.2.0-58-generic (x86_64) to 3.2.0-59-generic and Mailman quit working. List posts made it through to the archives, and were apparently queued within Mailman, but wouldn't go out. The mail server was working OK for non-list email.
>
> Today I backed out the kernel update and posts to lists sent yesterday and today are going out without problems.

What's in Mailman's 'post' and 'smtp' logs for these messages? Are they timestamped before or after you backed out the update? If before, they were queued in the MTA. If after, they were in Mailman's 'out' queue.

If the latter, what's in Mailman's 'qrunner' log related to OutgoingRunner?

--
Mark Sapiro m...@msapiro.net        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
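
The before/after question in the message above can be answered by splitting log entries around the moment the kernel update was backed out. This is a small Python 3 sketch which assumes the 'smtp' and 'post' logs use the same timestamp prefix as the qrunner excerpts in this thread. The cutoff time and the sample entries are invented for illustration, and `%b` parsing assumes an English or C locale.

```python
from datetime import datetime

# Timestamp prefix as seen in the qrunner log excerpts in this thread,
# e.g. "Feb 20 03:22:02 2014 (2447) ...".
STAMP_FORMAT = "%b %d %H:%M:%S %Y"
STAMP_LENGTH = len("Feb 20 03:22:02 2014")


def split_by_cutoff(lines, cutoff):
    """Split log lines into (before, after) relative to `cutoff`.

    Lines that do not start with a parseable timestamp are skipped.
    Hypothetical helper, not part of Mailman.
    """
    before, after = [], []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
        except ValueError:
            continue
        (before if stamp < cutoff else after).append(line)
    return before, after


# Invented entries and an invented cutoff, purely to show the call.
entries = [
    "Feb 19 23:10:00 2014 (1234) hypothetical entry A",
    "Feb 20 15:45:30 2014 (1234) hypothetical entry B",
]
before, after = split_by_cutoff(entries, datetime(2014, 2, 20, 12, 0, 0))
print(len(before), "before the rollback,", len(after), "after")
```

By Mark's reasoning, 'smtp' entries stamped before the cutoff mean the messages had already been handed to the MTA, while entries stamped after it mean they had been sitting in Mailman's 'out' queue.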
Re: [Mailman-Users] Kernel update breaks Mailman!!
On Thu, 2014-02-20 at 10:37 -0800, Mark Sapiro wrote:

> On 02/20/2014 10:07 AM, Lindsay Haisley wrote:
>
> > I'm running Mailman 2.1.15 on an Ubuntu server, feeding into Courier MTA, running Python 2.7.3. I track security updates and install them promptly when they're issued by Ubuntu.
> >
> > Yesterday I updated the Linux kernel from 3.2.0-58-generic (x86_64) to 3.2.0-59-generic and Mailman quit working. List posts made it through to the archives, and were apparently queued within Mailman, but wouldn't go out. The mail server was working OK for non-list email.
> >
> > Today I backed out the kernel update and posts to lists sent yesterday and today are going out without problems.
>
> What's in Mailman's 'post' and 'smtp' logs for these messages? Are they timestamped before or after you backed out the update? If before, they were queued in the MTA. If after, they were in Mailman's 'out' queue.

They weren't in the MTA's queue. Looking at the count of messages in the MTA queue was how I determined that list posts weren't being delivered to the MTA by Mailman. I restarted qrunner and it didn't make any difference.

The mail queue had like 67 messages in it. This would go up to 68 or 69 at times and then fall back down again - normal behavior. I could send and receive mail. All indications are that the MTA was working normally. Mailman lists run from several hundred to a couple of thousand subscribers, and if someone posts to a list the MTA mail queue shoots up to hundreds of messages with VERP sender addresses shown in the queue summary, and then works its way back down.

> If the latter, what's in Mailman's 'qrunner' log related to OutgoingRunner.

Here's a sampling of the qrunner log from the wee hours, before I started poking at the problem to try to fix it:

Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2447) IncomingRunner qrunner exiting.
Feb 20 03:22:02 2014 (2445) BounceRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2445) BounceRunner qrunner exiting.
Feb 20 03:22:02 2014 (2446) CommandRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2446) CommandRunner qrunner exiting.
Feb 20 03:22:02 2014 (2451) RetryRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2443) Master watcher caught SIGINT. Restarting.
Feb 20 03:22:02 2014 (2444) ArchRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2444) ArchRunner qrunner exiting.
Feb 20 03:22:02 2014 (2448) NewsRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2450) VirginRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2449) OutgoingRunner qrunner caught SIGINT. Stopping.
Feb 20 03:22:02 2014 (2451) RetryRunner qrunner exiting.
Feb 20 03:22:02 2014 (2448) NewsRunner qrunner exiting.
Feb 20 03:22:02 2014 (2450) VirginRunner qrunner exiting.
Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2445, sig: None, sts: 2, class: BounceRunner, slice: 1/1) [restarting]
Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2446, sig: None, sts: 2, class: CommandRunner, slice: 1/1) [restarting]
Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2451, sig: None, sts: 2, class: RetryRunner, slice: 1/1) [restarting]
Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2448, sig: None, sts: 2, class: NewsRunner, slice: 1/1) [restarting]
Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2444, sig: None, sts: 2, class: ArchRunner, slice: 1/1) [restarting]
Feb 20 03:22:02 2014 (2443) Master qrunner detected subprocess exit (pid: 2447, sig: None, sts: 2, class: IncomingRunner, slice: 1/1) [restarting]

FWIW, another very strange thing happened after the kernel upgrade, totally unrelated to mail. I run bind9 on the same server, and it provides recursive DNS for all our in-house boxes coming from our LAN through our VPN to our server. This has been working fine for some time, but after the kernel upgrade it quit working. The bind9 config specifies that if there's no ACL in the bind config then bind listens on ALL interfaces. There was an interface ACL for IPv6 but none for v4. After the upgrade, bind no longer worked for us as our recursive server UNLESS I provided a v4 interface ACL, which I did, and it started working again. Go figure.

--
Lindsay Haisley       | Everything works if you let it
FMP Computer Services |
512-259-1190          |          --- The Roadie
http://www.fmp.com    |
Re: [Mailman-Users] Kernel update breaks Mailman!!
On Thu, 2014-02-20 at 14:39 -0500, Barry Warsaw wrote:

> I'm really quite surprised about this. From the kernel version numbers, I'm guessing you're running Ubuntu 12.04 LTS?

You are correct about the Ubuntu release on the box.

My next step would be to reinstall the new kernel version and test this issue in a controlled way, but before I do so I'd like to have some diagnostic code or configs in place which will track and log the intimate details of the relationship between Mailman and the MTA. Any suggestions will be appreciated!

I'm also going to post to the very active courier-users list and see if anyone else is having this or similar problems.

--
Lindsay Haisley       | Everything works if you let it
FMP Computer Services |
512-259-1190          |          --- The Roadie
http://www.fmp.com    |
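
As one possible piece of the diagnostic tooling asked for above, the depth of Mailman's 'out' queue could be logged at intervals while the new kernel is under test. This is a minimal Python 3 sketch and no more than that. It assumes queue entries are stored as `.pck` files, and the directory has to be supplied by the admin (something like `/var/lib/mailman/qfiles/out` on a packaged Ubuntu install, which is an assumption and should be checked locally). Both function names are invented.

```python
import os
import time


def queue_depth(queue_dir):
    """Count queue entries in one Mailman queue directory.

    Assumes entries are files ending in '.pck'; other files are ignored.
    """
    return sum(1 for name in os.listdir(queue_dir) if name.endswith(".pck"))


def watch(queue_dir, interval=60, samples=5):
    """Print a timestamped queue depth `samples` times, `interval` seconds apart."""
    for _ in range(samples):
        stamp = time.strftime("%b %d %H:%M:%S %Y")
        print("%s %s: %d entries" % (stamp, queue_dir, queue_depth(queue_dir)))
        time.sleep(interval)
```

A depth that keeps climbing in the 'out' queue while the MTA queue stays flat would match the symptom described earlier in the thread: Mailman accepting and archiving posts but never handing them to the MTA.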