Re: [Mailman-Developers] qrunner infinite loop?
On 25 August 2003, Barry Warsaw said:
> OutgoingRunner should be fairly sane since all it's doing is reading the
> files from disk and spewing them over port 25 to your local smtpd.  It's
> also doing logging so you could tail logs/{smtp,smtp-failure,post} to
> watch it make progress.  You should also see the qfiles/out directory
> grow and shrink as files are consumed and unlinked, or new ones are
> prepared for sending out.

At one point, Mailman was running for several hours, and it consumed
exactly as much CPU time as elapsed time since it was started.  Sounds
like an infinite loop to me.

It turns out that Mailman (or possibly Exim) was misconfigured on the
host in question (drydock.python.net, for context).  Mailman had been
configured to use "python2.net" for its email domain -- which is fine as
far as DNS is concerned, but Exim on that host was *not* told that
python2.net is one of its local domains.  Here's what might have happened
if Mailman had connected to Exim to send a message:

  mail from:<[EMAIL PROTECTED]>
  250 OK
  rcpt to:<[EMAIL PROTECTED]>
  451 Temporary local problem - please try later

Two glitches with that theory though:

  * I ran "tcpdump -i lo" while qrunner was in its infinite-looking loop,
    and there was no traffic -- so Mailman was apparently not connecting
    to Exim

  * between the "rcpt to" and the 451 is a noticeable delay -- between 0.5
    and 1.0 sec I would say.  Not sure what Exim is doing there (maybe DNS
    is slow on this host?), but qrunner should not be consuming CPU while
    waiting for Exim's 451.

Anyways, I fixed Mailman's configuration and deleted qfiles/*/*, and now
qrunner runs happily.  Well, at least it doesn't suck CPU.  And I have a
clue how to reproduce this problem in case anyone cares to give it a
shot.

        Greg
--
Greg Ward <[EMAIL PROTECTED]>         http://www.gerg.ca/
Condense soup, not books!
___
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers
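For anyone trying to reproduce this: the 451 in the exchange above is a temporary (4xx) SMTP reply, so a well-behaved sender would wait before retrying rather than spin. The following is only a minimal sketch of that idea, not Mailman's actual OutgoingRunner code. The names classify_smtp_reply, backoff_delays and deliver_with_retry are hypothetical, and send stands in for whatever callable performs one delivery attempt and returns the reply code.

```python
import time


def classify_smtp_reply(code):
    """Classify an SMTP reply code by its first digit."""
    if 200 <= code < 400:
        return "ok"          # 2xx completion or 3xx intermediate reply
    if 400 <= code < 500:
        return "temporary"   # e.g. 451: try again later
    if 500 <= code < 600:
        return "permanent"   # retrying will not help
    raise ValueError("not an SMTP reply code: %r" % (code,))


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=5):
    """Yield `attempts` sleep intervals that grow by `factor`, capped at `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def deliver_with_retry(send, sleep=time.sleep, **backoff):
    """Call send() until it stops reporting a temporary failure.

    `send` is a hypothetical callable returning an SMTP reply code.
    The process sleeps after every temporary failure, so it does not
    burn CPU while the mail server is unhappy.
    """
    for delay in backoff_delays(**backoff):
        status = classify_smtp_reply(send())
        if status != "temporary":
            return status
        sleep(delay)
    return "temporary"       # still failing after all attempts
```

Passing sleep as a parameter is only there so the retry schedule can be checked without actually waiting.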
Re: [Mailman-Developers] qrunner infinite loop?
On Sun, 2003-08-24 at 20:49, Greg Ward wrote:
> I've just installed the mailman 2.1.2-7 package on a Debian sarge
> (testing) system, and qrunner goes into an apparent infinite loop as
> soon as it starts.  Here's a snapshot from "top" shortly after running
> "/etc/init.d/mailman start":
>
>   PID USER  PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ Command
>  4498 list  19  0 4656 4656 1844 R 99.7  0.5 0:48.36 python
>  4484 gward  9  0  940  940  748 R  0.3  0.1 0:00.17 top
>
> Memory use is stable, hence my assumption that this is a garden-variety
> infinite loop.
>
> And "ps -fu list" reports this:
>
> UID   PID PPID  C STIME TTY     TIME CMD
> list 4492    1  0 02:42 ?   00:00:00 [mailmanctl]
> list 4493 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner
>     --runner=ArchRunner:0:1 -s
> list 4494 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner
>     --runner=BounceRunner:0:1 -s
> list 4495 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner
>     --runner=CommandRunner:0:1 -s
> list 4496 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner
>     --runner=IncomingRunner:0:1 -s
> list 4497 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner
>     --runner=NewsRunner:0:1 -s
> list 4498 4492 97 02:42 ?   00:01:01 qrunner /var/lib/mailman/bin/qrunner
>     --runner=OutgoingRunner:0:1 -s
> list 4499 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner
>     --runner=VirginRunner:0:1 -s
>
> ... so it's just one qrunner (OutgoingRunner) that's responsible for all
> the CPU sucking.
>
> Hmmm: there *are* files in the queue, but I'm not sure what they're
> doing there or how they got there (this is a new, fairly inactive system
> with no lists and no incoming traffic, apart from me testing things):
>
> # cd /var/lib/mailman/
> # find qfiles/ -type f
> qfiles/out/1061532001.524745+03e15ef2a9a7d27904406a659cdb4922a204431b.pck
> qfiles/out/1061497885.874818+cf5f7e2e03cbca8bb84a23a60f055679c16e81fa.pck
> qfiles/out/1061704802.107101+4617d8731ecb303dd756e475b5362bd5a64b1fcf.pck
> qfiles/out/1061497885.871999+3453d7caa05a1aad6b230da61b39d4937405fcc8.pck
> qfiles/out/1061497885.871999+3453d7caa05a1aad6b230da61b39d4937405fcc8.db
> qfiles/out/1061497885.874818+cf5f7e2e03cbca8bb84a23a60f055679c16e81fa.db
> qfiles/out/1061704802.107101+4617d8731ecb303dd756e475b5362bd5a64b1fcf.db
> qfiles/out/1061532001.524745+03e15ef2a9a7d27904406a659cdb4922a204431b.db
>
> Any clues?  I'll try to investigate more tomorrow evening.

While I have seen qrunners suck up all available cpu occasionally, it's
usually for manageable periods of time.  I haven't actually seen true
infloops, although I won't discount it.

ArchRunner and BounceRunner are the worst offenders, the former because
Pipermail is very inefficient, and the latter because of the stupid way
bounces are registered.  I actually have a fix for the BounceRunner
issue in cvs, which I intend on installing on python.org to test out.

OutgoingRunner should be fairly sane since all it's doing is reading the
files from disk and spewing them over port 25 to your local smtpd.  It's
also doing logging so you could tail logs/{smtp,smtp-failure,post} to
watch it make progress.  You should also see the qfiles/out directory
grow and shrink as files are consumed and unlinked, or new ones are
prepared for sending out.

-Barry
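As a rough illustration of Barry's suggestion to watch qfiles/out grow and shrink, here is a small sketch that counts queue files by extension and polls for changes. It is not part of Mailman: queue_snapshot and watch are hypothetical helper names, and the polling interval and round count are arbitrary.

```python
import os
import time
from collections import Counter


def queue_snapshot(queue_dir):
    """Return a dict mapping file extension to file count in queue_dir."""
    counts = Counter()
    for name in os.listdir(queue_dir):
        if os.path.isfile(os.path.join(queue_dir, name)):
            # splitext() splits at the last dot, so "...1b.pck" gives ".pck"
            counts[os.path.splitext(name)[1]] += 1
    return dict(counts)


def watch(queue_dir, interval=2.0, rounds=10):
    """Print the snapshot whenever it changes, polling `rounds` times."""
    previous = None
    for _ in range(rounds):
        current = queue_snapshot(queue_dir)
        if current != previous:
            print(time.strftime("%H:%M:%S"), current)
            previous = current
        time.sleep(interval)
```

On the system described in this thread it could be run as watch("/var/lib/mailman/qfiles/out"). A queue whose counts never change while OutgoingRunner sits at full CPU would suggest the runner is not making progress.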
[Mailman-Developers] qrunner infinite loop?
I've just installed the mailman 2.1.2-7 package on a Debian sarge
(testing) system, and qrunner goes into an apparent infinite loop as
soon as it starts.  Here's a snapshot from "top" shortly after running
"/etc/init.d/mailman start":

  PID USER  PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ Command
 4498 list  19  0 4656 4656 1844 R 99.7  0.5 0:48.36 python
 4484 gward  9  0  940  940  748 R  0.3  0.1 0:00.17 top

Memory use is stable, hence my assumption that this is a garden-variety
infinite loop.

And "ps -fu list" reports this:

UID   PID PPID  C STIME TTY     TIME CMD
list 4492    1  0 02:42 ?   00:00:00 [mailmanctl]
list 4493 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner --runner=ArchRunner:0:1 -s
list 4494 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner --runner=BounceRunner:0:1 -s
list 4495 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner --runner=CommandRunner:0:1 -s
list 4496 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s
list 4497 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner --runner=NewsRunner:0:1 -s
list 4498 4492 97 02:42 ?   00:01:01 qrunner /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s
list 4499 4492  0 02:42 ?   00:00:00 qrunner /var/lib/mailman/bin/qrunner --runner=VirginRunner:0:1 -s

... so it's just one qrunner (OutgoingRunner) that's responsible for all
the CPU sucking.

Hmmm: there *are* files in the queue, but I'm not sure what they're
doing there or how they got there (this is a new, fairly inactive system
with no lists and no incoming traffic, apart from me testing things):

  # cd /var/lib/mailman/
  # find qfiles/ -type f
  qfiles/out/1061532001.524745+03e15ef2a9a7d27904406a659cdb4922a204431b.pck
  qfiles/out/1061497885.874818+cf5f7e2e03cbca8bb84a23a60f055679c16e81fa.pck
  qfiles/out/1061704802.107101+4617d8731ecb303dd756e475b5362bd5a64b1fcf.pck
  qfiles/out/1061497885.871999+3453d7caa05a1aad6b230da61b39d4937405fcc8.pck
  qfiles/out/1061497885.871999+3453d7caa05a1aad6b230da61b39d4937405fcc8.db
  qfiles/out/1061497885.874818+cf5f7e2e03cbca8bb84a23a60f055679c16e81fa.db
  qfiles/out/1061704802.107101+4617d8731ecb303dd756e475b5362bd5a64b1fcf.db
  qfiles/out/1061532001.524745+03e15ef2a9a7d27904406a659cdb4922a204431b.db

Any clues?  I'll try to investigate more tomorrow evening.

        Greg
--
Greg Ward <[EMAIL PROTECTED]>         http://www.gerg.ca/
We have always been at war with Oceania.
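One way to get a hint about how those queue files got there is to decode their names. The sketch below assumes, based only on the listing above, that each name is a Unix timestamp, a "+", a 40-character hex digest, and a .pck or .db extension. That reading of the format is an assumption, and parse_qfile_name and QueueEntry are hypothetical names rather than anything from Mailman.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple

# Assumed layout: <seconds.fraction>+<40 hex chars>.<pck|db>
_QFILE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\+(?P<digest>[0-9a-f]{40})\.(?P<ext>pck|db)$"
)


class QueueEntry(NamedTuple):
    queued_at: datetime   # timestamp prefix, interpreted as UTC
    digest: str           # hex string after the "+"
    ext: str              # "pck" or "db"


def parse_qfile_name(path):
    """Split a queue file path into its timestamp, digest and extension."""
    name = path.rsplit("/", 1)[-1]
    match = _QFILE_RE.match(name)
    if match is None:
        raise ValueError("unrecognised queue file name: %r" % (name,))
    queued_at = datetime.fromtimestamp(float(match.group("ts")), tz=timezone.utc)
    return QueueEntry(queued_at, match.group("digest"), match.group("ext"))
```

If the timestamp assumption holds, printing parse_qfile_name(...).queued_at for each file shows when it was queued, which can then be compared against cron schedules or the test messages sent by hand.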