Ricardo Kleemann wrote:
>Hello Mark,
>
>Thanks for your reply.
>
>
>> >I'm running mailman on Ubuntu Hardy, and another weird thing is that
>> >whenever I stop mailman, it always leaves at least one process hanging
>> >around. I have to forcefully kill it. After I stop it, I still see:
>> >
>> >list 3833 0.0 0.4 83076 7540 ? Ss 08:29 0:00 /usr/bin/python /usr/lib/mailman/bin/mailmanctl -s -q start
>> >list 3842 0.1 2.5 105372 38148 ? S 08:29 0:01 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s

Note that the above "--runner=OutgoingRunner:0:1" indicates that
OutgoingRunner is not sliced - more below.

>> It appears that SMTPDirect (actually the underlying Python smtplib) is
>> hung waiting for a response from the MTA that isn't coming.
>>
>
>But the strange thing here is that I have no issues at all connecting to
>the smtp server at localhost. Defaults.py has the standard config, using
>localhost with SMTPDirect.

What do you see in Mailman's smtp-failure log?

>Whatever the OutgoingRunner is stuck on, it's definitely stuck. It won't
>go away unless I do a kill -9
>
>>
>> >Even more strange when I reboot the machine, I'll see 2 entire sets of
>> >mailman processes, almost as if the mailman start had been called twice.
>>
>>
>> It seems like you have two init scripts for Mailman.
>>
>I would have thought so... but there's only 1 script under /etc/init.d/
>and no other scripts there reference mailman

Well, something is starting Mailman twice.

>> >In any case, right now it seems that mailman has stopped accepting
>> >posts. Is there a way to get more debug from mailman when "mailman post"
>> >is called? I don't see any errors, yet I don't see the post log file
>> >updating.
>>
>>
>> "mailman post" just puts the message in the in/ queue. I assume from
>> what you say above that it gets processed by IncomingRunner and even
>> archived and the problem is in OutgoingRunner.
>>
>> See the FAQs at <http://wiki.list.org/x/A4E9> and
>> <http://wiki.list.org/x/-IA9>.
>>
>
>I can see that the in queue is probably working. There are currently 129
>files in the out/ queue.
>
>There are 2 OutgoingRunner processes and apparently BOTH of them are
>doing something because strace does show some activity

Since OutgoingRunner is not sliced, there should be only one
OutgoingRunner process. This needs to be corrected. See
<http://wiki.list.org/x/_4A9>.

>Process 14078 attached - interrupt to quit
>recvfrom(7, "250 Ok. 0000000049303165.000045C"..., 8192, 0, NULL, NULL) = 35
>sendto(7, "mail FROM:<[EMAIL PROTECTED]"..., 65, 0, NULL, 0) = 65
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "rcpt TO:<[EMAIL PROTECTED]"..., 35, 0, NULL, 0) = 35
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "rcpt TO:<[EMAIL PROTECTED]"..., 42, 0, NULL, 0) = 42
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "rcpt TO:<[EMAIL PROTECTED]>\r\n", 31, 0, NULL, 0) = 31
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "data\r\n", 6, 0, NULL, 0) = 6
>recvfrom(7, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "Received: from sr05-01.mta.terra"..., 8256, 0, NULL, 0) = 8256
>recvfrom(7, "250 Ok. 0000000049303172.000045D"..., 8192, 0, NULL, NULL) = 35
>
>
>[EMAIL PROTECTED]:/var/lib/mailman# strace -p13800
>Process 13800 attached - interrupt to quit
>recvfrom(8, "250 Ok. 00000000493031A7.0000464"..., 8192, 0, NULL, NULL) = 35
>sendto(8, "mail FROM:<[EMAIL PROTECTED]"..., 65, 0, NULL, 0) = 65
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "rcpt TO:<[EMAIL PROTECTED]>\r\n", 29, 0, NULL, 0) = 29
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "rcpt TO:<[EMAIL PROTECTED]>\r\n", 31, 0, NULL, 0) = 31
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "rcpt TO:<[EMAIL PROTECTED]"..., 43, 0, NULL, 0) = 43
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "data\r\n", 6, 0, NULL, 0) = 6
>recvfrom(8, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>
>
>Yet even though it's being processed, the logs/post file isn't getting
>updated, and the number of messages in out/ doesn't decrease.

So OutgoingRunner never finishes delivery of even one post. However, I
see 'data' commands in the traces above, so presumably, some recipients
are being delivered.
Then when you kill OutgoingRunner and restart it, it recovers the .bak
file from the queue and redelivers to the same recipients, who by now
have received multiple copies of the message.

>Could this be because mailman is processing a very large list (20,000
>members) and it is just stuck on processing one message, while the other
>messages wait around?

It could be, but this would indicate that you need to do something to
allow SMTP between Mailman and the MTA to proceed faster. Search the FAQ
for "tuning".

>But I've been handling these lists for a long time and never had these
>problems. Mailman doesn't seem to be getting much cpu usage.
>
>

-- 
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: http://mail.python.org/mailman/options/mailman-users/archive%40jab.org
Security Policy: http://wiki.list.org/x/QIA9
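
Editor's sketch, not part of the original message: the point above that
an unsliced OutgoingRunner should have exactly one process can be checked
mechanically. The snippet below is only an illustration. It assumes you
feed it lines of process listing text (for example, saved output of
"ps auxww"), and it looks for the "--runner=Name:slice:range" argument
shown in the listing earlier in the thread. The function name
find_duplicate_runners and the sample lines are hypothetical; nothing
here is part of Mailman itself.

```python
import re
from collections import Counter

# Matches the "--runner=Name:slice:range" argument seen in the ps listing.
RUNNER_RE = re.compile(r"--runner=(\w+):(\d+):(\d+)")


def find_duplicate_runners(ps_lines):
    """Return {(name, slice, range): count} for runner specs seen more than once.

    Hypothetical helper: two processes with the identical spec suggest that
    Mailman was started twice.
    """
    counts = Counter()
    for line in ps_lines:
        match = RUNNER_RE.search(line)
        if match:
            spec = (match.group(1), int(match.group(2)), int(match.group(3)))
            counts[spec] += 1
    return {spec: n for spec, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Illustrative input only, modeled on the command lines in the thread.
    sample = [
        "/usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s",
        "/usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s",
        "/usr/bin/python /var/lib/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s",
    ]
    for (name, slice_no, slice_count), n in find_duplicate_runners(sample).items():
        print("%s:%d:%d is running %d times" % (name, slice_no, slice_count, n))
```

Runners that are genuinely sliced have different slice numbers (for
example 0:2 and 1:2), so they are not reported; only identical specs
count as duplicates.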