Ricardo Kleemann wrote:

>Hello Mark,
>
>Thanks for your reply.
>
>
>> >I'm running mailman on Ubuntu Hardy, and another weird thing is that
>> >whenever I stop mailman, it always leaves at least one process hanging
>> >around. I have to forcefully kill it. After I stop it, I still see:
>> >
>> >list      3833  0.0  0.4  83076  7540 ?        Ss   08:29
>> >0:00 /usr/bin/python /usr/lib/mailman/bin/mailmanctl -s -q start
>> >list      3842  0.1  2.5 105372 38148 ?        S    08:29
>> >0:01 /usr/bin/python /var/lib/mailman/bin/qrunner
>> >--runner=OutgoingRunner:0:1 -s


Note that the above "--runner=OutgoingRunner:0:1" indicates that
OutgoingRunner is not sliced - more below.

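For reference, the qrunner option has the form runner[:slice:range], so
"0:1" means slice 0 out of a total of 1. A minimal Python sketch of that
reading follows. The names parse_runner_spec and is_sliced are made up
for illustration and are not Mailman functions, and treating a bare
runner name as "0 of 1" is an assumption of the sketch.

```python
def parse_runner_spec(spec):
    """Split 'Name[:slice:range]' into (name, slice_index, slice_count)."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Assumption: a bare runner name is treated as one unsliced runner.
        return parts[0], 0, 1
    if len(parts) != 3:
        raise ValueError("expected Name or Name:slice:range, got %r" % spec)
    name, slice_index, slice_count = parts[0], int(parts[1]), int(parts[2])
    if not 0 <= slice_index < slice_count:
        raise ValueError("slice %d is outside range %d" % (slice_index, slice_count))
    return name, slice_index, slice_count


def is_sliced(spec):
    """A runner is sliced when its queue is split across more than one process."""
    return parse_runner_spec(spec)[2] > 1
```

With the value from the ps output above,
parse_runner_spec("OutgoingRunner:0:1") gives ("OutgoingRunner", 0, 1),
so is_sliced() is false for it.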

>> It appears that SMTPDirect (actually the underlying Python smtplib) is
>> hung waiting for a response from the MTA that isn't coming.
>> 
>
>But the strange thing here is that I have no issues at all connecting to
>the smtp server at localhost. Defaults.py has the standard config, using
>localhost with SMTPDirect.


What do you see in Mailman's smtp-failure log?

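Independently of the log, it can help to check from Python that the MTA
answers within a bounded time rather than hanging. The following is only
a sketch, not Mailman's SMTPDirect code. It is written for a current
Python 3 (the timeout argument to smtplib.SMTP is not available in the
older Python that Mailman 2.1 runs under on Hardy), and probe_mta, the
host, the port, and the 10 second limit are all assumptions chosen for
illustration.

```python
import smtplib


def probe_mta(host="localhost", port=25, timeout=10.0):
    """Return (ok, detail) after a time-limited SMTP connect and NOOP."""
    try:
        # The timeout bounds both the connect and later reads on the socket.
        conn = smtplib.SMTP(host, port, timeout=timeout)
    except (OSError, smtplib.SMTPException) as exc:
        return False, "connect failed: %s" % exc
    try:
        code, reply = conn.noop()
        return code == 250, "%d %s" % (code, reply.decode("ascii", "replace"))
    except (OSError, smtplib.SMTPException) as exc:
        return False, "no usable reply: %s" % exc
    finally:
        conn.close()
```

A False result with a timeout message would be consistent with the MTA
not responding, but a True result only shows that a fresh connection
works at that moment.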

>Whatever the OutgoingRunner is stuck on, it's definitely stuck. It won't
>go away unless I do a kill -9
>
>> 
>> >Even more strange when I reboot the machine, I'll see 2 entire sets of
>> >mailman processes, almost as if the mailman start had been called twice.
>> 
>> 
>> It seems like you have two init scripts for Mailman.
>> 
>I would have thought so... but there's only 1 script under /etc/init.d/
>and no other scripts there reference mailman


Well, something is starting Mailman twice.


>> >In any case, right now it seems that mailman has stopped accepting
>> >posts. Is there a way to get more debug from mailman when "mailman post"
>> >is called? I don't see any errors, yet I don't see the post log file
>> >updating.
>> 
>> 
>> "mailman post" just puts the message in the in/ queue. I assume from
>> what you say above that it gets processed by IncomingRunner and even
>> archived and the problem is in OutgoingRunner.
>> 
>> See the FAQs at <http://wiki.list.org/x/A4E9> and
>> <http://wiki.list.org/x/-IA9>.
>> 
>
>I can see that the in queue is probably working. There are currently 129
>files in the out/ queue.
>
>There are 2 OutgoingRunner processes and apparently BOTH of them are
>doing something because strace does show some activity


Since OutgoingRunner is not sliced, there should be only one
OutgoingRunner process. This needs to be corrected. See
<http://wiki.list.org/x/_4A9>.

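To see which runners are running more than once, the ps output can be
scanned for repeated --runner= values. This is a rough Python sketch
that assumes the process list has already been captured as text;
duplicate_runners is a name invented here, not part of Mailman.

```python
import re
from collections import Counter

# Matches the qrunner argument as it appears in ps, e.g.
# --runner=OutgoingRunner:0:1
RUNNER_RE = re.compile(r"--runner=(\w+):(\d+):(\d+)")


def duplicate_runners(ps_output):
    """Return {(name, slice, range): count} for specs seen more than once."""
    counts = Counter(RUNNER_RE.findall(ps_output))
    return {spec: n for spec, n in counts.items() if n > 1}
```

Any entry in the result means two processes are working the same slice
of the same queue, which should not happen.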

>Process 14078 attached - interrupt to quit
>recvfrom(7, "250 Ok. 0000000049303165.000045C"..., 8192, 0, NULL, NULL)
>= 35
>sendto(7, "mail FROM:<[EMAIL PROTECTED]"..., 65, 0, NULL, 0) = 65
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "rcpt TO:<[EMAIL PROTECTED]"..., 35, 0, NULL, 0) = 35
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "rcpt TO:<[EMAIL PROTECTED]"..., 42, 0, NULL, 0) = 42
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "rcpt TO:<[EMAIL PROTECTED]>\r\n", 31, 0, NULL, 0) = 31
>recvfrom(7, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "data\r\n", 6, 0, NULL, 0)    = 6
>recvfrom(7, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(7, "Received: from sr05-01.mta.terra"..., 8256, 0, NULL, 0) =
>8256
>recvfrom(7, "250 Ok. 0000000049303172.000045D"..., 8192, 0, NULL, NULL)
>= 35
>
>
>[EMAIL PROTECTED]:/var/lib/mailman# strace -p13800
>Process 13800 attached - interrupt to quit
>recvfrom(8, "250 Ok. 00000000493031A7.0000464"..., 8192, 0, NULL, NULL)
>= 35
>sendto(8, "mail FROM:<[EMAIL PROTECTED]"..., 65, 0, NULL, 0) = 65
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "rcpt TO:<[EMAIL PROTECTED]>\r\n", 29, 0, NULL, 0) = 29
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "rcpt TO:<[EMAIL PROTECTED]>\r\n", 31, 0, NULL, 0) = 31
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "rcpt TO:<[EMAIL PROTECTED]"..., 43, 0, NULL, 0) = 43
>recvfrom(8, "250 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>sendto(8, "data\r\n", 6, 0, NULL, 0)    = 6
>recvfrom(8, "354 Ok.\r\n", 8192, 0, NULL, NULL) = 9
>
>
>Yet even though it's being processed, the logs/post file isn't getting
>updated, and the number of messages in out/ doesn't decrease.


So OutgoingRunner never finishes delivery of even one post. However, I
see 'data' commands in the traces above, so presumably some recipients
are receiving the message. Then, when you kill OutgoingRunner and
restart it, it recovers the .bak file from the queue and redelivers to
the same recipients, who by now have received multiple copies of the
message.

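A quick way to watch this is to count the entries in the out/ queue by
file extension, so that an in-process .bak entry shows up separately
from the waiting entries. This is a sketch only; queue_summary is a
made-up helper, and the queue directory has to be supplied for the
installation in question.

```python
import os
from collections import Counter


def queue_summary(queue_dir):
    """Count the files in a queue directory, grouped by extension."""
    counts = Counter()
    for name in os.listdir(queue_dir):
        # splitext returns e.g. ('1227900000.123+abc', '.bak')
        counts[os.path.splitext(name)[1]] += 1
    return counts
```

If the .bak count stays at one and the total never drops, the runner is
still working on the same entry.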

>Could this be because mailman is processing a very large list (20,000
>members) and it is just stuck on processing one message, while the other
>messages wait around?


It could be, but this would indicate that you need to do something to
allow SMTP between Mailman and the MTA to proceed faster. Search the
FAQ for "tuning".


>But I've been handling these lists for a long time and never had these
>problems. Mailman doesn't seem to be getting much cpu usage. 
>
>

-- 
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/

Security Policy: http://wiki.list.org/x/QIA9
