On Sat, 27 Oct 2001, Jay Banda wrote:

> Unfortunately, over the past 2 days, I seem to be having a problem
> with my mail server.  We are running Slackware 7.1, with sendmail
> 8.11.4, procmail 3.22 and qpopper 4.0.3.
>
> What we are seeing is that when sendmail tries to deliver mail to all
> our users at once (when we issue a blanket email to our subscribers),
> it will deliver half of the messages, and then give up with the
> message
> message
>
> "timeout waiting for input from local during Draining Input"

That is a Sendmail issue.  See
http://www.sendmail.org/~ca/email/smenhanced.html for a brief explanation
of the "Draining Input" syslog message.  It may or may not apply to your
situation.

How many addresses are on the recipient list for these mass mailings?
What distribution mechanism do you use?

If you're using mailing list management software (Majordomo, Listproc, or
similar), this shouldn't happen.  However it can easily happen if you are
using a plain alias list (in /etc/mail/aliases or your local equivalent)
to deliver to your subscribers, and if your subscriber list is fairly
large.  I used to see it occasionally under a certain combination of
circumstances:  When sending to a group of nested alias lists totalling
about 800 addresses *AND* cpu usage was already high *AND* there was a lot
of disk I/O occurring on the disk that contains the mail spool.
Switching my large lists to Majordomo control solved the problem.  I use a
majordomo mailing list comprised of almost 9000 local addresses to send
info to my users, and I never see a "Draining Input" message resulting
from it.
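For comparison, a plain flat alias setup of the sort that can trigger the
problem looks something like this (list name and paths hypothetical):

```
# /etc/mail/aliases -- flat subscriber list fed straight to sendmail
subscribers: :include:/etc/mail/lists/subscribers.list

# /etc/mail/lists/subscribers.list -- one recipient per line
user1@example.com
user2@example.com
```

Sendmail expands the whole :include: file in one delivery pass, so all
the queue and mailbox I/O hits at once.  A list manager like Majordomo
resubmits the message itself and lets sendmail meter out the deliveries.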

Alternately, if all your users use POP clients to read their mail, why not
use QPopper's "POP Bulletin" feature instead of a mailing list?  This
writes the message to the user's mailbox only when he/she makes a POP
connection, so the load is spread out over a much greater time period.
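From memory (check the QPopper 4.x documentation, since the details vary
with build options), the bulletin mechanism works roughly like this: you
point QPopper at a bulletin directory, drop a numbered message file in it,
and each user gets the bulletin on their next POP login:

```
# qpopper options file (paths hypothetical) -- bulletin support
# must be compiled in (--enable-bulletins):
set bulldir = /var/spool/bulls

# A bulletin is an RFC 822 message in a file whose name begins with a
# number, e.g. /var/spool/bulls/1.maintenance-notice:
#
#   From: postmaster@example.com
#   Subject: Scheduled maintenance
#
#   <body text>
```

QPopper tracks the highest bulletin number each user has seen, so each
bulletin is delivered only once per user.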

> Meanwhile, qpopper sessions for the mailboxes that have NOT received
> the mailing yet begin to go into a "dead" state, and do not allow
> anyone to log in.  These sessions cannot be killed, and they increase
> as each person tries to pick up mail, until the cpu load average
> reported by the system causes the MTA to begin rejecting connections
> (approximately 10 mins)
>
> When the server gets to this state, it is nearly impossible to
> list the files in the mail spool directories (it is still possible to
> move around the rest of the file system).

It's difficult to tell without more information, but I suspect you're
running into a disk I/O bottleneck situation.  Consider this:  It's common
in a traditional UNIX file system layout for logging, mailbox storage, and
Sendmail spooling to occur on the same disk (in /var/log/syslog, /var/mail
& /var/spool/mqueue, for example).  If you add QPopper and use the same
disk (let's say /var/spool/poptmp) for its temp files, you've added
another potentially I/O intensive activity to the same disk.

If your setup is like that, when you send your mass mailing, your MTA
(sendmail) is probably writing it to and then reading it from
/var/spool/mqueue, and logging each action to /var/log/syslog.  Your MDA
(procmail) is writing each copy of the message to /var/mail/<username>.
At the same time, for each connected POP client, QPopper is reading from
/var/mail/<username> and writing to /var/spool/poptmp/.<username>.pop.
If your subscriber list is large, that's a lot of disk I/O concentrated
into a relatively small amount of time.

Do you have a version of "top" that shows how much processor time is spent
waiting for disk I/O?  If when you send one of your mass mailings the
processor idle time drops to near zero and the I/O Wait percentage goes
very high, then you've got a disk I/O bottleneck.
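If your top doesn't break out I/O wait, you can read the raw counters
yourself.  This sketch assumes a Linux /proc/stat that reports an iowait
field (added in kernels newer than the 2.2.x that shipped with Slackware
7.1, so treat it as illustrative):

```shell
#!/bin/sh
# Read the aggregate CPU counters from the first line of /proc/stat.
# On kernels that report it, field 6 is iowait: jiffies spent idle
# while waiting on disk I/O.
read -r _cpu user nice system idle iowait _rest < /proc/stat
total=$((user + nice + system + idle + iowait))
echo "iowait: $iowait of $total jiffies since boot"
```

Sampling that twice, a few seconds apart, and comparing the deltas tells
you what fraction of recent time the box spent stalled on disk.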

How you solve the disk I/O bottleneck depends on what your current disk
setup is like.  If you have slow disks and a slow disk controller, then an
upgrade to faster hardware might help.  Alternately, rearrange your disk
space (and mount points) so /var/spool and /var/mail (or the equivalents
on your machine) are on separate disks from the rest of /var (and if
possible, separate disk controllers).  That's what I did, and it made a
tremendous difference in mail system performance in general and QPopper
performance specifically.
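As a sketch, the split described above might look something like this in
/etc/fstab (device names hypothetical, ideally with the three disks on
separate controllers):

```
# /etc/fstab -- hypothetical layout splitting mail I/O across spindles
/dev/sda1   /           ext2   defaults   1 1
/dev/sda2   /var        ext2   defaults   1 2   # syslog stays here
/dev/sdb1   /var/spool  ext2   defaults   1 2   # mqueue + poptmp
/dev/sdc1   /var/mail   ext2   defaults   1 2   # mailbox spool
```

That way the MTA queue, the POP temp files, and the mailboxes no longer
compete for the same disk heads as your logging.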

Or, as mentioned above, switch to QPopper's "POP Bulletin" method to
spread out the load.  Unfortunately that isn't practical on my system
because it doesn't work for IMAP users, or for those who use Pine, Elm,
etc in the UNIX shell.

-- 
Chip Old (Francis E. Old)             E-Mail:  [EMAIL PROTECTED]
Manager, BCPL Network Services        Phone:   410-887-6180
Manager, BCPL.NET Internet Services   FAX:     410-887-2091
320 York Road
Towson, MD 21204  USA
