Re: Speeding up Local Delivery

2010-01-14 Thread Wendigo Thompson
Victor:

  I wanted to use Postfix as a delivery mechanism so that I would be
able to queue and hold mail during database outages and other
maintenance -- a job that it does very well.  Is there really no
advice available on how to increase the speed of Postfix's local mail
delivery?

Thanks,
Wendy

On Wed, Jan 13, 2010 at 8:45 AM, Victor Duchovni
victor.ducho...@morganstanley.com wrote:
 On Wed, Jan 13, 2010 at 12:35:19AM -0600, Wendigo Thompson wrote:

 Postfix accepts mail from the
 corporate mail server and delivers the message via a pipe alias to an
 application that is then inserting the message into the database.

 Your choice of delivery mechanism is unfortunate. It is far better
 to deliver to a persistent daemon that talks SMTP or LMTP, you get
 much lower latency, I/O and CPU impact.

 Don't use local(8) for high volume traffic to a single mailbox, or
 to fork/exec scripts.

 --
        Viktor.

 Disclaimer: off-list followups get on-list replies or get ignored.
 Please do not ignore the Reply-To header.

 To unsubscribe from the postfix-users list, visit
 http://www.postfix.org/lists.html or click the link below:
 mailto:majord...@postfix.org?body=unsubscribe%20postfix-users

 If my response solves your problem, the best way to thank me is to not
 send an it worked, thanks follow-up. If you must respond, please put
 It worked, thanks in the Subject so I can delete these quickly.



Re: Speeding up Local Delivery

2010-01-14 Thread Wietse Venema
Wendigo Thompson:
 Victor:
 
   I wanted to use Postfix as a delivery mechanism so that I would be
 able to queue and hold mail during database outages and other
 maintenance -- a job that it does very well.  Is there really no
 advice available on how to increase the speed of Postfix's local mail
 delivery?

I'm the guy who wrote Postfix.

With your numbers, 300k messsages/day and 3ms/delivery, the whole
day's volume should be done in about 900s. Since you see deliveries
draining into the weekend, your numbers don't seem to add up.

In the Postfix logs you can see how long it took to deliver mail.
Here is an example of a pipe alias that handles all my own email:

Jan 14 00:08:51 spike postfix/local[98331]: 9F53A1F3EA5:
to=xxx, relay=local, delay=0.41,
delays=0.37/0.02/0/0.03, dsn=2.0.0, status=sent (delivered
to command: yyy)

Here, it took 0.03 seconds to deliver a small message for xxx to
a small program yyy. In addition it took 0.37s+0.02=0.39s to
receive the message and schedule it for delivery. This message
arrived via a slow connection. More information about this is at
http://www.postfix.org/postconf.5.html#delay_logging_resolution_limit

Now, If you want to deliver hundreds of thousands each day to a
pipe alias, then executing a whole program each time is expensive,
but it is possible if your machine has enough horse power.

So, we have two options:

a) The red option: to waste more resources, increase concurrency,
by increasing local_destination_concurrency_limit (default: 2).

b) The green option: instead of wasting CPU cycles with a local(8)
pipe alias that executes and terminates one program per delivery,
use transport_maps to direct mail to a persistent SMTP or LMTP
process that keeps the database connection open. This is the
option that Victor suggested.

Wietse


Re: Speeding up Local Delivery

2010-01-14 Thread Wendigo Thompson
Wietse:

  I know who you are -- big fan of yours ever since tcpwrappers :-)

  The only reason I wrote in is that this queueing of mail issue
appeared fairly recently, and other times when I have had to collect
large volumes of mail from this client my system has been fine.  Below
I have pasted some logs that show long delivery times (sitting in the
queue for a day) but the last delay (delivery delay?) is in the
upper-teens.  Just trying to reconcile why this is: if it's something
on my end that is fine and I can work it out, but maybe it is an
unforeseen issue with Postfix and large volumes of piped mail.  The
system itself is a Nehalem Xserve, so it seems to have plenty of
horsepower to handle the load.

(anonymized a bit in hostname and email address)
Jan 14 15:45:43 server001 postfix/local[24570]: 26E371265ED:
to=receiver_em...@receiver_ip, relay=local, delay=81352,
delays=0.59/81343/0/7.7, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:43 server001 postfix/local[22292]: AFAF31265EA:
to=receiver_em...@receiver_ip, relay=local, delay=81353,
delays=0.5/81344/0/8.2, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:45 server001 postfix/local[23365]: B811A1265E3:
to=receiver_em...@receiver_ip, relay=local, delay=81355,
delays=0.34/81341/0/13, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:45 server001 postfix/local[23852]: E5F061265EC:
to=receiver_em...@receiver_ip, relay=local, delay=81354,
delays=0.41/81344/0/10, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:46 server001 postfix/local[23580]: F15EA1265D5:
to=receiver_em...@receiver_ip, relay=local, delay=81358,
delays=0/81339/0/19, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:46 server001 postfix/local[22196]: 3121C1265E5:
to=receiver_em...@receiver_ip, relay=local, delay=81356,
delays=0.42/81342/0/14, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:46 server001 postfix/local[25810]: E55181265EB:
to=receiver_em...@receiver_ip, relay=local, delay=81355,
delays=0.19/81344/0/11, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:46 server001 postfix/local[25651]: 8918B1265E2:
to=receiver_em...@receiver_ip, relay=local, delay=81356,
delays=0/81341/0/15, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:46 server001 postfix/local[24774]: B6F851265D2:
to=receiver_em...@receiver_ip, relay=local, delay=81358,
delays=0/81337/0/21, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:47 server001 postfix/local[25378]: 26EDF1265EE:
to=receiver_em...@receiver_ip, relay=local, delay=81355,
delays=0.57/81343/0/11, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:48 server001 postfix/local[23204]: DC9B11265E4:
to=receiver_em...@receiver_ip, relay=local, delay=81358,
delays=0.32/81342/0/16, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:49 server001 postfix/local[24247]: 80DB71265E0:
to=receiver_em...@receiver_ip, relay=local, delay=81359,
delays=0.21/81341/0/18, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:49 server001 postfix/local[25905]: 840B11265FC:
to=receiver_em...@receiver_ip, relay=local, delay=81355,
delays=0.35/81348/0/7, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:49 server001 postfix/local[24394]: 302EA1265DD:
to=receiver_em...@receiver_ip, relay=local, delay=81360,
delays=0.12/81339/0/21, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)
Jan 14 15:45:50 server001 postfix/local[23554]: 4B37E1265DF:
to=receiver_em...@receiver_ip, relay=local, delay=81360,
delays=0.09/81340/0/20, dsn=2.0.0, status=sent (delivered to command:
/usr/bin/db_capture)

Warmest Regards,
Wendy

On Thu, Jan 14, 2010 at 12:34 PM, Wietse Venema wie...@porcupine.org wrote:
 Wendigo Thompson:
 Victor:

   I wanted to use Postfix as a delivery mechanism so that I would be
 able to queue and hold mail during database outages and other
 maintenance -- a job that it does very well.  Is there really no
 advice available on how to increase the speed of Postfix's local mail
 delivery?

 I'm the guy who wrote Postfix.

 With your numbers, 300k messsages/day and 3ms/delivery, the whole
 day's volume should be done in about 900s. Since you see deliveries
 draining into the weekend, your numbers don't seem to add up.

 In the Postfix logs you can see how long it took to deliver mail.
 Here is an example of a pipe alias that handles all my own email:

    Jan 14 00:08:51 spike postfix/local[98331]: 9F53A1F3EA5:
        to=xxx, relay=local, delay=0.41,
        delays=0.37/0.02/0/0.03, dsn=2.0.0, status=sent (delivered
        to command: yyy)

 Here, it took 0.03 seconds to deliver a small message for xxx to
 a small program yyy. In addition it took 0.37s+0.02=0.39s to
 

Re: Speeding up Local Delivery

2010-01-14 Thread Wietse Venema
Wendigo Thompson:
 Wietse:
 
   I know who you are -- big fan of yours ever since tcpwrappers :-)

Hah!

(As for my claim, of course other people also wrote parts of
Postfix, in particular Victor has been a major contributor).

   The only reason I wrote in is that this queueing of mail issue
 appeared fairly recently, and other times when I have had to collect
 large volumes of mail from this client my system has been fine.  Below
 I have pasted some logs that show long delivery times (sitting in the
 queue for a day) but the last delay (delivery delay?) is in the
 upper-teens.  Just trying to reconcile why this is: if it's something
 on my end that is fine and I can work it out, but maybe it is an
 unforeseen issue with Postfix and large volumes of piped mail.  The
 system itself is a Nehalem Xserve, so it seems to have plenty of
 horsepower to handle the load.
 
 (anonymized a bit in hostname and email address)
 Jan 14 15:45:43 server001 postfix/local[24570]: 26E371265ED:
 to=receiver_em...@receiver_ip, relay=local, delay=81352,
 delays=0.59/81343/0/7.7, dsn=2.0.0, status=sent (delivered to command:
 /usr/bin/db_capture)

This looks problematic: it took 7.7s from the time that the Postfix
local delivery agent was tasked to deliver the message, to the time
that the db_capture command reported completion.

Plus, these messages had already been sitting in the queue for
almost 24 hours, so the problem will have started much earlier than
that.

 delays=0.5/81344/0/8.2, dsn=2.0.0, ...
 delays=0.34/81341/0/13, dsn=2.0.0, ...
 delays=0.41/81344/0/10, dsn=2.0.0, ...
 delays=0/81339/0/19, dsn=2.0.0, ...
 delays=0.42/81342/0/14, dsn=2.0.0, ...
 delays=0.19/81344/0/11, dsn=2.0.0, ...
 delays=0/81341/0/15, dsn=2.0.0, ...
 delays=0/81337/0/21, dsn=2.0.0, ...
 delays=0.57/81343/0/11, dsn=2.0.0, ...
 delays=0.32/81342/0/16, dsn=2.0.0, ...
 delays=0.21/81341/0/18, dsn=2.0.0, ...
 delays=0.35/81348/0/7, dsn=2.0.0, ...
 delays=0.12/81339/0/21, dsn=2.0.0, ...
 delays=0.09/81340/0/20, dsn=2.0.0, ...

Now the question is why did /usr/bin/db_capture take so long? I am
not sure that my red and green options are going to solve your
performance problem.

If you still have the logs, it will be worthwhile to find out when
the delivery delay started climbing through the roof.

Wietse


Re: Speeding up Local Delivery

2010-01-14 Thread Wendigo Thompson
Wietse:

  Sadly I generally don't keep logs at this granular of a level
(mail.crit is what I usually store forever) but I can turn logging
back on after the queue empties over the weekend and see if the delay
is persisting.  I know the command itself is completing quickly -- it
seems like something in between is causing a delay.  Do you have any
suggestions for how I could profile the handoff from local to the
executable?

Thanks again,
Wendy

On Thu, Jan 14, 2010 at 4:13 PM, Wietse Venema wie...@porcupine.org wrote:
 Wendigo Thompson:
 Wietse:

   I know who you are -- big fan of yours ever since tcpwrappers :-)

 Hah!

 (As for my claim, of course other people also wrote parts of
 Postfix, in particular Victor has been a major contributor).

   The only reason I wrote in is that this queueing of mail issue
 appeared fairly recently, and other times when I have had to collect
 large volumes of mail from this client my system has been fine.  Below
 I have pasted some logs that show long delivery times (sitting in the
 queue for a day) but the last delay (delivery delay?) is in the
 upper-teens.  Just trying to reconcile why this is: if it's something
 on my end that is fine and I can work it out, but maybe it is an
 unforeseen issue with Postfix and large volumes of piped mail.  The
 system itself is a Nehalem Xserve, so it seems to have plenty of
 horsepower to handle the load.

 (anonymized a bit in hostname and email address)
 Jan 14 15:45:43 server001 postfix/local[24570]: 26E371265ED:
 to=receiver_em...@receiver_ip, relay=local, delay=81352,
 delays=0.59/81343/0/7.7, dsn=2.0.0, status=sent (delivered to command:
 /usr/bin/db_capture)

 This looks problematic: it took 7.7s from the time that the Postfix
 local delivery agent was tasked to deliver the message, to the time
 that the db_capture command reported completion.

 Plus, these messages had already been sitting in the queue for
 almost 24 hours, so the problem will have started much earlier than
 that.

 delays=0.5/81344/0/8.2, dsn=2.0.0, ...
 delays=0.34/81341/0/13, dsn=2.0.0, ...
 delays=0.41/81344/0/10, dsn=2.0.0, ...
 delays=0/81339/0/19, dsn=2.0.0, ...
 delays=0.42/81342/0/14, dsn=2.0.0, ...
 delays=0.19/81344/0/11, dsn=2.0.0, ...
 delays=0/81341/0/15, dsn=2.0.0, ...
 delays=0/81337/0/21, dsn=2.0.0, ...
 delays=0.57/81343/0/11, dsn=2.0.0, ...
 delays=0.32/81342/0/16, dsn=2.0.0, ...
 delays=0.21/81341/0/18, dsn=2.0.0, ...
 delays=0.35/81348/0/7, dsn=2.0.0, ...
 delays=0.12/81339/0/21, dsn=2.0.0, ...
 delays=0.09/81340/0/20, dsn=2.0.0, ...

 Now the question is why did /usr/bin/db_capture take so long? I am
 not sure that my red and green options are going to solve your
 performance problem.

 If you still have the logs, it will be worthwhile to find out when
 the delivery delay started climbing through the roof.

        Wietse



Re: Speeding up Local Delivery

2010-01-14 Thread Wietse Venema
Wendigo Thompson:
 Wietse:
 
   Sadly I generally don't keep logs at this granular of a level
 (mail.crit is what I usually store forever) but I can turn logging
 back on after the queue empties over the weekend and see if the delay
 is persisting.  I know the command itself is completing quickly -- it
 seems like something in between is causing a delay.  Do you have any
 suggestions for how I could profile the handoff from local to the
 executable?

The fourth delays timer, which show 7.7s delays and more,
stops as soon as your /usr/bin/db_capture terminates.

Perhaps it is doing hostname lookup and the resolv.conf
or nsswitch.conf files are out of whack. Perhaps it is
taking forever to log in.

To investigate, configure Postfix to run it under
strace (Linux) truss (Solaris) or other.

In your alias:
|strace -tt -o /tmp/db_capture. /usr/bin/db_capture

(I think you need to specify four  to get two, which
is what I want here, so that each /tmp/db_capture file
is tagged with the process ID).

Wietse


 Thanks again,
 Wendy
 
 On Thu, Jan 14, 2010 at 4:13 PM, Wietse Venema wie...@porcupine.org wrote:
  Wendigo Thompson:
  Wietse:
 
  ? I know who you are -- big fan of yours ever since tcpwrappers :-)
 
  Hah!
 
  (As for my claim, of course other people also wrote parts of
  Postfix, in particular Victor has been a major contributor).
 
  ? The only reason I wrote in is that this queueing of mail issue
  appeared fairly recently, and other times when I have had to collect
  large volumes of mail from this client my system has been fine. ?Below
  I have pasted some logs that show long delivery times (sitting in the
  queue for a day) but the last delay (delivery delay?) is in the
  upper-teens. ?Just trying to reconcile why this is: if it's something
  on my end that is fine and I can work it out, but maybe it is an
  unforeseen issue with Postfix and large volumes of piped mail. ?The
  system itself is a Nehalem Xserve, so it seems to have plenty of
  horsepower to handle the load.
 
  (anonymized a bit in hostname and email address)
  Jan 14 15:45:43 server001 postfix/local[24570]: 26E371265ED:
  to=receiver_em...@receiver_ip, relay=local, delay=81352,
  delays=0.59/81343/0/7.7, dsn=2.0.0, status=sent (delivered to command:
  /usr/bin/db_capture)
 
  This looks problematic: it took 7.7s from the time that the Postfix
  local delivery agent was tasked to deliver the message, to the time
  that the db_capture command reported completion.
 
  Plus, these messages had already been sitting in the queue for
  almost 24 hours, so the problem will have started much earlier than
  that.
 
  delays=0.5/81344/0/8.2, dsn=2.0.0, ...
  delays=0.34/81341/0/13, dsn=2.0.0, ...
  delays=0.41/81344/0/10, dsn=2.0.0, ...
  delays=0/81339/0/19, dsn=2.0.0, ...
  delays=0.42/81342/0/14, dsn=2.0.0, ...
  delays=0.19/81344/0/11, dsn=2.0.0, ...
  delays=0/81341/0/15, dsn=2.0.0, ...
  delays=0/81337/0/21, dsn=2.0.0, ...
  delays=0.57/81343/0/11, dsn=2.0.0, ...
  delays=0.32/81342/0/16, dsn=2.0.0, ...
  delays=0.21/81341/0/18, dsn=2.0.0, ...
  delays=0.35/81348/0/7, dsn=2.0.0, ...
  delays=0.12/81339/0/21, dsn=2.0.0, ...
  delays=0.09/81340/0/20, dsn=2.0.0, ...
 
  Now the question is why did /usr/bin/db_capture take so long? I am
  not sure that my red and green options are going to solve your
  performance problem.
 
  If you still have the logs, it will be worthwhile to find out when
  the delivery delay started climbing through the roof.
 
  ? ? ? ?Wietse
 
 
 



Re: Speeding up Local Delivery

2010-01-14 Thread Wendigo Thompson
Okay, I have some updates.  Reducing the concurrency down (and
ultimately to 1) increased performance: it seems the delays I was
seeing were related to concurrency inside SQL.  However, I am still
seeing half second delays in local delivery while my insertion
application is showing ~.003s processing times.  I was able to verify
that it's not my insertion application by making a simple script that
soft-bounces mail (exit code 75) and does nothing else -- there's
still a half a second delay on action.  Tracing the individual
processes is proving to be a challenge as this is an Apple system and
dtruss can only be run as root.  Any tips?

Jan 14 18:30:58 x postfix/local[27253]: 03B2A13E351: to=xxx,
relay=local, delay=40172, delays=37592/2580/0/0.51, dsn=4.3.0,
status=deferred (temporary failure)
Jan 14 18:30:59 x postfix/local[27253]: 03B31132541: to=xxx,
relay=local, delay=70394, delays=67813/2581/0/0.53, dsn=4.3.0,
status=deferred (temporary failure)
Jan 14 18:30:59 x postfix/local[27253]: 03B371416A7: to=xxx,
relay=local, delay=36726, delays=34144/2581/0/0.53, dsn=4.3.0,
status=deferred (temporary failure)
Jan 14 18:31:00 x postfix/local[27253]: 03B3714401A: to=xxx,
relay=local, delay=34562, delays=31980/2582/0/0.54, dsn=4.3.0,
status=deferred (temporary failure)


On Thu, Jan 14, 2010 at 4:47 PM, Wietse Venema wie...@porcupine.org wrote:
 Wendigo Thompson:
 Wietse:

   Sadly I generally don't keep logs at this granular of a level
 (mail.crit is what I usually store forever) but I can turn logging
 back on after the queue empties over the weekend and see if the delay
 is persisting.  I know the command itself is completing quickly -- it
 seems like something in between is causing a delay.  Do you have any
 suggestions for how I could profile the handoff from local to the
 executable?

 The fourth delays timer, which show 7.7s delays and more,
 stops as soon as your /usr/bin/db_capture terminates.

 Perhaps it is doing hostname lookup and the resolv.conf
 or nsswitch.conf files are out of whack. Perhaps it is
 taking forever to log in.

 To investigate, configure Postfix to run it under
 strace (Linux) truss (Solaris) or other.

 In your alias:
    |strace -tt -o /tmp/db_capture. /usr/bin/db_capture

 (I think you need to specify four  to get two, which
 is what I want here, so that each /tmp/db_capture file
 is tagged with the process ID).

        Wietse


 Thanks again,
 Wendy

 On Thu, Jan 14, 2010 at 4:13 PM, Wietse Venema wie...@porcupine.org wrote:
  Wendigo Thompson:
  Wietse:
 
  ? I know who you are -- big fan of yours ever since tcpwrappers :-)
 
  Hah!
 
  (As for my claim, of course other people also wrote parts of
  Postfix, in particular Victor has been a major contributor).
 
  ? The only reason I wrote in is that this queueing of mail issue
  appeared fairly recently, and other times when I have had to collect
  large volumes of mail from this client my system has been fine. ?Below
  I have pasted some logs that show long delivery times (sitting in the
  queue for a day) but the last delay (delivery delay?) is in the
  upper-teens. ?Just trying to reconcile why this is: if it's something
  on my end that is fine and I can work it out, but maybe it is an
  unforeseen issue with Postfix and large volumes of piped mail. ?The
  system itself is a Nehalem Xserve, so it seems to have plenty of
  horsepower to handle the load.
 
  (anonymized a bit in hostname and email address)
  Jan 14 15:45:43 server001 postfix/local[24570]: 26E371265ED:
  to=receiver_em...@receiver_ip, relay=local, delay=81352,
  delays=0.59/81343/0/7.7, dsn=2.0.0, status=sent (delivered to command:
  /usr/bin/db_capture)
 
  This looks problematic: it took 7.7s from the time that the Postfix
  local delivery agent was tasked to deliver the message, to the time
  that the db_capture command reported completion.
 
  Plus, these messages had already been sitting in the queue for
  almost 24 hours, so the problem will have started much earlier than
  that.
 
  delays=0.5/81344/0/8.2, dsn=2.0.0, ...
  delays=0.34/81341/0/13, dsn=2.0.0, ...
  delays=0.41/81344/0/10, dsn=2.0.0, ...
  delays=0/81339/0/19, dsn=2.0.0, ...
  delays=0.42/81342/0/14, dsn=2.0.0, ...
  delays=0.19/81344/0/11, dsn=2.0.0, ...
  delays=0/81341/0/15, dsn=2.0.0, ...
  delays=0/81337/0/21, dsn=2.0.0, ...
  delays=0.57/81343/0/11, dsn=2.0.0, ...
  delays=0.32/81342/0/16, dsn=2.0.0, ...
  delays=0.21/81341/0/18, dsn=2.0.0, ...
  delays=0.35/81348/0/7, dsn=2.0.0, ...
  delays=0.12/81339/0/21, dsn=2.0.0, ...
  delays=0.09/81340/0/20, dsn=2.0.0, ...
 
  Now the question is why did /usr/bin/db_capture take so long? I am
  not sure that my red and green options are going to solve your
  performance problem.
 
  If you still have the logs, it will be worthwhile to find out when
  the delivery delay started climbing through the roof.
 
  ? ? ? ?Wietse
 






Re: Speeding up Local Delivery

2010-01-14 Thread Stan Hoeppner
Wendigo Thompson put forth on 1/14/2010 6:39 PM:
 Okay, I have some updates.  Reducing the concurrency down (and
 ultimately to 1) increased performance: it seems the delays I was
 seeing were related to concurrency inside SQL.  However, I am still
 seeing half second delays in local delivery while my insertion
 application is showing ~.003s processing times.  I was able to verify
 that it's not my insertion application by making a simple script that
 soft-bounces mail (exit code 75) and does nothing else -- there's
 still a half a second delay on action.  Tracing the individual
 processes is proving to be a challenge as this is an Apple system and
 dtruss can only be run as root.  Any tips?

Not specifically.  I was just going over the numbers, and with your previous
configuration, before the current tweaking, the db process was consuming each
email in about 10-20 seconds, which is extremely long but for large PDFs, Word
docs, or binaries.  Lets say the average was 10 seconds.  If concurrency was 1,
that comes out to ~34 days to consume 300,000 emails, if my math is correct, at
86,400 seconds per day.  Obviously your concurrency was a little higher than 1. 
;)

 Jan 14 18:30:58 x postfix/local[27253]: 03B2A13E351: to=xxx,
 relay=local, delay=40172, delays=37592/2580/0/0.51, dsn=4.3.0,
 status=deferred (temporary failure)
 Jan 14 18:30:59 x postfix/local[27253]: 03B31132541: to=xxx,
 relay=local, delay=70394, delays=67813/2581/0/0.53, dsn=4.3.0,
 status=deferred (temporary failure)
 Jan 14 18:30:59 x postfix/local[27253]: 03B371416A7: to=xxx,
 relay=local, delay=36726, delays=34144/2581/0/0.53, dsn=4.3.0,
 status=deferred (temporary failure)
 Jan 14 18:31:00 x postfix/local[27253]: 03B3714401A: to=xxx,
 relay=local, delay=34562, delays=31980/2582/0/0.54, dsn=4.3.0,
 status=deferred (temporary failure)

It appears you've now dropped the db process consumption time down from ~10
seconds per email to about .5 seconds per email, a speedup of ~20:1.  All things
being equal, we could assume you'll start seeing messages sitting in the queue
for a little over an hour now instead of a day, once things settle down.  This
is still pretty high.

Due you have the postfix queue and the db files spread across more than one
spindle?  I ask because the xServe doesn't have an inbuilt RAID controller
(optional card), and only has 3 disk bays for slow (7200rpm) SATA disks.

The db insertion time seems to hint that disk throughput may not be an issue for
the db subsystem.  If you have the db files sitting on fast external RAID or SAN
storage, but have the postfix queue on the single system boot disk along with a
page file partition, etc, this may explain why you're sync'ing so quickly to db
yet it takes so long to pipe the emails to the db insertion app.

Your queue disk may not be fast enough for the load being generated on it.  This
one disk is trying to pump ~600,000 messages/day (in+out), and most of that will
peak during two or three 1-2 hour intervals during the business day--arrival,
after lunch, end of day.

-- 
Stan


Re: Speeding up Local Delivery

2010-01-14 Thread Wietse Venema
Wendigo Thompson:
 Okay, I have some updates.  Reducing the concurrency down (and
 ultimately to 1) increased performance: it seems the delays I was
 seeing were related to concurrency inside SQL.  However, I am still
 seeing half second delays in local delivery while my insertion
 application is showing ~.003s processing times.  I was able to verify
 that it's not my insertion application by making a simple script that
 soft-bounces mail (exit code 75) and does nothing else -- there's
 still a half a second delay on action.  Tracing the individual
 processes is proving to be a challenge as this is an Apple system and
 dtruss can only be run as root.  Any tips?

It is perhaps good to be aware that my support is limited to Postfix.
It ends once Postfix hands off the message to another program, be it
over a network or on the same machine.

You need to determine why /usr/bin/db_capture is taking a long time.

Run /usr/bin/db_capture by hand, with the same command-line arguments
as when it is invoked by Postfix, and with a prepared email message
from file.

/usr/bin/db_capture arguments  prepared-message-file

Where the prepared-message-file has the form that Postfix emits:

From u...@example.com Fri Jan  1 05:12:42 2010
From: u...@example.com
To: some...@somewhere
Subject: blah blah

More blah.
More blah.
More blah.

You can use any debug tool that you like. Good luck.

Wietse


Re: Speeding up Local Delivery

2010-01-13 Thread Victor Duchovni
On Wed, Jan 13, 2010 at 12:35:19AM -0600, Wendigo Thompson wrote:

 Postfix accepts mail from the
 corporate mail server and delivers the message via a pipe alias to an
 application that is then inserting the message into the database.

Your choice of delivery mechanism is unfortunate. It is far better
to deliver to a persistent daemon that talks SMTP or LMTP, you get
much lower latency, I/O and CPU impact.

Don't use local(8) for high volume traffic to a single mailbox, or
to fork/exec scripts.

-- 
Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the Reply-To header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
mailto:majord...@postfix.org?body=unsubscribe%20postfix-users

If my response solves your problem, the best way to thank me is to not
send an it worked, thanks follow-up. If you must respond, please put
It worked, thanks in the Subject so I can delete these quickly.


Speeding up Local Delivery

2010-01-12 Thread Wendigo Thompson
Hello:

  I maintain for a large client a Postfix/MySQL installation that
archives certain messages required to be held for compliance with
lawsuits and employment litigation.  Postfix accepts mail from the
corporate mail server and delivers the message via a pipe alias to an
application that is then inserting the message into the database.
Recently for a large litigation matter, all mail at this company (300k
messages/day) was required to be held.  The server is mostly doing
fine, but I have noticed that during peak delivery times Postfix will
queue large amounts of mail and be slow at delivering it to the
script/database -- sometimes queueing mail all the way to the weekend,
when things are less busy.  I've watched the time it takes from
Postfix opening the application and sending the data and each message
is taking ~0.003 seconds.  It seems that if I could coax Postfix into
sending more messages to that pipe I wouldn't have queueing issues
during work days.

  I've tuned the concurrency_limits from their defaults upward in
steps of 5 (up to 36 now) without much luck -- I only ever seem to be
able to achieve ~5 deliveries per second. I'd appreciate any guidance
the gurus have to offer.  FWIW, it's 2.4.3 on Mac OS X Server 10.5.

Thanks,
Wendy