Wendigo Thompson:
> Wietse:
> 
>   Sadly I generally don't keep logs at this granular of a level
> (mail.crit is what I usually store forever) but I can turn logging
> back on after the queue empties over the weekend and see if the delay
> is persisting.  I know the command itself is completing quickly -- it
> seems like something in between is causing a delay.  Do you have any
> suggestions for how I could profile the handoff from local to the
> executable?

The fourth "delays" timer, which show 7.7s delays and more,
stops as soon as your /usr/bin/db_capture terminates.

Perhaps it is doing hostname lookup and the resolv.conf
or nsswitch.conf files are out of whack. Perhaps it is
taking forever to log in.

To investigate, configure Postfix to run it under
strace (Linux) truss (Solaris) or other.

In your alias:
    "|strace -tt -o /tmp/db_capture.$$$$ /usr/bin/db_capture"

(I think you need to specify four $$$$ to get two, which
is what I want here, so that each /tmp/db_capture file
is tagged with the process ID).

        Wietse


> Thanks again,
> Wendy
> 
> On Thu, Jan 14, 2010 at 4:13 PM, Wietse Venema <wie...@porcupine.org> wrote:
> > Wendigo Thompson:
> >> Wietse:
> >>
> >> ? I know who you are -- big fan of yours ever since tcpwrappers :-)
> >
> > Hah!
> >
> > (As for my claim, of course other people also wrote parts of
> > Postfix, in particular Victor has been a major contributor).
> >
> >> ? The only reason I wrote in is that this queueing of mail issue
> >> appeared fairly recently, and other times when I have had to collect
> >> large volumes of mail from this client my system has been fine. ?Below
> >> I have pasted some logs that show long delivery times (sitting in the
> >> queue for a day) but the last delay (delivery delay?) is in the
> >> upper-teens. ?Just trying to reconcile why this is: if it's something
> >> on my end that is fine and I can work it out, but maybe it is an
> >> unforeseen issue with Postfix and large volumes of piped mail. ?The
> >> system itself is a Nehalem Xserve, so it seems to have plenty of
> >> horsepower to handle the load.
> >>
> >> (anonymized a bit in hostname and email address)
> >> Jan 14 15:45:43 server001 postfix/local[24570]: 26E371265ED:
> >> to=<receiver_em...@receiver_ip>, relay=local, delay=81352,
> >> delays=0.59/81343/0/7.7, dsn=2.0.0, status=sent (delivered to command:
> >> /usr/bin/db_capture)
> >
> > This looks problematic: it took 7.7s from the time that the Postfix
> > local delivery agent was tasked to deliver the message, to the time
> > that the db_capture command reported completion.
> >
> > Plus, these messages had already been sitting in the queue for
> > almost 24 hours, so the problem will have started much earlier than
> > that.
> >
> >> delays=0.5/81344/0/8.2, dsn=2.0.0, ...
> >> delays=0.34/81341/0/13, dsn=2.0.0, ...
> >> delays=0.41/81344/0/10, dsn=2.0.0, ...
> >> delays=0/81339/0/19, dsn=2.0.0, ...
> >> delays=0.42/81342/0/14, dsn=2.0.0, ...
> >> delays=0.19/81344/0/11, dsn=2.0.0, ...
> >> delays=0/81341/0/15, dsn=2.0.0, ...
> >> delays=0/81337/0/21, dsn=2.0.0, ...
> >> delays=0.57/81343/0/11, dsn=2.0.0, ...
> >> delays=0.32/81342/0/16, dsn=2.0.0, ...
> >> delays=0.21/81341/0/18, dsn=2.0.0, ...
> >> delays=0.35/81348/0/7, dsn=2.0.0, ...
> >> delays=0.12/81339/0/21, dsn=2.0.0, ...
> >> delays=0.09/81340/0/20, dsn=2.0.0, ...
> >
> > Now the question is why did /usr/bin/db_capture take so long? I am
> > not sure that my "red" and "green" options are going to solve your
> > performance problem.
> >
> > If you still have the logs, it will be worthwhile to find out when
> > the delivery delay started climbing through the roof.
> >
> > ? ? ? ?Wietse
> >
> 
> 

Reply via email to