On Tue, May 14, 2013 at 08:24:16AM -0400, Wietse Venema wrote:

> Viktor Dukhovni:
> > Nothing I'm proposing creates less opportunity for delivery of new
> > mail, rather I'm proposing dynamic (up to a limit) higher concurrency
> > that soaks up a bounded amount of high latency traffic (ideally
> > all of it most of the time).
>
> This is no better than having a static process limit at that larger
> maximum. Your on-demand additional process slots cannot prevent
> slow mail from using up all delivery agents.
The difference is that the static larger maximum does not prevent a
thundering herd of fast deliveries from using the high limit to thrash
the network link and process scheduler.

> To prevent slow mail from using up all delivery agents, one needs
> to limit the amount of slow mail in the active queue. Once a message
> is in the active queue the queue manager has no choice. It has to
> be delivered ASAP.

My goal was not to prevent congestion under all conditions; that is
simply not possible. Once some heuristically identified mail is
substantially delayed, we've already lost, since the proposed
heuristics are rather crude.

I am proposing a means of having sustainably higher process limits
without thrashing. The higher process limits substantially reduce
steady-state congestion frequency. As you said, we don't need
perfection. Simply raising the limits is a bit problematic when the
slow path is in fact full of fast mail.

> How do we limit the amount of slow mail in the active queue?

I would prefer to process it at higher concurrency, to the extent
possible, maintaining reasonable throughput even for the plausibly
slow mail, unless our predictors become much more precise.

> That requires prediction. We seem to agree that once mail has been
> deferred a few times, it is likely to be deferred again. We have one
> other predictor: the built-in dead-site list. That's it as far as
> I know.

Provided the reason is an unreachable destination, and not a deferred
transport or a certificate expiration (any fast repeated deferral via
local policy, ...).

> As for after-the-fact detection, it does not help if a process
> informs the master dynamically that it is blocked. That is too
> late to prevent slow mail from using up all delivery agents,
> regardless of whether the process limit is dynamically increased
> up to some maximum, or whether it is frozen at that same inflated
> maximum.

The above is a misreading of intent. It does help: it enables safe
support for higher concurrency levels, which modern hardware and O/S
combinations can easily handle.

> [detailed analysis]
>
> Thanks. This underscores that longer maximal_backoff_time can be
> beneficial, by reducing the number of times that a delayed message
> visits the active queue. This reflects a simple heuristic: once
> mail has been deferred a few times, it is likely to be deferred
> again.

That, plus, for many sites, a not-too-aggressively reduced queue
lifetime. Often an email delayed for more than 1 or 2 days is
effectively too late; with a bounce, the sender can resend to a better
address or try another means to reach the recipient. I found 2 days
rather than 5 to be largely beneficial, with no complaints of lost
mail because some site was down for ~3-4 days.
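For concreteness, that tuning maps to main.cf parameters roughly as
follows (a sketch only: the 2-day lifetime is the value discussed
above, the longer backoff is an illustrative figure, and the process
limit is the stock default shown for context):

    # A longer backoff keeps deferred mail out of the active queue
    # between retries, so it competes less often with fresh mail
    # (illustrative value; the stock default is 4000s).
    maximal_backoff_time = 8000s

    # Return undeliverable mail after 2 days instead of the stock 5,
    # so the sender can still act on the bounce.
    maximal_queue_lifetime = 2d

    # Per-transport delivery agent concurrency is bounded by this
    # (or by the maxproc column in master.cf); the question above is
    # how high it can safely be pushed.
    default_process_limit = 100

-- Viktor.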