On 11.5.2013 15:00, Wietse Venema wrote:

Some time ago I was setting up yet another postfix deployment, and I was
once again thinking about the case when (temporarily) undeliverable
recipients block most or all of the available delivery agents.

Low-level comments:

- What common use case has different per-recipient (not: per-sender,
etc.) soft reject rates for a mail stream between two sites? Does
it matter whether some portion of a mail stream between two sites
is deferred because of the recipient, sender or other cause?

The use case I am interested in is a service sending registration confirmation messages to its users, where some users fill in bogus addresses that produce temporary errors until the message expires and bounces. Such messages tend to pile up in the deferred queue, can come to dominate the active queue, and adversely affect deliveries to proper recipients, especially when these bogus recipients are deferred not immediately but only after a considerably long timeout.

I don't think the other standard deferral scenarios, such as the network connection really going down temporarily, would benefit much from the separate transport I propose. OTOH, it shouldn't affect them terribly either.

And as for the deferral cause, I think that from the point of view of blocked delivery agents it doesn't really matter why exactly the message was deferred. In my experience, if something was deferred, it is more likely to be deferred again than not, which seems reason enough to make sure it doesn't take resources away from fresh deliveries.

However, I would very much like to hear what other people think and how it might affect their setups...

- Postfix has multiple transports configured: the required ones
such as local_transport (local), default_transport (smtp),
relay_transport (relay), plus the ones that aren't selected with
$local/virtual/relay/default_transport such as retry, error, and
ad-hoc transports. It would be wrong to hard-code the "alternate
retry transport" feature to just one transport. Even if it were
used only with smtp-like transports, there may be more than one.

Definitely. In the example I gave, the smtp_retry_transport was meant to work as the usual per-transport <transport>_retry_transport fallback.

And I would not mind if you want to call it differently either, to prevent any possible confusion with the retry transport based on the error delivery agent.

- Postfix tries to play "nice" by not overwhelming remote servers
with many connections. This is scheduled per transport, not across
transports. I'm not claiming that Postfix concurrency scheduling
solves all problems, but having two transports sending to the same
destination would complicate things a little (but no more than
having sender-dependent source IP addresses).

Hmm, right. I hadn't considered this explicitly. However:

- the impact on the target site doesn't seem to be any worse than if the fallback_relay feature were used to deal with the problem in the first place.

- the concurrency window limit of the alternate transport can be explicitly configured to be small, which should minimize the difference in load on the target site.

- in the usual case when the recipients are deferred again, the concurrency window wouldn't even grow in the first place. (In my use case there is often even no proper site to connect to as such, but that's not an argument, of course).

- the current cohort-based concurrency algorithm should play nicely even with an unknown number of other hosts connecting to the target site, so it should work reasonably well with two separate transports each trying on their own as well.

I am eager to hear what Victor has to say about this one, though. He has a lot of experience with problematic sites that need small concurrency windows, from what I remember.

Configuration wise, it might work like this:

- in master.cf, clone the smtp transport, call it "slow" for example
- in main.cf, set smtp_retry_transport = slow

I forgot to explicitly mention that the parameters of the "slow" transport can be configured independently of the original smtp transport, but I guess that's obvious.
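
To make that concrete, here is a rough sketch of what the configuration might look like. Note that smtp_retry_transport is only the parameter proposed in this thread and does not exist in Postfix; the "slow" name and the concurrency values are likewise just illustrative assumptions. The slow_* overrides use the existing per-transport parameter naming.

```
# master.cf: clone of the smtp entry under another name ("slow" is arbitrary)
slow      unix  -       -       n       -       -       smtp

# main.cf
# PROPOSED parameter, not implemented in Postfix:
smtp_retry_transport = slow
# Existing per-transport overrides; the values here are made up:
slow_destination_concurrency_limit = 2
slow_initial_destination_concurrency = 1
```

Keeping the window of the "slow" transport small is what should limit the extra load on the target site.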

Implementation wise, the following changes would be necessary:

- when creating message structure, qmgr would need to keep track
   if it came from the incoming or deferred queue
- in qmgr_message_resolve(), just before looking up the transport,
   when the message originates from deferred queue, qmgr would
   replace the transport name in the reply with the configured retry
   variant if it is defined and such transport exists.

Hmm, or should this perhaps be made part of the trivial-rewrite resolving instead? Which one would you prefer?
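
For the qmgr-side variant, the substitution step could look roughly like the sketch below. This is standalone C and not actual Postfix code: queue_origin, retry_map, transport_exists() and select_transport() are all hypothetical names, and the fixed list of known transports merely stands in for whatever lookup qmgr would really do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: which queue the message was picked up from. */
enum queue_origin { ORIGIN_INCOMING, ORIGIN_DEFERRED };

/* Hypothetical: one <transport>_retry_transport setting. */
struct retry_map {
    const char *transport;          /* e.g. "smtp" */
    const char *retry_transport;    /* e.g. "slow", or NULL if unset */
};

/* Stand-in for "such transport exists"; a real qmgr would consult
 * its own transport table instead of this fixed list. */
static const char *known_transports[] = { "smtp", "slow", "local", NULL };

static int transport_exists(const char *name)
{
    size_t i;

    for (i = 0; known_transports[i] != NULL; i++)
        if (strcmp(known_transports[i], name) == 0)
            return 1;
    return 0;
}

/* Return the transport to use for one resolved recipient. Only messages
 * from the deferred queue are redirected, and only when a retry variant
 * is configured and exists; otherwise the resolved name is kept. */
const char *select_transport(enum queue_origin origin, const char *resolved,
                             const struct retry_map *maps, size_t nmaps)
{
    size_t i;

    if (origin != ORIGIN_DEFERRED)
        return resolved;
    for (i = 0; i < nmaps; i++) {
        if (strcmp(maps[i].transport, resolved) == 0
            && maps[i].retry_transport != NULL
            && transport_exists(maps[i].retry_transport))
            return maps[i].retry_transport;
    }
    return resolved;
}
```

Falling back to the resolved name when the retry variant is missing means a typo in the configuration degrades to today's behaviour instead of breaking delivery.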

Patrik
