Viktor Dukhovni:
> On Mon, May 13, 2013 at 06:55:12AM -0400, Wietse Venema wrote:
>
> > Viktor Dukhovni:
> > > The reasonable response to latency spikes is creating concurrency
> > > spikes.
> >
> > By design, Postfix MUST be able to run in a fixed resource budget.
> > Your on-demand concurrency spikes break this principle and will
> > result in unexpected resource exhaustion.
>
> No, there are two different process limits one for non-slow deliveries,

No. It is a mistake to have an overload resource budget that is
different for different kinds of overload. This is fundamental to
the design of Postfix. Resources are not just memory but also file
handles, protocol blocks, process slots, and so on.

The overload resource budget must be easy to validate with tools
like smtp-source/sink and the like: just max out the number of
connections and verify that things don't come crashing down (I have
to admit that postscreen(8) complicates this a little; one may have
to disable its cache temporarily to perform full validation).

If the overload resource budget depends on overload context, then
it becomes non-trivial to validate, and one introduces an unexpected
failure mode into Postfix.

For example, I don't need an MTA that works under all conditions
as long as the network is up, but that comes crashing down when the
network hiccups for a minute, just because the MTA decides to run
6x the number of SMTP client processes (your example).

Instead of introducing a context-dependent overload resource budget,
I have a proposal that addresses the real problem (slow or
non-responding DNS and SMTP servers) and that requires no changes
to qmgr(8) or master(8), and minor changes to smtp(8).

If we want to address the real problem: slow or non-responding DNS
and SMTP servers, then we should not waste an entire SMTP client
process blocking on DNS lookup and TCP connection handshake in the
first place.

Instead it is more efficient to interpose a prescreen(8) process
between the qmgr(8) and smtp(8) processes. This process can look
up DNS, create the initial TCP connection, peek() at the remote
server greeting, and keep the bogons away without wasting any smtp(8)
processes. Just like postscreen(8) can keep bogus SMTP clients away
without wasting smtpd(8) processes.
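
To make the prescreen(8) idea a little more concrete, here is a
minimal Python sketch of the per-destination check described above.
It is an illustration only, not Postfix code: the function name
prescreen(), its parameters, and the 512-byte peek size are all
assumptions made for this example. It resolves the name, opens the
TCP connection under a time limit, and peeks at the greeting with
MSG_PEEK so that the bytes stay in the socket buffer for whatever
process would take over the connection.

```python
import socket


def prescreen(host, port=25, timeout=5.0):
    """Hypothetical pre-check for one destination.

    Returns (sock, greeting) when the server answers with a 220
    greeting within the time limit, otherwise None.
    """
    try:
        # create_connection() does the name lookup and tries each
        # address in turn. Note: the timeout bounds each connect
        # attempt, not the DNS lookup itself.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Lookup failure, refused connection, or connect timeout.
        return None

    try:
        # MSG_PEEK returns the data without removing it from the
        # receive queue, so a later reader still sees the greeting.
        greeting = sock.recv(512, socket.MSG_PEEK)
    except OSError:
        # The server accepted the connection but never greeted us.
        sock.close()
        return None

    if not greeting.startswith(b"220"):
        # Empty read (peer closed) or a non-220 reply such as 554.
        sock.close()
        return None

    return sock, greeting
```

A real implementation would handle many destinations in one
event-driven process, the way postscreen(8) handles many clients,
and would pass the open connection on to smtp(8). The sketch shows
only the check for a single destination.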

In the meantime, "stress"-like configuration can be a short-term
solution to temporarily tighten timing and other limits, and to
relax those limits when conditions return to normal (Patrik's
scenario did not concern persistent overload).

And yes, it would be a good idea to have an option to control the
amount of time that smtpd(8) and other Postfix components spend on
DNS lookups.

	Wietse