On Mon, May 13, 2013 at 09:10:13AM -0400, Wietse Venema wrote:

> > No, there are two different process limits one for non-slow deliveries,
>
> No. It is a mistake to have an overload resource budget that is
> different for different kinds of overload. This is fundamental to
> the design of Postfix. Resources are not just memory but also file
> handles, protocol blocks, process slots, and so on.

This is right in principle, not so much in practice. Were Postfix
delivery agent concurrency tuned to the limit of local system
resources, one should indeed be careful about overload, but this too
is easy to test: just raise the process limit to the combined ceiling
before testing.

In practice the smtp(8) process limit is far below the system
resource limit. The reason I don't configure 10,000 delivery agents
is not lack of RAM or kernel resources. My $300 Asus laptop has 4GB
of RAM. Typically it is unwise to run even 1,000 parallel deliveries,
because the network delays would be unfortunate. However, 1,000
parallel blocked delivery agents are not unreasonable, and I can test
at that load level if I am worried about resource limits.

> The overload resource budget must be easy to validate with tools
> like smtp-source/sink and the like: just max out the number of
> connections and verify that things don't come crashing down (I have
> to admit that postscreen(8) complicates this a little; one may have
> to disable its cache temporarily to perform full validation).

Or just by knowing that 1,000 processes is an easy fit.

> Instead of introducing a context-dependent overload resource budget,
> I have a proposal that addresses the real problem (slow or
> non-responding DNS and SMTP servers) and that requires no changes
> to qmgr(8) or master(8), and minor changes to smtp(8).
>
> If we want to address the real problem: slow or non-responding DNS
> and SMTP servers, then we should not waste an entire SMTP client
> process blocking on DNS lookup and TCP connection handshake in the
> first place. Instead it is more efficient to interpose a prescreen(8)
> process between the qmgr(8) and smtp(8) processes. This process
> can look up DNS, create the initial TCP connection, peek() at the
> remote server greeting, and keep the bogons away without wasting
> any smtp(8) processes. Just like postscreen(8) can keep bogus SMTP
> clients away without wasting smtpd(8) processes.

Sadly the smtp(8) delivery agent makes multiple connections, supports
fallback destinations, has SASL- and TLS-dependent connection cache
re-use barriers, ... The high latency can happen on a second
connection after a fast 4XX with the first MX host, ...

A prescreen would be very difficult to implement. The kernel
resources of prescreen (socket control blocks, ...) would still need
to be commensurate with the various smtp(8) processes I proposed.

Stress-dependent timers could be more realistic if we can get DNS
under control; that may need a new client library (ldns or similar).
I am wary of aggressively low client timeouts: we could end up
treading water by timing out over and over when waiting a bit longer
would get the mail through.

Finally, the original proposal of parallel transports at least
doubles the process concurrency (Patrick would probably tune the slow
path with a high process limit). The same objections apply even more
strongly there, since we may send fast mail down the slow path and
stress the system even more.

All I'm doing is allocating slow path processes on the fly, by doing
it when delivery is actually slow. Think of this as 2 master.cf
entries in one. You don't object to users adding master.cf entries,
so there's little reason to object to implicit ones.

--
Viktor.
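
To make the "2 master.cf entries in one" comparison concrete, here is a rough sketch of what the explicit equivalent might look like if an administrator added the second entry by hand. The service name "slow" and both process limits are assumptions chosen for illustration (1,000 echoes the blocked-agent figure discussed above); nothing here is taken from an actual patch.

```
# Hypothetical master.cf fragment; service name and limits are illustrative.
# service type  private unpriv  chroot  wakeup  maxproc command + args
smtp      unix  -       -       n       -       100     smtp
slow      unix  -       -       n       -       1000    smtp
```

With explicit entries, mail has to be routed to "slow" ahead of time, so fast mail can end up on the slow path. The implicit variant described above differs in that a delivery would only count against the larger limit once it has actually turned out to be slow.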