Re: Enabling stress detection by default
Daniel V. Reinhardt:
> Could a notification alert be sent via SMS or another means to the administrator of the post server in question, stating something is wrong with the server?

This could be implemented by configuring a logfile monitoring program (swatch, logsurfer, etc.) to send notifications when it sees logging with the following format:

    smtpd[%d]: warning: service %s (%s) has reached its process limit %d: new clients may experience noticeable delays

Wietse
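For readers without swatch or logsurfer at hand, the matching step could be sketched in Python as below. This is only an illustration of recognizing the format quoted above, not a replacement for a real log monitor. The function name, the group names, and the sample log lines are made up for the example. Tolerating optional double quotes around the service name and the limit is also an assumption, made because some log output quotes those fields. What to do on a match (SMS, mail, pager) is left to the admin.

```python
import re

# Regex derived from the printf-style format quoted in the message above.
# Group names are labels chosen for this sketch. The optional double quotes
# are an assumption, not part of the quoted format.
LIMIT_WARNING = re.compile(
    r'\[(?P<pid>\d+)\]: warning: service "?(?P<service>[^"\s]+)"? '
    r'\((?P<endpoint>[^)]*)\) has reached its process limit '
    r'"?(?P<limit>\d+)"?: new clients may experience noticeable delays'
)


def find_limit_warnings(lines):
    """Yield (service, limit) for every line that matches the warning."""
    for line in lines:
        match = LIMIT_WARNING.search(line)
        if match:
            yield match.group("service"), int(match.group("limit"))


# Hypothetical sample input, written by hand for this example.
sample = [
    "Feb  6 13:40:01 mx smtpd[1234]: warning: service smtp (25) has reached "
    "its process limit 100: new clients may experience noticeable delays",
    "Feb  6 13:40:02 mx smtpd[1235]: connect from unknown[192.0.2.1]",
]

for service, limit in find_limit_warnings(sample):
    # A real deployment would hand this off to a notification mechanism.
    print(f"service {service} hit its process limit of {limit}")
```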
Re: Enabling stress detection by default
Daniel V. Reinhardt wrote:
> Could a notification alert be sent via SMS or another means to the administrator of the post server in question, stating something is wrong with the server?

The discussion is about proposed default Postfix behavior, so no, out-of-band notifications cannot be a default. The default behavior of Postfix is to put scary-sounding entries in the system log when something goes wrong. It's up to the admin to review the logs and/or configure third-party system monitoring software to watch for interesting events.

--
Noel Jones
Enabling stress detection by default
With Postfix 2.5 I introduced stress-dependent behavior in the SMTP server, but this was left turned off by default.

I'm thinking of turning on some stress-dependent behavior by default in Postfix 2.6, to make Postfix look better in stupid benchmarks (just like in_flow_delay and smtpd_client_connection_count_limit).

While I am reluctant to make the default SMTP server timeouts drastically shorter, I am inclined to ship Postfix 2.6 with settings that will be relatively safe under conditions of overload.

Something that will drastically cut the time per session:

    smtpd_timeout = ${stress?10s}${stress:300s}
    smtpd_hard_error_limit = ${stress?2}${stress:20}

This would affect clients that cannot send SMTP commands within 10s from each other (i.e. round-trip time 5s) or cannot send blocks of message content within 10s from each other (i.e. an effective bit rate of 1500 bits/second).

Keep in mind that we're doing this only when the server is saturated. By not receiving very slow mail, we create more opportunities to receive other mail that arrives in less time. This assumes that the condition is temporary, and that some bit rates will stay above 1500 bps.

Another issue is smtpd_timeout granularity. Currently it is the same for all SMTP commands, but some suggested it makes sense to distinguish between some of the SMTP stages.

Suppose we have two settings, one for the DATA stage where we have valid recipients, and one for the non-DATA stages. Then, the default settings would look like this:

    smtpd_timeout = ${stress?10s}${stress:300s}
    smtpd_data_timeout = $smtpd_timeout
    smtpd_non_data_timeout = $smtpd_timeout
    smtpd_hard_error_limit = ${stress?1}${stress:20}

I would not make different stress dependencies for _data_timeout and _non_data_timeout, but people would be welcome to use the extra rope and shoot themselves in the foot.

Wietse
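To make the conditional syntax above concrete: ${name?value} expands to value when the parameter name is non-empty, and ${name:value} expands to value when it is empty. The Python below is a minimal sketch of that rule only. It is not Postfix's parser: it handles neither nesting nor plain $name references, and the function and variable names are invented for the illustration.

```python
import re

# ${name?value} -> value when name is non-empty
# ${name:value} -> value when name is empty
_CONDITIONAL = re.compile(r"\$\{(\w+)([?:])([^}]*)\}")


def expand(expression, params):
    """Expand the two conditional forms against a dict of parameter values."""
    def substitute(match):
        name, operator, value = match.groups()
        is_set = bool(params.get(name, ""))
        if operator == "?":
            return value if is_set else ""
        return "" if is_set else value

    return _CONDITIONAL.sub(substitute, expression)


SMTPD_TIMEOUT = "${stress?10s}${stress:300s}"

print(expand(SMTPD_TIMEOUT, {"stress": "yes"}))  # prints 10s
print(expand(SMTPD_TIMEOUT, {"stress": ""}))     # prints 300s
```

Exactly one of the two halves survives for any given value of stress, which is why the pair behaves like an if/else.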
Re: Enabling stress detection by default
On Fri, Feb 06, 2009 at 01:37:41PM -0500, Wietse Venema wrote:
> smtpd_timeout = ${stress?10s}${stress:300s}
> smtpd_hard_error_limit = ${stress?2}${stress:20}

I guess disabling reverse DNS lookups under stress is too drastic. It would certainly not help folks with reject_unknown_client, even if implemented correctly as a transient (due to stress) lookup failure.

> Another issue is smtpd_timeout granularity. Currently it is the same for all SMTP commands, but some suggested it makes sense to distinguish between some of the SMTP stages.

I think I once suggested shorter timeouts outside the mail transaction (before MAIL FROM or after "."). This would prevent abuse of the MTA by software with poor connection caching strategies. If we limit it to just after ".", the shorter timeout could be on by default, even with no stress.

Did not envision short timeouts between MAIL and DATA, but that was long before -o stress.

--
Viktor.
Re: Enabling stress detection by default
Wietse Venema wrote:
> Something that will drastically cut the time per session:
>
> smtpd_timeout = ${stress?10s}${stress:300s}

I would be concerned about sites that are chronically short of smtpd processes with an inexperienced or inattentive admin. Maybe 20s~30s rather than 10s. That's still 10x or more better performance under stress, and 30s has been shown to be safe for everyday use.

> smtpd_hard_error_limit = ${stress?2}${stress:20}

Yes. Or stress?1. Whatever...

> Suppose we have two settings, one for the DATA stage where we have valid recipients, and one for the non-DATA stages. Then, the default settings would look like this:
>
> smtpd_timeout = ${stress?10s}${stress:300s}
> smtpd_data_timeout = $smtpd_timeout
> smtpd_non_data_timeout = $smtpd_timeout

This sounds appealing, but I don't have the information to know if different timeouts would make much real-world difference.

Victor Duchovni wrote:
> I guess disabling reverse DNS lookups under stress is too drastic. It would certainly not help folks with reject_unknown_client, even if implemented correctly as a transient (due to stress) lookup failure.

Too many people rely on name-based whitelists and blacklists. Such behavior would be quite surprising for an out-of-the-box install. But I think you know that already.

--
Noel Jones
Re: Enabling stress detection by default
Noel Jones:
> Wietse Venema wrote:
>> Something that will drastically cut the time per session:
>>
>> smtpd_timeout = ${stress?10s}${stress:300s}
>
> I would be concerned about sites that are chronically short of smtpd processes with an inexperienced or inattentive admin. Maybe 20s~30s rather than 10s. That's still 10x or more better performance under stress, and 30s has been shown to be safe for everyday use.

I see that observation as an argument for 30s timeouts under stress :-)

(The idea is to help properly configured sites primarily. Sites that are permanently overloaded will already be missing some email, due to TCP-level or SMTP-level timeouts.)

>> smtpd_hard_error_limit = ${stress?2}${stress:20}
>
> Yes. Or stress?1. Whatever...

Yes, 1 will do. We're doing this only under conditions where some mail is already not getting through.

Wietse
Re: Enabling stress detection by default
Wietse Venema:
> smtpd_timeout = ${stress?10s}${stress:300s}
> smtpd_hard_error_limit = ${stress?2}${stress:20}

I thought this was going to be easy, but the built-in default values for these parameters are of type int, and do not accept conditional expressions.

This means either changing the way that built-in defaults are implemented, or adding the stress-style default settings at installation time.

Wietse
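The type problem can be illustrated by analogy in Python. This is not how Postfix stores its built-in defaults. It only shows why a slot that holds an integer cannot hold the unexpanded expression, and why the expression has to be kept as a string and expanded before it is converted. The helper name and variable names are hypothetical.

```python
import re


def expand_stress(expression, stress):
    """Resolve ${stress?x} and ${stress:y} for a given stress value (sketch)."""
    def pick(match):
        operator, value = match.groups()
        # "?" keeps the value when stress is set, ":" keeps it when unset.
        return value if (operator == "?") == bool(stress) else ""

    return re.sub(r"\$\{stress([?:])([^}]*)\}", pick, expression)


raw_default = "${stress?2}${stress:20}"

# An integer-typed default cannot represent the conditional expression itself.
try:
    int(raw_default)
    parsed_raw = True
except ValueError:
    parsed_raw = False

# Kept as a string and expanded first, it converts without trouble.
limit_under_stress = int(expand_stress(raw_default, "yes"))
limit_normal = int(expand_stress(raw_default, ""))

print(parsed_raw, limit_under_stress, limit_normal)  # prints False 2 20
```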
Re: Enabling stress detection by default
Wietse Venema wrote:
> Noel Jones:
>> Wietse Venema wrote:
>>> Something that will drastically cut the time per session:
>>>
>>> smtpd_timeout = ${stress?10s}${stress:300s}
>>
>> I would be concerned about sites that are chronically short of smtpd processes with an inexperienced or inattentive admin. Maybe 20s~30s rather than 10s. That's still 10x or more better performance under stress, and 30s has been shown to be safe for everyday use.
>
> I see that observation as an argument for 30s timeouts under stress :-)
>
> (The idea is to help properly configured sites primarily. Sites that are permanently overloaded will already be missing some email, due to TCP-level or SMTP-level timeouts.)

Yes, I see that too. I'm concerned about the possibility of stress=yes for an extended period of time with no one at the wheel. I agree this is an edge case and should only affect servers that are already somewhat broken. I'm just trying to think of robust defaults that won't give the appearance of making things worse.

--
Noel Jones
Re: Enabling stress detection by default
> Wietse Venema wrote:
>> Noel Jones:
>>> Wietse Venema wrote:
>>>> Something that will drastically cut the time per session:
>>>>
>>>> smtpd_timeout = ${stress?10s}${stress:300s}
>>>
>>> I would be concerned about sites that are chronically short of smtpd processes with an inexperienced or inattentive admin. Maybe 20s~30s rather than 10s. That's still 10x or more better performance under stress, and 30s has been shown to be safe for everyday use.
>>
>> I see that observation as an argument for 30s timeouts under stress :-)
>>
>> (The idea is to help properly configured sites primarily. Sites that are permanently overloaded will already be missing some email, due to TCP-level or SMTP-level timeouts.)
>
> Yes, I see that too. I'm concerned about the possibility of stress=yes for an extended period of time with no one at the wheel. I agree this is an edge case and should only affect servers that are already somewhat broken. I'm just trying to think of robust defaults that won't give the appearance of making things worse.
>
> --
> Noel Jones

Could a notification alert be sent via SMS or another means to the administrator of the post server in question, stating something is wrong with the server?