Re: Enabling stress detection by default

2009-02-07 Thread Wietse Venema
Daniel V. Reinhardt:
 ---Could a notification alert be sent via SMS or another
 means to the administrator of the mail server in question, stating
 that something is wrong with the server?

This could be implemented by configuring a logfile monitoring
program (swatch, logsurfer, etc.) to send notifications when it
sees logging with the following format:

smtpd[%d]: warning: service %s (%s) has reached its process limit %d: new clients may experience noticeable delays
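
A log watcher for this message could be sketched in Python as follows. This is an illustration only, not part of swatch or logsurfer: the function name and the regular expression are assumptions derived from the format string above, and the pattern tolerates optional quotes and an arbitrary program-name prefix because the exact rendering in a real log may differ.

```python
import re

# Assumed pattern, derived from the format string quoted in the message.
# Quotes around the service name and the limit are treated as optional.
WARNING_RE = re.compile(
    r'(?P<prog>\S+)\[(?P<pid>\d+)\]: warning: '
    r'service "?(?P<service>[^"\s]+)"? \((?P<address>[^)]*)\) '
    r'has reached its process limit "?(?P<limit>\d+)"?: '
    r'new clients may experience noticeable delays'
)


def find_process_limit_warnings(lines):
    """Return one dict per log line that carries the process-limit warning."""
    hits = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            hit = match.groupdict()
            hit["pid"] = int(hit["pid"])
            hit["limit"] = int(hit["limit"])
            hits.append(hit)
    return hits
```

Each returned dict could then be handed to whatever notification mechanism the site already uses (mail, SMS gateway, pager); that part is deliberately left out.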

Wietse


Re: Enabling stress detection by default

2009-02-07 Thread Noel Jones

Daniel V. Reinhardt wrote:


---Could a notification alert be sent via SMS or another means 
to the administrator of the mail server in question, stating that something 
is wrong with the server?




The discussion is about proposed default Postfix behavior, so 
no, out-of-band notifications cannot be a default.


The default behavior of Postfix is to put scary-sounding 
entries in the system log when something goes wrong.  It's up 
to the admin to review the logs and/or configure third-party 
system monitoring software to watch for interesting events.



--
Noel Jones


Enabling stress detection by default

2009-02-06 Thread Wietse Venema
With Postfix 2.5 I introduced stress-dependent behavior in the SMTP
server, but this was left turned off by default.

I'm thinking of turning on some stress-dependent behavior by default
in Postfix 2.6, to make Postfix look better in stupid benchmarks
(just like in_flow_delay and smtpd_client_connection_count_limit).

While I am reluctant to make the default SMTP server timeouts
drastically shorter, I am inclined to ship Postfix 2.6 with settings
that will be relatively safe under conditions of overload.

Something that will drastically cut the time per session:

smtpd_timeout = ${stress?10s}${stress:300s}
smtpd_hard_error_limit = ${stress?2}${stress:20}

This would affect clients that cannot send SMTP commands within
10s of each other (i.e. round-trip time 5s) or cannot send blocks
of message content within 10s of each other (i.e. an effective
bit rate below 1500 bits/second).
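
The bit-rate figure can be checked with a little arithmetic. The sketch below assumes that "effective bit rate" means bits delivered divided by the timeout; the message does not state the block size behind the estimate, so the helper names are illustrative and only show what a given rate implies.

```python
def max_bytes_within_timeout(bits_per_second, timeout_seconds):
    """Most data (in bytes) a client at this rate can deliver before the timeout."""
    return bits_per_second * timeout_seconds / 8


def min_bit_rate_for_block(block_bytes, timeout_seconds):
    """Slowest rate (bits/second) that still delivers one block within the timeout."""
    return block_bytes * 8 / timeout_seconds
```

At 1500 bits/second and a 10s timeout, `max_bytes_within_timeout(1500, 10)` gives 1875 bytes, so a client that cannot deliver roughly that much between reads would be cut off under stress.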

Keep in mind that we're doing this only when the server is saturated.
By not receiving very slow mail, we create more opportunities to
receive other mail that arrives in less time. This assumes that
the condition is temporary, and that some bit rates will stay above
1500 bps.
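
For readers unfamiliar with the notation in the settings above: in Postfix's documented parameter syntax, ${name?value} expands to value when $name is non-empty, and ${name:value} expands to value when $name is empty. The Python below is a minimal imitation of just those two forms, to show how one line yields two different timeouts. The function name is made up for illustration, and this is not how Postfix itself implements expansion.

```python
import re

# Matches ${name?value} and ${name:value}; nested expressions are not handled.
_CONDITIONAL = re.compile(r"\$\{(\w+)([?:])([^}]*)\}")


def expand_conditionals(text, params):
    """Expand ${name?value} / ${name:value} against a dict of parameter values."""
    def replace(match):
        name, op, value = match.groups()
        is_set = bool(params.get(name, ""))
        if op == "?":
            # ${name?value}: use value only when the parameter is non-empty.
            return value if is_set else ""
        # ${name:value}: use value only when the parameter is empty.
        return "" if is_set else value

    return _CONDITIONAL.sub(replace, text)
```

With `{"stress": "yes"}` the smtpd_timeout line above expands to `10s`; with `stress` empty or unset it expands to `300s`.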

Another issue is smtpd_timeout granularity. Currently it is the
same for all SMTP commands, but some suggested it makes sense to
distinguish between some of the SMTP stages.

Suppose we have two settings, one for the DATA stage where we have
valid recipients, and one for the non-DATA stages.

Then, the default settings would look like this:

smtpd_timeout = ${stress?10s}${stress:300s}
smtpd_data_timeout = $smtpd_timeout
smtpd_non_data_timeout = $smtpd_timeout
smtpd_hard_error_limit = ${stress?1}${stress:20}

I would not make different stress dependencies for _data_timeout
and _non_data_timeout, but people would be welcome to use the extra
rope and shoot themselves in the foot.

Wietse


Re: Enabling stress detection by default

2009-02-06 Thread Victor Duchovni
On Fri, Feb 06, 2009 at 01:37:41PM -0500, Wietse Venema wrote:

 smtpd_timeout = ${stress?10s}${stress:300s}
 smtpd_hard_error_limit = ${stress?2}${stress:20}

I guess disabling reverse DNS lookups under stress is too drastic. It
would certainly not help folks with reject_unknown_client, even if
implemented correctly as a transient (due to stress) lookup failure.

 Another issue is smtpd_timeout granularity. Currently it is the
 same for all SMTP commands, but some suggested it makes sense to
 distinguish between some of the SMTP stages.

I think I once suggested shorter timeouts outside the mail transaction
(before MAIL FROM or after "."). This would prevent abuse of the MTA
by software with poor connection caching strategies. If we limit it to
just after ".", the shorter timeout could be on by default, even with
no stress. I did not envision short timeouts between MAIL and DATA,
but that was long before -o stress.

-- 
Viktor.



Re: Enabling stress detection by default

2009-02-06 Thread Noel Jones

Wietse Venema wrote:

Something that will drastically cut the time per session:

smtpd_timeout = ${stress?10s}${stress:300s}


I would be concerned about sites that are chronically short of 
smtpd processes with an inexperienced or inattentive admin.
Maybe 20s~30s rather than 10s.  That's still 10x or more 
better performance under stress, and 30s has been shown to be 
safe for everyday use.
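
The "10x or more" estimate follows from the ratio of the two timeouts, under the simplifying assumption (made here for illustration) that a stalled client holds an smtpd process for the full timeout. A small sketch, with a hypothetical helper name:

```python
def slot_turnover_gain(normal_timeout_s, stress_timeout_s):
    """How many times faster a stalled process slot is freed under the shorter timeout."""
    return normal_timeout_s / stress_timeout_s


# Compare the candidate stress timeouts against the 300s non-stress value.
gains = {t: slot_turnover_gain(300, t) for t in (10, 20, 30)}
```

This gives a factor of 30 for 10s, 15 for 20s, and 10 for 30s.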



smtpd_hard_error_limit = ${stress?2}${stress:20}


Yes.  Or stress?1.  Whatever...


Suppose we have two settings, one for the DATA stage where we have
valid recipients, and one for the non-DATA stages.

Then, the default settings would look like this:

smtpd_timeout = ${stress?10s}${stress:300s}
smtpd_data_timeout = $smtpd_timeout
smtpd_non_data_timeout = $smtpd_timeout


This sounds appealing, but I don't have the information to 
know if different timeouts would make much real-world difference.


Victor Duchovni wrote:
 I guess disabling reverse DNS lookups under stress is too drastic. It
 would certainly not help folks with reject_unknown_client, even if
 implemented correctly as a transient (due to stress) lookup failure.


Too many people rely on name-based whitelists and blacklists.
Such behavior would be quite surprising for an out-of-the-box
install.  But I think you know that already.


--
Noel Jones


Re: Enabling stress detection by default

2009-02-06 Thread Wietse Venema
Noel Jones:
 Wietse Venema wrote:
  Something that will drastically cut the time per session:
  
  smtpd_timeout = ${stress?10s}${stress:300s}
 
 I would be concerned about sites that are chronically short of 
 smtpd processes with an inexperienced or inattentive admin.
 Maybe 20s~30s rather than 10s.  That's still 10x or more 
 better performance under stress, and 30s has been shown to be 
 safe for everyday use.

I see that observation as an argument for < 30s timeouts under
stress :-)

(The idea is to help properly configured sites primarily.  Sites
that are permanently overloaded will already be missing some email,
due to TCP-level or SMTP-level timeouts.)

  smtpd_hard_error_limit = ${stress?2}${stress:20}
 
 Yes.  Or stress?1.  Whatever...

Yes, 1 will do. We're doing this only under conditions where some
mail is already not getting through.


Wietse


Re: Enabling stress detection by default

2009-02-06 Thread Wietse Venema
Wietse Venema:
 smtpd_timeout = ${stress?10s}${stress:300s}
 smtpd_hard_error_limit = ${stress?2}${stress:20}

I thought this was going to be easy, but the built-in default
values for these parameters are type int, and do not accept
the conditional expressions.

This means either changing the way that built-in defaults are
implemented, or adding the stress-style default settings at
installation time.

Wietse


Re: Enabling stress detection by default

2009-02-06 Thread Noel Jones

Wietse Venema wrote:

Noel Jones:

Wietse Venema wrote:

Something that will drastically cut the time per session:

smtpd_timeout = ${stress?10s}${stress:300s}

I would be concerned about sites that are chronically short of 
smtpd processes with an inexperienced or inattentive admin.
Maybe 20s~30s rather than 10s.  That's still 10x or more 
better performance under stress, and 30s has been shown to be 
safe for everyday use.


I see that observation as an argument for  30s timeouts under
stress :-) 


(The idea is to help properly configured sites primarily.  Sites
that are permanently overloaded will already be missing some email,
due to TCP-level or SMTP-level timeouts.)


Yes, I see that too.  I'm concerned about the possibility of 
stress=yes for an extended period of time with no one at the 
wheel.


I agree this is an edge case and should only affect servers 
that are already somewhat broken.  I'm just trying to think of 
robust defaults that won't give the appearance of making 
things worse.


--
Noel Jones


Re: Enabling stress detection by default

2009-02-06 Thread Daniel V. Reinhardt


 


Wietse Venema wrote:
 Noel Jones:
 Wietse Venema wrote:
 Something that will drastically cut the time per session:
 
 smtpd_timeout = ${stress?10s}${stress:300s}
 
 I would be concerned about sites that are chronically short of smtpd 
 processes with an inexperienced or inattentive admin.
 Maybe 20s~30s rather than 10s.  That's still 10x or more better performance 
 under stress, and 30s has been shown to be safe for everyday use.
 
 I see that observation as an argument for  30s timeouts under
 stress :-) 
 (The idea is to help properly configured sites primarily.  Sites
 that are permanently overloaded will already be missing some email,
 due to TCP-level or SMTP-level timeouts.)

Yes, I see that too.  I'm concerned about the possibility of stress=yes for an 
extended period of time with no one at the wheel.

I agree this is an edge case and should only affect servers that are already 
somewhat broken.  I'm just trying to think of robust defaults that won't give 
the appearance of making things worse.

-- Noel Jones

---Could a notification alert be sent via SMS or another means to the 
administrator of the mail server in question, stating that something is wrong with 
the server?