Re: smtpd processes congregating at the pub
Noel Jones put forth on 1/29/2010 8:44 AM:
> On 1/29/2010 1:37 AM, Stan Hoeppner wrote:
>>> Local shows very speedy delivery.  Is this "long" smtpd process
>>> lifespan normal for 2.5.5 or did I do something screwy/wrong in my
>>> config?
>>>
>>> relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
>>> relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
>>> relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
>>> relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
>>> relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
>>> relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent
>
> Nitpick: you talk about smtpd, then show log snips from smtp.  But no
> matter, they both honor max_idle and will behave in a similar manner.

Maybe I could have worded that more clearly, Noel.  Those snippets are
from postfix/local, not smtp; smtp doesn't normally relay to local,
afaik. ;)

I included the snippets to show that inbound delivery is very fast.  Not
yet understanding smtpd's behavior wrt max_idle, I assumed that fast
delivery would mean smtpd exiting quickly.  smtpd doesn't log delays
afaict, so I included the local information instead.  My apologies for
the confusion.

--
Stan
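[Editorial aside: for readers unfamiliar with the log fields quoted above, since Postfix 2.3 the delays=a/b/c/d value splits the total delay into time before active-queue entry, time in the queue manager, connection setup time, and message transmission time.  A minimal, illustrative parser (the sample line is taken from the snippets above; the function name is our own):]

```python
import re

def parse_delays(logline):
    """Extract total delay and the four delay stages from a Postfix log entry.

    Stages (Postfix 2.3+): a = before active-queue entry, b = in the
    queue manager, c = connection setup, d = message transmission.
    """
    m = re.search(
        r"delay=([\d.]+), delays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)",
        logline)
    if not m:
        return None
    total, a, b, c, d = (float(x) for x in m.groups())
    return {"total": total, "before_qmgr": a, "in_qmgr": b,
            "conn_setup": c, "transmission": d}

line = "relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent"
print(parse_delays(line))
```

In the quoted snippets nearly all of the delay sits in the first stage (time before the message reached the active queue), which is consistent with very fast local delivery.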
Re: smtpd processes congregating at the pub
Wietse Venema put forth on 1/31/2010 7:34 PM:
> Stan Hoeppner:
>>> Better: apply the long-term solution, in the form of the patch below.
>>> This undoes the max_idle override (a workaround that I introduced
>>> with Postfix 2.3).  I already introduced the better solution with
>>> Postfix 2.4 while solving a different problem.
>>
>> I'm not sure if I fully understand this.  I'm using 2.5.5, so shouldn't
>> I already have the 2.4 solution mentioned above?  I must not be reading
>> this correctly.
>
> The patch undoes the Postfix 2.3 change that is responsible for
> the shorter-than-expected proxymap lifetimes that you observed
> on low-traffic systems.
>
> With that change backed out, the reduced ipc_idle change from
> Postfix 2.4 will finally get a chance to fix the excessive lifetime
> of proxymap and trivial-rewrite processes on high-traffic systems.

So, if I understand correctly: the changes made in 2.3 and 2.4 were meant
to get more desirable behavior from proxymap and trivial-rewrite on
high-traffic systems, and this caused this (very minor) problem on
low-traffic systems?  The patch resolves the low-traffic issue,
essentially reverting to the code used before the 2.3 changes?  And have
these changes, through 2.7, given the desired behavior on high-traffic
systems, or not?

Your statement "will finally get a chance to..." is future tense.  Does
this mean the desired behavior for high-traffic systems has not been seen
to date?  I apologize if this seems a stupid question; the future tense in
your statement confuses me.  If that _is_ what you mean, does this mean I
have inadvertently played a tiny role in helping you identify a
long-standing issue? ;)

--
Stan
Re: smtpd processes congregating at the pub
Stan Hoeppner:
> > Better: apply the long-term solution, in the form of the patch below.
> > This undoes the max_idle override (a workaround that I introduced
> > with Postfix 2.3).  I already introduced the better solution with
> > Postfix 2.4 while solving a different problem.
>
> I'm not sure if I fully understand this.  I'm using 2.5.5, so shouldn't I
> already have the 2.4 solution mentioned above?  I must not be reading
> this correctly.

The patch undoes the Postfix 2.3 change that is responsible for
the shorter-than-expected proxymap lifetimes that you observed
on low-traffic systems.

With that change backed out, the reduced ipc_idle change from
Postfix 2.4 will finally get a chance to fix the excessive lifetime
of proxymap and trivial-rewrite processes on high-traffic systems.

	Wietse
Re: smtpd processes congregating at the pub
Wietse Venema put forth on 1/31/2010 10:38 AM:
> Stan Hoeppner:
>> This is making good progress.  Seeing the smtpd's memory footprint
>> drop so dramatically is fantastic.  However, I'm still curious as
>> to why proxymap doesn't appear to be honoring $max_idle or $max_use.
>> Maybe my understanding of $max_use is not correct?  It's currently
>> set to 100, the default.  Watching top while sending a test message
>> through, I see proxymap launch but then exit within 5 seconds,
>> while smtpd honors max_idle.  Is there some other setting I need
>> to change to keep proxymap around longer?
>
> Short answer (workaround for low-traffic sites): set ipc_idle=$max_idle
> to approximate the expected behavior.  This keeps the smtpd-to-proxymap
> connection open for as long as smtpd runs.  Then, proxymap won't
> terminate before its clients terminate.

Wietse, thank you for the very thorough and thoughtful response.  For a
few reasons, including the fact that I don't trust myself working with
source in this case, and that I'd rather not throw monkey wrenches into
my distro's package management, I'm going to go with the short-answer
workaround above.  All factors taken into account, I think it best fits
my needs, skills, and usage profile.

> Better: apply the long-term solution, in the form of the patch below.
> This undoes the max_idle override (a workaround that I introduced
> with Postfix 2.3).  I already introduced the better solution with
> Postfix 2.4 while solving a different problem.

I'm not sure if I fully understand this.  I'm using 2.5.5, so shouldn't I
already have the 2.4 solution mentioned above?  I must not be reading
this correctly.

> Long answer: in ancient times, all Postfix daemons except qmgr
> implemented the well-known max_idle=100s and max_use=100, as well
> as the lesser-known ipc_idle=100s (see "short answer" for the effect
> of that parameter).
>
> While this worked fine for single-client servers such as smtpd, it
> was not so great for multi-client servers such as proxymap or
> trivial-rewrite.  This problem was known, and the idea was that it
> would be solved over time.
>
> Theoretically, smtpd could run for up to $max_idle * $max_use = 3
> hours, while proxymap and trivial-rewrite could run for up to
> $max_idle * $max_use * $max_use = 12 days on low-traffic systems
> (one SMTP client every 100s, or a little under 900 SMTP clients a
> day), and it would run forever on systems with a steady mail flow.
>
> This was a problem.  The point of max_use is to limit the impact of
> bugs such as memory or file handle leaks, by retiring a process
> after doing a limited amount of work.  I can test Postfix itself
> with tools such as Purify and Valgrind, but I can't do those tests
> with every version of everyone's system libraries.

This is a very smart design philosophy.  Just one more reason I feel
privileged to use Postfix.

> If a proxymap or trivial-rewrite server can run for 11 days even
> on systems with a minuscule load, then max_use isn't working as
> intended.
>
> The main cause is that the proxymap etc. clients reuse a connection
> to improve efficiency.  Therefore, the proxymap etc. server politely
> waits until all its clients have disconnected before checking the
> max_use counter.  While this politeness thing can't be changed
> easily, it is relatively easy to play with the proxymap etc. server's
> max_idle value, and with the smtpd etc. ipc_ttl value.
>
> Postfix 2.3 reduced the proxymap etc. max_idle to a fixed 1s value
> to make those processes go away sooner when idle.  I think that
> this was a mistake, because it makes processes terminate too soon,
> and thereby worsens the low-traffic behavior.  Instead, we should
> speed up the proxymap etc. server's max_use counter.
>
> Postfix 2.4 reduced ipc_ttl to 5s.  This was done for a different
> purpose: to allow proxymap etc. clients to switch to the least-loaded
> proxymap etc. server.  But, I think that this was also the right way
> to deal with long-lived proxymap etc. processes, because it speeds
> up the proxymap etc. max_use counter.

Absolutely fascinating background information, Wietse.  Thank you for
sharing this.  It's always nice to learn how and why things work "under
the hood"; things that often can't easily be found in any official
documentation.

--
Stan
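[Editorial aside: the short-answer workaround quoted above is a one-line main.cf change, no patching required.  A sketch, assuming the stock 100s default for max_idle:]

```
# main.cf -- low-traffic workaround quoted above: keep the
# smtpd-to-proxymap connection open for as long as smtpd itself runs,
# so proxymap does not exit before its clients do.
ipc_idle = $max_idle
```

After the change, run "postfix reload" so running daemons pick up the new value.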
Re: smtpd processes congregating at the pub
Stan Hoeppner:
> This is making good progress.  Seeing the smtpd's memory footprint
> drop so dramatically is fantastic.  However, I'm still curious as
> to why proxymap doesn't appear to be honoring $max_idle or $max_use.
> Maybe my understanding of $max_use is not correct?  It's currently
> set to 100, the default.  Watching top while sending a test message
> through, I see proxymap launch but then exit within 5 seconds,
> while smtpd honors max_idle.  Is there some other setting I need
> to change to keep proxymap around longer?

Short answer (workaround for low-traffic sites): set ipc_idle=$max_idle
to approximate the expected behavior.  This keeps the smtpd-to-proxymap
connection open for as long as smtpd runs.  Then, proxymap won't
terminate before its clients terminate.

Better: apply the long-term solution, in the form of the patch below.
This undoes the max_idle override (a workaround that I introduced
with Postfix 2.3).  I already introduced the better solution with
Postfix 2.4 while solving a different problem.

Long answer: in ancient times, all Postfix daemons except qmgr
implemented the well-known max_idle=100s and max_use=100, as well
as the lesser-known ipc_idle=100s (see "short answer" for the effect
of that parameter).

While this worked fine for single-client servers such as smtpd, it
was not so great for multi-client servers such as proxymap or
trivial-rewrite.  This problem was known, and the idea was that it
would be solved over time.

Theoretically, smtpd could run for up to $max_idle * $max_use = 3
hours, while proxymap and trivial-rewrite could run for up to
$max_idle * $max_use * $max_use = 12 days on low-traffic systems
(one SMTP client every 100s, or a little under 900 SMTP clients a
day), and it would run forever on systems with a steady mail flow.

This was a problem.  The point of max_use is to limit the impact of
bugs such as memory or file handle leaks, by retiring a process
after doing a limited amount of work.  I can test Postfix itself
with tools such as Purify and Valgrind, but I can't do those tests
with every version of everyone's system libraries.

If a proxymap or trivial-rewrite server can run for 11 days even
on systems with a minuscule load, then max_use isn't working as
intended.

The main cause is that the proxymap etc. clients reuse a connection
to improve efficiency.  Therefore, the proxymap etc. server politely
waits until all its clients have disconnected before checking the
max_use counter.  While this politeness thing can't be changed
easily, it is relatively easy to play with the proxymap etc. server's
max_idle value, and with the smtpd etc. ipc_ttl value.

Postfix 2.3 reduced the proxymap etc. max_idle to a fixed 1s value
to make those processes go away sooner when idle.  I think that
this was a mistake, because it makes processes terminate too soon,
and thereby worsens the low-traffic behavior.  Instead, we should
speed up the proxymap etc. server's max_use counter.

Postfix 2.4 reduced ipc_ttl to 5s.  This was done for a different
purpose: to allow proxymap etc. clients to switch to the least-loaded
proxymap etc. server.  But, I think that this was also the right way
to deal with long-lived proxymap etc. processes, because it speeds
up the proxymap etc. max_use counter.

The patch below keeps the reduced ipc_ttl from Postfix 2.4, and
removes the max_idle overrides from Postfix 2.3.

	Wietse

*** ./src/proxymap/proxymap.c-	Thu Jan 10 09:03:55 2008
--- ./src/proxymap/proxymap.c	Sun Jan 31 10:52:50 2010
***************
*** 594,605 ****
      myfree(saved_filter);

      /*
-      * This process is called by clients that already enforce the max_idle
-      * time, so we don't have to do it another time.
-      */
-     var_idle_limit = 1;
-
-     /*
       * Never, ever, get killed by a master signal, as that could corrupt a
       * persistent database when we're in the middle of an update.
       */
--- 594,599 ----

*** ./src/trivial-rewrite/trivial-rewrite.c-	Wed Dec  9 18:39:51 2009
--- ./src/trivial-rewrite/trivial-rewrite.c	Sun Jan 31 10:53:01 2010
***************
*** 565,576 ****
      if (resolve_verify.transport_info)
  	transport_post_init(resolve_verify.transport_info);

      check_table_stats(0, (char *) 0);
-
-     /*
-      * This process is called by clients that already enforce the max_idle
-      * time, so we don't have to do it another time.
-      */
-     var_idle_limit = 1;
  }

      MAIL_VERSION_STAMP_DECLARE;
--- 565,570 ----
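[Editorial aside: the lifetime arithmetic in the long answer above is easy to verify.  A quick back-of-the-envelope check in plain Python, using the stock max_idle=100s and max_use=100 defaults:]

```python
max_idle = 100  # seconds, stock Postfix default
max_use = 100   # stock Postfix default

# Single-client server (smtpd): worst case, one full idle timeout
# elapses before each of its max_use uses.
smtpd_max = max_idle * max_use
print(f"smtpd: up to {smtpd_max} s = {smtpd_max / 3600:.1f} hours")

# Multi-client server (proxymap, trivial-rewrite) on a low-traffic
# system: it politely waits for its clients, and each of its max_use
# clients may itself live up to max_idle * max_use seconds.
proxymap_max = max_idle * max_use * max_use
print(f"proxymap: up to {proxymap_max} s = {proxymax := proxymap_max / 86400:.1f} days"
      if False else
      f"proxymap: up to {proxymap_max} s = {proxymap_max / 86400:.1f} days")
```

This reproduces the "about 3 hours" (10,000 s = 2.8 h) and "about 12 days" (1,000,000 s = 11.6 days) figures in the message above.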
Re: smtpd processes congregating at the pub
Stan Hoeppner put forth on 1/31/2010 12:04 AM:
> Sorry for top posting.  Forgot to add something earlier: Proxymap seems
> to be exiting on my system immediately after servicing requests.  It
> does not seem to be obeying $max_use or $max_idle, which are both set
> to 100.  It did this even before I added cidr lists to proxymap a few
> hours ago.  Before that, afaik, it was only being called for local
> alias verification, and it exited immediately in that case as well.

Making a little more progress on this, slowly.  I'd forgotten that I have
a rather large regexp table containing 1626 expressions.  I added it to
proxymap, and this dropped the size of my smtpd processes dramatically,
by about a factor of 5.  Apparently, even though this regexp table has
only 1626 lines, it requires far more memory than my big 'countries' cidr
table, which has 11148 lines.

  PID   USER     PR  NI  VIRT   RES   SHR   S  %CPU  %MEM  TIME+    COMMAND
  14411 postfix  20   0  20276  16m   1480  S     8   4.3  0:00.51  proxymap
  14410 postfix  20   0  6704   3368  2208  S     0   0.9  0:00.04  smtpd

This is making good progress.  Seeing the smtpd's memory footprint drop
so dramatically is fantastic.  However, I'm still curious as to why
proxymap doesn't appear to be honoring $max_idle or $max_use.  Maybe my
understanding of $max_use is not correct?  It's currently set to 100, the
default.  Watching top while sending a test message through, I see
proxymap launch but then exit within 5 seconds, while smtpd honors
max_idle.  Is there some other setting I need to change to keep proxymap
around longer?

--
Stan
Re: smtpd processes congregating at the pub
Sorry for top posting.  Forgot to add something earlier: Proxymap seems to
be exiting on my system immediately after servicing requests.  It does not
seem to be obeying $max_use or $max_idle, which are both set to 100.  It
did this even before I added cidr lists to proxymap a few hours ago.
Before that, afaik, it was only being called for local alias verification,
and it exited immediately in that case as well.

--
Stan

Stan Hoeppner put forth on 1/30/2010 11:13 PM:
> Wietse Venema put forth on 1/30/2010 7:14 PM:
>> Stan Hoeppner:
>>> AFAIK I don't use Berkeley DB tables, only hash (small, few) and cidr
>>> (very large, a handful).
>>
>> hash (and btree) == Berkeley DB.
>
> Ahh, good to know.  I'd thought only btree used Berkeley DB and that
> hash tables used something else.
>
>> If you have big CIDR tables, you can save lots of memory by using
>> proxy:cidr: instead of cidr: (and running "postfix reload").
>> Effectively, this turns all that private memory into something that
>> can be shared via the proxy: protocol.
>
> I implemented proxymap but it doesn't appear to have changed the memory
> footprint of smtpd much at all, if any.  I reloaded once, and restarted
> once just in case.
>
>   PID   USER     PR  NI  VIRT   RES   SHR   S  %CPU  %MEM  TIME+    COMMAND
>   4554  postfix  20   0  20828  17m   2268  S     0   4.5  0:00.46  smtpd
>   4560  postfix  20   0  20036  16m   2268  S     0   4.3  0:00.47  smtpd
>   4555  postfix  20   0  6812   3056  1416  S     0   0.8  0:00.10  proxymap
>
>> The current CIDR implementation is optimized to make it easy to
>> verify for correctness, and is optimized for speed when used with
>> limited lists of netblocks (mynetworks, unassigned address blocks,
>> reserved address blocks, etc.).
>
> Understood.
>
>> If you want to list large portions of Internet address space such
>> as entire countries the current implementation starts burning CPU
>> time (it examines all CIDR patterns in order; with a bit of extra
>> up-front work during initialization, address lookups could skip
>> over a lot of patterns, but the implementation would of course be
>> harder to verify for correctness), and it wastes 24 bytes per CIDR
>> rule when Postfix is compiled with IPv6 support (this roughly
>> doubles the amount of memory that is used by CIDR tables).
>
> I don't really notice much CPU burn on any postfix processes with these
> largish CIDRs, never have.  I've got 12,212 CIDRs in 3 files, 11,148 of
> them in just the "countries" file alone.  After implementing proxymap,
> I'm not seeing much reduction in smtpd RES size, maybe 1MB if that.  SHR
> is almost identical to before.  If it's not the big tables bloating
> smtpd, I wonder what is?  Or have I not implemented proxymap correctly?
> Following are the relevant parts of my postconf -n and main.cf.
>
> alias_maps = hash:/etc/aliases
> append_dot_mydomain = no
> biff = no
> config_directory = /etc/postfix
> disable_vrfy_command = yes
> header_checks = pcre:/etc/postfix/header_checks
> inet_interfaces = all
> message_size_limit = 1024
> mime_header_checks = pcre:/etc/postfix/mime_header_checks
> mydestination = hardwarefreak.com
> myhostname = greer.hardwarefreak.com
> mynetworks = 192.168.100.0/24
> myorigin = hardwarefreak.com
> parent_domain_matches_subdomains = debug_peer_list smtpd_access_maps
> proxy_interfaces = 65.41.216.221
> proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
>     $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
>     $relay_recipient_maps $relay_domains $canonical_maps
>     $sender_canonical_maps $recipient_canonical_maps $relocated_maps
>     $transport_maps $mynetworks $sender_bcc_maps $recipient_bcc_maps
>     $smtp_generic_maps $lmtp_generic_maps proxy:${cidr}/countries
>     proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs
> readme_directory = /usr/share/doc/postfix
> recipient_bcc_maps = hash:/etc/postfix/recipient_bcc
> relay_domains =
> smtpd_banner = $myhostname ESMTP Postfix
> smtpd_helo_required = yes
> smtpd_recipient_restrictions = permit_mynetworks
>     reject_unauth_destination
>     check_recipient_access hash:/etc/postfix/whitelist
>     check_sender_access hash:/etc/postfix/whitelist
>     check_client_access hash:/etc/postfix/whitelist
>     check_client_access hash:/etc/postfix/blacklist
>     check_client_access regexp:/etc/postfix/fqrdns.regexp
>     check_client_access pcre:/etc/postfix/ptr-tld.pcre
>     check_client_access proxy:${cidr}/countries
>     check_client_access proxy:${cidr}/spammer
>     check_client_access proxy:${cidr}/misc-spam-srcs
>     reject_unknown_reverse_client_hostname
>     reject_non_fqdn_sender
>     reject_non_fqdn_helo_hostname
>     reject_invalid_helo_hostname
>     reject_unknown_helo_hostname
>     reject_unlisted_recipient
>     reject_rbl_client zen.spamhaus.org
>     check_policy_service inet:127.0.0.1:6
> strict_rfc821_envelopes = yes
> virtual_alias_maps = hash:/etc/postfix/virtual
>
> /etc/postfix/main.cf snippet
>
> cidr=cidr:/etc/postfix/cidr_files
>
> proxy_read
Re: smtpd processes congregating at the pub
Wietse Venema put forth on 1/30/2010 7:14 PM:
> Stan Hoeppner:
>> AFAIK I don't use Berkeley DB tables, only hash (small, few) and cidr
>> (very large, a handful).
>
> hash (and btree) == Berkeley DB.

Ahh, good to know.  I'd thought only btree used Berkeley DB and that hash
tables used something else.

> If you have big CIDR tables, you can save lots of memory by using
> proxy:cidr: instead of cidr: (and running "postfix reload").
> Effectively, this turns all that private memory into something that
> can be shared via the proxy: protocol.

I implemented proxymap but it doesn't appear to have changed the memory
footprint of smtpd much at all, if any.  I reloaded once, and restarted
once just in case.

  PID   USER     PR  NI  VIRT   RES   SHR   S  %CPU  %MEM  TIME+    COMMAND
  4554  postfix  20   0  20828  17m   2268  S     0   4.5  0:00.46  smtpd
  4560  postfix  20   0  20036  16m   2268  S     0   4.3  0:00.47  smtpd
  4555  postfix  20   0  6812   3056  1416  S     0   0.8  0:00.10  proxymap

> The current CIDR implementation is optimized to make it easy to
> verify for correctness, and is optimized for speed when used with
> limited lists of netblocks (mynetworks, unassigned address blocks,
> reserved address blocks, etc.).

Understood.

> If you want to list large portions of Internet address space such
> as entire countries the current implementation starts burning CPU
> time (it examines all CIDR patterns in order; with a bit of extra
> up-front work during initialization, address lookups could skip
> over a lot of patterns, but the implementation would of course be
> harder to verify for correctness), and it wastes 24 bytes per CIDR
> rule when Postfix is compiled with IPv6 support (this roughly
> doubles the amount of memory that is used by CIDR tables).

I don't really notice much CPU burn on any postfix processes with these
largish CIDRs, never have.  I've got 12,212 CIDRs in 3 files, 11,148 of
them in just the "countries" file alone.  After implementing proxymap,
I'm not seeing much reduction in smtpd RES size, maybe 1MB if that.  SHR
is almost identical to before.  If it's not the big tables bloating
smtpd, I wonder what is?  Or have I not implemented proxymap correctly?
Following are the relevant parts of my postconf -n and main.cf.

alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
config_directory = /etc/postfix
disable_vrfy_command = yes
header_checks = pcre:/etc/postfix/header_checks
inet_interfaces = all
message_size_limit = 1024
mime_header_checks = pcre:/etc/postfix/mime_header_checks
mydestination = hardwarefreak.com
myhostname = greer.hardwarefreak.com
mynetworks = 192.168.100.0/24
myorigin = hardwarefreak.com
parent_domain_matches_subdomains = debug_peer_list smtpd_access_maps
proxy_interfaces = 65.41.216.221
proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
    $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
    $relay_recipient_maps $relay_domains $canonical_maps
    $sender_canonical_maps $recipient_canonical_maps $relocated_maps
    $transport_maps $mynetworks $sender_bcc_maps $recipient_bcc_maps
    $smtp_generic_maps $lmtp_generic_maps proxy:${cidr}/countries
    proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs
readme_directory = /usr/share/doc/postfix
recipient_bcc_maps = hash:/etc/postfix/recipient_bcc
relay_domains =
smtpd_banner = $myhostname ESMTP Postfix
smtpd_helo_required = yes
smtpd_recipient_restrictions = permit_mynetworks
    reject_unauth_destination
    check_recipient_access hash:/etc/postfix/whitelist
    check_sender_access hash:/etc/postfix/whitelist
    check_client_access hash:/etc/postfix/whitelist
    check_client_access hash:/etc/postfix/blacklist
    check_client_access regexp:/etc/postfix/fqrdns.regexp
    check_client_access pcre:/etc/postfix/ptr-tld.pcre
    check_client_access proxy:${cidr}/countries
    check_client_access proxy:${cidr}/spammer
    check_client_access proxy:${cidr}/misc-spam-srcs
    reject_unknown_reverse_client_hostname
    reject_non_fqdn_sender
    reject_non_fqdn_helo_hostname
    reject_invalid_helo_hostname
    reject_unknown_helo_hostname
    reject_unlisted_recipient
    reject_rbl_client zen.spamhaus.org
    check_policy_service inet:127.0.0.1:6
strict_rfc821_envelopes = yes
virtual_alias_maps = hash:/etc/postfix/virtual

/etc/postfix/main.cf snippet

cidr=cidr:/etc/postfix/cidr_files

proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
    $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
    $relay_recipient_maps $relay_domains $canonical_maps
    $sender_canonical_maps $recipient_canonical_maps $relocated_maps
    $transport_maps $mynetworks $sender_bcc_maps $recipient_bcc_maps
    $smtp_generic_maps $lmtp_generic_maps proxy:${cidr}/countries
    proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs

check_client_access proxy:${cidr}/countries
check_client_access proxy:${cidr}/spammer
check_client_access proxy:${cidr}/misc-spam-srcs

--
Stan
Re: smtpd processes congregating at the pub
Stan Hoeppner:
> AFAIK I don't use Berkeley DB tables, only hash (small, few) and cidr
> (very large, a handful).

hash (and btree) == Berkeley DB.

If you have big CIDR tables, you can save lots of memory by using
proxy:cidr: instead of cidr: (and running "postfix reload").
Effectively, this turns all that private memory into something that
can be shared via the proxy: protocol.

The current CIDR implementation is optimized to make it easy to
verify for correctness, and is optimized for speed when used with
limited lists of netblocks (mynetworks, unassigned address blocks,
reserved address blocks, etc.).

If you want to list large portions of Internet address space such
as entire countries, the current implementation starts burning CPU
time (it examines all CIDR patterns in order; with a bit of extra
up-front work during initialization, address lookups could skip
over a lot of patterns, but the implementation would of course be
harder to verify for correctness), and it wastes 24 bytes per CIDR
rule when Postfix is compiled with IPv6 support (this roughly
doubles the amount of memory that is used by CIDR tables).

	Wietse
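[Editorial aside: to illustrate why very large access lists are expensive under the first-match scheme described above, here is a sketch of a linear CIDR scan in Python.  This mimics the behavior in spirit only; it is not Postfix's actual code, and the rules below are made up:]

```python
import ipaddress

# Hypothetical rules in cidr: table style: (pattern, action), in order.
rules = [
    (ipaddress.ip_network("192.168.100.0/24"), "OK"),
    (ipaddress.ip_network("10.0.0.0/8"), "REJECT"),
    (ipaddress.ip_network("203.0.113.0/24"), "REJECT"),
]

def cidr_lookup(addr, rules):
    """First-match linear scan, as a cidr: table does conceptually.

    With ~11,000 rules, every client address that matches nothing (or
    matches only a late rule) walks the entire list -- hence the CPU
    cost described for country-sized tables.
    """
    ip = ipaddress.ip_address(addr)
    for net, action in rules:
        if ip in net:
            return action
    return None  # no match: all rules were examined

print(cidr_lookup("192.168.100.7", rules))  # OK (first rule)
print(cidr_lookup("198.51.100.1", rules))   # None (scanned all rules)
```

The up-front indexing Wietse mentions (e.g. grouping rules by leading octets) would let a lookup skip most patterns, at the cost of a harder-to-verify implementation.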
Re: smtpd processes congregating at the pub
Wietse Venema put forth on 1/30/2010 9:03 AM:
> Allow me to present a tutorial on Postfix and operating system basics.

Thank you Wietse.  I'm always eager to learn. :)

> Postfix reuses processes for the same reasons that Apache does;
> however, Apache always runs a fixed minimum amount of daemons,
> whereas Postfix will dynamically shrink to zero smtpd processes
> over time.

Possibly not the best reference example, as I switched to Lighty mainly
due to the Apache behavior you describe, but also due to Apache resource
hogging in general.  But I understand your point: it's better to keep one
or two processes resident to service the next inbound requests than to
constantly tear down and rebuild processes, which causes significant
overhead and performance issues on busy systems.

> Therefore, people who believe that Postfix processes should not be
> running in the absence of client requests, should also terminate
> their Apache processes until a connection arrives.  No-one does that.

Wouldn't that really depend on the purpose of the server?  How about a
web admin daemon running on a small network device?  I almost do this
with Lighty currently: I have a single daemon instance that handles all
requests, max processes=1.  It's a very lightly loaded server, and a
single instance is more than enough.  In fact, given the load, I might
look into running Lighty from inetd, if possible, as I do Samba.

> If people believe that each smtpd process uses 15MB of RAM, and
> that two smtpd processes use 30MB of RAM, then that would have been
> correct had Postfix been running on MS-DOS.
>
> First, the physical memory footprint of a process (called resident
> memory size) is smaller than the virtual memory footprint (which
> comprises all addressable memory including the executable, libraries,
> data, heap and stack).  With FreeBSD 8.0 I see an smtpd VSZ/RSS of
> 6.9MB/4.8MB; with Fedora Core 11, 4.2MB/1.8MB; and with FreeBSD
> 4.1 it's 1.8MB/1.4MB.  Ten years of system library bloat.

Debian 5.0.3, kernel 2.6.31:

  PID   USER     PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+    COMMAND
  29242 postfix  20   0  22408  18m  2268  S     0   4.9  0:00.58  smtpd
  29251 postfix  20   0  17264  13m  2208  S     0   3.6  0:00.48  smtpd

> Second, when multiple processes execute the same executable file
> and libraries, those processes will share a single memory copy of
> the code and constants of that executable file and libraries.
> Therefore, a large portion of their resident memory sizes will
> actually map onto the same physical memory pages.  15+15 != 30.

I was of the understanding that top's SHR column described memory
shareable with other processes.  In the real example above from earlier
today, it would seem that my two smtpd processes can only share ~2.2MB
of code, data structures, etc.

man top:

   SHR -- Shared Mem size (kb)
       The amount of shared memory used by a task.  It simply reflects
       memory that could be potentially shared with other processes.

Am I missing something, or reading my top output incorrectly?

> Third, some code uses mmap() to allocate memory that is mapped from
> a file.  This adds to the virtual memory footprint of each process,
> but of course only the pages that are actually accessed will add
> to the resident memory size.  In the case of Postfix, this mechanism
> is used by Berkeley DB to allocate a 16MB shared-memory read buffer.

Is this 16MB buffer also used for hash and/or cidr tables, and is it
shareable?  AFAIK I don't use Berkeley DB tables, only hash (small, few)
and cidr (very large, a handful).

> There are some other tricks that allow for further savings (such
> as copy-on-write, which allows sharing of a memory page until a
> process attempts to write to it) but in the case of Postfix, those
> savings will be modest.

I must be screwing something up somewhere then.  According to my top
output, I'm only sharing ~2.2MB between smtpd processes, yet I've seen
them occupy anywhere from 11-18MB RES.  If the top output is correct,
there is a huge amount of additional sharing that "should" be occurring,
no?

Debian runs Postfix in a chroot by default, and I know very little about
chroot environments.  Could this have something to do with the tiny
amount of shared memory between the smtpds?

Thanks for taking interest in this, Wietse.  I'm sure I've probably done
something screwy that is easily fixable, and will get that shared memory
count up where it should be.

--
Stan
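[Editorial aside: on Linux, top's SHR number can be cross-checked against /proc/PID/smaps, which breaks the resident set into shared and private pages per mapping.  A sketch that totals those fields from smaps-format text; the sample data below is made up for illustration, not taken from the systems in this thread:]

```python
def smaps_totals(smaps_text):
    """Sum the Shared_* and Private_* kB fields from /proc/<pid>/smaps text."""
    totals = {"shared": 0, "private": 0}
    for line in smaps_text.splitlines():
        field = line.split(":")[0]
        if field in ("Shared_Clean", "Shared_Dirty"):
            totals["shared"] += int(line.split()[1])
        elif field in ("Private_Clean", "Private_Dirty"):
            totals["private"] += int(line.split()[1])
    return totals

# Made-up sample in smaps format; on a real system you would read
# open(f"/proc/{pid}/smaps") for an smtpd PID instead.
sample = """\
Shared_Clean:       2100 kB
Shared_Dirty:        108 kB
Private_Clean:        64 kB
Private_Dirty:     14200 kB
"""
print(smaps_totals(sample))  # {'shared': 2208, 'private': 14264}
```

A breakdown like this shows whether a large RES is genuinely private (e.g. per-process table data read into the heap) rather than shareable library pages.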
Re: smtpd processes congregating at the pub
Stan Hoeppner:
> Wietse Venema put forth on 1/29/2010 6:15 AM:
> > Stan Hoeppner:
> >> Based on purely visual non-scientific observation (top), it seems my
> >> smtpd processes on my MX hang around much longer in (Debian) 2.5.5
> >> than they did in (Debian) 2.3.8.  In 2.3.8 Master seemed to build
> >> them and tear them down very
> >
> > Perhaps Debian changed this:
> > http://www.postfix.org/postconf.5.html#max_idle
> >
> > The Postfix default is 100s.
>
> Yes, I confirmed this on my system.
>
> > I don't really see why anyone would shorten this - that's a waste
> > of CPU cycles.  In particular, stopping Postfix daemons after 10s

Allow me to present a tutorial on Postfix and operating system basics.

Postfix reuses processes for the same reasons that Apache does;
however, Apache always runs a fixed minimum amount of daemons,
whereas Postfix will dynamically shrink to zero smtpd processes
over time.

Therefore, people who believe that Postfix processes should not be
running in the absence of client requests, should also terminate
their Apache processes until a connection arrives.  No-one does that.

If people believe that each smtpd process uses 15MB of RAM, and
that two smtpd processes use 30MB of RAM, then that would have been
correct had Postfix been running on MS-DOS.

First, the physical memory footprint of a process (called resident
memory size) is smaller than the virtual memory footprint (which
comprises all addressable memory including the executable, libraries,
data, heap and stack).  With FreeBSD 8.0 I see an smtpd VSZ/RSS of
6.9MB/4.8MB; with Fedora Core 11, 4.2MB/1.8MB; and with FreeBSD
4.1 it's 1.8MB/1.4MB.  Ten years of system library bloat.

Second, when multiple processes execute the same executable file
and libraries, those processes will share a single memory copy of
the code and constants of that executable file and libraries.
Therefore, a large portion of their resident memory sizes will
actually map onto the same physical memory pages.  15+15 != 30.

Third, some code uses mmap() to allocate memory that is mapped from
a file.  This adds to the virtual memory footprint of each process,
but of course only the pages that are actually accessed will add
to the resident memory size.  In the case of Postfix, this mechanism
is used by Berkeley DB to allocate a 16MB shared-memory read buffer.

There are some other tricks that allow for further savings (such
as copy-on-write, which allows sharing of a memory page until a
process attempts to write to it) but in the case of Postfix, those
savings will be modest.

	Wietse
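[Editorial aside: the mmap() mechanism described above is easy to see in miniature.  The toy below maps a small file read-only, the same kernel facility Berkeley DB uses (at much larger scale) for its shared read buffer; it is a demonstration of the system call, not Postfix or Berkeley DB code:]

```python
import mmap
import os
import tempfile

# Create a small file to stand in for a database's read buffer backing file.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 4096)
os.close(fd)

# Map it read-only.  The mapping counts toward each process's virtual
# size immediately, but pages are faulted in (and counted as resident)
# only when actually touched; file-backed read-only pages are shared
# between all processes mapping the same file.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    size, head = len(mm), mm[:4]
    print(size, head)  # 4096 b'xxxx'
    mm.close()

os.unlink(path)
```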
Re: smtpd processes congregating at the pub
Wietse Venema put forth on 1/29/2010 6:15 AM:
> Stan Hoeppner:
>> Based on purely visual non-scientific observation (top), it seems my smtpd
>> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
>> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
>
> Perhaps Debian changed this:
> http://www.postfix.org/postconf.5.html#max_idle
>
> The Postfix default is 100s.

Yes, I confirmed this on my system.

> I don't really see why anyone would shorten this - that's a waste
> of CPU cycles. In particular, stopping Postfix daemons after 10s
> means that people don't have a clue about what they are doing.
> The fact that it's now increased to 30s confirms my suspicion.

Think of a lightly loaded (smtp connects/min) vanity domain server that functions as a Postfix MX with local delivery, plus a Dovecot IMAP server, Lighty+Roundcube, a Samba server, and a DNS resolver serving local requests and one remote workstation. The system is also used interactively (via SSH/bash) for a number of things, including an occasional kernel compile. The machine has only 384MB of RAM.

My smtp load is low enough that an smtpd process or two hanging around for 100 seconds just wastes 13-18MB of memory per smtpd for 80-90 of those 100 seconds. This system regularly goes 5 minutes or more between smtp connects. Sometimes two come in simultaneously, and I end up with two smtpd processes hanging around for 100 seconds, eating over 30MB of RAM with no benefit. Thus, for me, it makes more sense to have the smtpd's exit as soon as possible, freeing memory that can be better used for something else. Yes, I guess I'm a maniac. ;)

In this scenario, with very infrequent smtpd reuse, do you still think I should let them idle for 100 seconds, or at all? From my perspective, that 18-30MB+ can often be better utilized during that time.

--
Stan
Re: smtpd processes congregating at the pub
On 1/29/2010 1:37 AM, Stan Hoeppner wrote:
> Stan Hoeppner put forth on 1/29/2010 12:27 AM:
>> Based on purely visual non-scientific observation (top), it seems my smtpd
>> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
>> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
>> quickly after the transaction was complete. An smtpd process' lifespan was
>> usually 10 seconds or less on my 2.3.8. In 2.5.5 smtpd's seem to hang around
>> for up to 30 secs to a minute.
>>
>> Local shows very speedy delivery. Is this "long" smtpd process lifespan normal
>> for 2.5.5 or did I do something screwy/wrong in my config?
>>
>> relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
>> relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
>> relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
>> relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
>> relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
>> relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent
>
> I think I found it:
>
> max_idle = x
>
> The default is 100 on my system. I changed it to 10 and that seems to have had
> an effect. Did this setting exist in 2.3.8? I didn't see a version note next to
> max_idle in my 2.5.5 man smtpd. If so, was the default something insanely low
> like 1, or 0? Like I said, smtpd's seemed to come and go in a hurry on 2.3.8.

Nitpick: you talk about smtpd, then show log snips from smtp. But no matter, they both honor max_idle and will behave in a similar manner.

The max_idle default has been 100s pretty much forever. The idea is that an idle Postfix process will be reused to do more work rather than starting a new process every time. This makes Postfix *far* more efficient than one process per job. Although the 100s default is somewhat arbitrary, I have trouble imagining a situation where a shorter max_idle makes sense.

On a very lightly loaded system where processes are seldom reused, a shorter max_idle might not hurt anything, but it won't help anything either.

-- Noel Jones
Re: smtpd processes congregating at the pub
Stan Hoeppner:
> Based on purely visual non-scientific observation (top), it seems my smtpd
> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very

Perhaps Debian changed this:
http://www.postfix.org/postconf.5.html#max_idle

The Postfix default is 100s.

I don't really see why anyone would shorten this - that's a waste of CPU cycles. In particular, stopping Postfix daemons after 10s means that people don't have a clue about what they are doing. The fact that it's now increased to 30s confirms my suspicion.

Technical correctness: the Postfix master does not terminate processes. Processes terminate voluntarily.

	Wietse
Re: smtpd processes congregating at the pub
Stan Hoeppner put forth on 1/29/2010 12:27 AM:
> Based on purely visual non-scientific observation (top), it seems my smtpd
> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
> quickly after the transaction was complete. An smtpd process' lifespan was
> usually 10 seconds or less on my 2.3.8. In 2.5.5 smtpd's seem to hang around
> for up to 30 secs to a minute.
>
> Local shows very speedy delivery. Is this "long" smtpd process lifespan normal
> for 2.5.5 or did I do something screwy/wrong in my config?
>
> relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
> relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
> relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
> relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
> relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
> relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent

I think I found it:

max_idle = x

The default is 100 on my system. I changed it to 10 and that seems to have had an effect. Did this setting exist in 2.3.8? I didn't see a version note next to max_idle in my 2.5.5 man smtpd. If so, was the default something insanely low like 1, or 0? Like I said, smtpd's seemed to come and go in a hurry on 2.3.8.

--
Stan
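For anyone decoding the delays= fields in the log snippets above: per the Postfix logging documentation, delays=a/b/c/d breaks down as a = time before the queue manager (including message receipt), b = time in the queue manager, c = connection setup time, d = message transmission time. A quick way to pull that field out of a log line (a sketch using sed):

```shell
# Extract the delays=a/b/c/d field from a Postfix delivery log line.
# a = time before queue manager, b = time in queue manager,
# c = connection setup, d = message transmission (per the Postfix docs).
echo 'relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent' |
  sed -n 's/.*delays=\([^,]*\).*/\1/p'
```

For the snippets in this thread, nearly all of the total delay sits in the first component (before the queue manager), which is consistent with fast local delivery.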