On Fri, Jul 13, 2007 at 07:44:08PM -0700, Paul B. Henson wrote:
On Wed, 11 Jul 2007, Robert Felber wrote:
This could happen if all policyd-weight processes are tied up. That should
be logged with "MAX_PROX NN reached". How many policyd-weight children do
you have at such moments?
There are no instances of that message in my logs. I currently have the
maximum number of processes set to 100.
Alternatively, what is your kernel setting for somaxconn? If it is 128
then you should increase it to 1024 or some higher value (this is a
general recommendation for any server). This isn't being logged by
policyd-weight, as this cannot be detected by polw.
somaxconn is currently the default, which I believe is 128.
You should really increase this. I will update the setup howto as well.
This level has caused many problems in the past.
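
For anyone wanting to check their own value, here is a minimal sketch. It
assumes a Linux host, where the limit is exposed under /proc; other systems
use a different mechanism and the path will not exist there. The function
names are only illustrative and are not part of policyd-weight.

```python
from pathlib import Path
from typing import Optional

# Assumption: Linux exposes the limit at this path. On other systems the
# file does not exist and read_somaxconn() returns None.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")
RECOMMENDED = 1024  # value suggested in this thread


def read_somaxconn(path: Path = SOMAXCONN_PATH) -> Optional[int]:
    """Return the current limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def needs_increase(current: int, recommended: int = RECOMMENDED) -> bool:
    """True if the current limit is below the recommended value."""
    return current < recommended


if __name__ == "__main__":
    current = read_somaxconn()
    if current is None:
        print("could not read somaxconn on this system")
    elif needs_increase(current):
        print(f"somaxconn is {current}; consider raising it to {RECOMMENDED}")
    else:
        print(f"somaxconn is {current}; no change needed")
```

The script only reads the value; changing it is left to the usual system
tools and needs root.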
I currently have the maximum number of postfix smtp processes set to 300,
so the theory here is that all 100 policyd-weight processes are busy, 128
postfix processes are attempting to connect and sitting in the listen
queue, and then the 129th+ processes get connection timed out?
Yes, because the policyd-weight children are all in an accept state. If the kernel
doesn't provide a socket descriptor due to somaxconn issues, policyd-weight
returns to accept() on its listen socket.
At some point postfix will time out.
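
To put numbers on the theory as stated (300 postfix smtp processes, 100
policyd-weight processes, a listen queue of 128), here is a rough sketch.
It is a simplified model only: it assumes every client connects at once
and that the kernel queues exactly "backlog" connections, which real
kernels do not guarantee. The function name is hypothetical.

```python
def split_connections(clients: int, workers: int, backlog: int) -> dict:
    """Split simultaneous clients into served, queued, and unserved.

    Simplified model: each worker handles one client, the listen queue
    holds at most `backlog` pending connections, and the rest get nothing.
    """
    being_served = min(clients, workers)
    queued = min(clients - being_served, backlog)
    unserved = clients - being_served - queued
    return {
        "being_served": being_served,
        "queued": queued,
        "unserved": unserved,
    }


if __name__ == "__main__":
    # Numbers taken from this thread.
    print(split_connections(clients=300, workers=100, backlog=128))
```

Under these assumptions 72 of the 300 clients are neither served nor
queued, and those are the ones that would eventually time out. With a
backlog of 1024 the same model leaves none unserved.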
But that
doesn't make sense, because shouldn't policyd-weight log a notification
when it tried to start the 101st process which would have exceeded the
maximum?
Yes. How many policyd-weight instances are up at this time?
The only way the queue backlog should exceed 128 is if that many
connections are made without policyd-weight doing an accept?
Or not being able to do a sane accept().
--
Robert Felber (PGP: 896CF30B)
Munich, Germany
Policyd-weight Mailinglist - http://www.policyd-weight.org/