Re: timeout while reading input attribute name

2007-07-14 Thread Robert Felber
On Fri, Jul 13, 2007 at 07:17:18PM -0700, Paul B. Henson wrote:
 On Wed, 11 Jul 2007, Robert Felber wrote:
 
  Which version?
 
 0.1.14.5, it looks like.
 
 
  Any warnings, error messages in advance?
 
 The only error messages I recall seeing are:
 
 
 policyd-weight[812]: rbl_lookup: unknown error

This error should occur only with versions prior to
0.1.14.2. Those versions had a bug in which the DNS nonce
was sometimes 0 and was then handled incorrectly, leading to the above error.


 policyd-weight[9910]: warning: ignoring garbage: 1

hm.

 
 They don't seem to correlate with the timeouts...

The first one probably does not, but I am not certain about
the second. It would be interesting to know what caused it.

I currently cannot imagine a scenario which would
send 1 to policyd-weight.
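
For illustration only, here is a minimal sketch of how a policy daemon might
read one request. It relies on the Postfix policy delegation format
(name=value lines, ended by an empty line). The function name
parse_policy_request is hypothetical, this is not policyd-weight's actual
code, and treating any line without "=" as garbage is an assumption about
what triggers the warning above.

```python
def parse_policy_request(lines):
    """Split one policy request into attributes and unparseable lines.

    Hypothetical helper: a request is a series of name=value lines
    terminated by an empty line. Lines without "=" (or with an empty
    name) are collected as garbage instead of being stored.
    """
    attrs = {}
    garbage = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # An empty line marks the end of the request.
            break
        name, sep, value = line.partition("=")
        if not sep or not name:
            # Assumption: this is the kind of input reported as garbage.
            garbage.append(line)
            continue
        attrs[name] = value
    return attrs, garbage


request = [
    "request=smtpd_access_policy\n",
    "1\n",
    "client_address=192.0.2.1\n",
    "\n",
]
attrs, garbage = parse_policy_request(request)
```

Under that assumption, a bare "1" on its own line would land in garbage
while the surrounding attributes are still parsed normally.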


-- 
Robert Felber (PGP: 896CF30B)
Munich, Germany


Policyd-weight Mailinglist - http://www.policyd-weight.org/


Re: timeout while reading input attribute name

2007-07-14 Thread Robert Felber
On Fri, Jul 13, 2007 at 07:44:08PM -0700, Paul B. Henson wrote:
 On Wed, 11 Jul 2007, Robert Felber wrote:
 
  This could happen if all policyd-weight processes are busy. This should
  be logged with MAX_PROX NN reached. How many policyd-weight children do
  you have at such moments?
 
 There are no instances of that message in my logs. I currently have the
 maximum number of processes set to 100.
 
 
  Alternatively, what is your kernel setting for somaxconn? If it is 128
  then you should increase it to 1024 or some higher value (this is a
  general recommendation for any server). This isn't being logged by
  policyd-weight, as this cannot be detected by polw.
 
 somaxconn is currently the default, which I believe is 128.

You should really increase this. I will update the setup howto as well.
The default of 128 has caused many problems in the past.
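
A minimal sketch for checking the current value, assuming a Linux host where
the setting is exposed at /proc/sys/net/core/somaxconn (other systems expose
it differently). The helper names and RECOMMENDED_MIN are made up for this
example; 1024 is simply the value suggested above.

```python
from pathlib import Path

# Threshold taken from the recommendation in this thread, not a kernel constant.
RECOMMENDED_MIN = 1024


def read_somaxconn(path="/proc/sys/net/core/somaxconn"):
    """Return the kernel listen-queue limit as an int (Linux procfs assumed)."""
    return int(Path(path).read_text().strip())


def effective_backlog(requested, somaxconn):
    """Backlog a listening socket really gets.

    On Linux, the backlog passed to listen() is silently capped at somaxconn.
    """
    return min(requested, somaxconn)


def needs_increase(somaxconn, minimum=RECOMMENDED_MIN):
    """True if the limit is below the value recommended in this thread."""
    return somaxconn < minimum
```

With the default of 128, needs_increase(128) is true and a daemon asking for
a backlog of 1024 would still only get 128. On Linux the value can be raised
with sysctl -w net.core.somaxconn=1024.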


 I currently have the maximum number of postfix smtp processes set to 300,
 so the theory here is that all 100 policyd-weight processes are busy, 128
 postfix processes are attempting to connect and sitting in the listen
 queue, and then the 129th+ processes get connection timed out?

Yes, because the policyd-weight children are all in an accept state. If the
kernel doesn't provide a socket descriptor due to somaxconn issues,
policyd-weight returns to accept() on its listen socket.

At some point Postfix will time out.
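
The arithmetic of that scenario can be sketched as follows. This is a
deliberately simplified model: it assumes every smtpd process connects at the
same moment and ignores kernel-specific details of how the listen queue is
counted. The function name connection_breakdown is made up for this example.

```python
def connection_breakdown(clients, workers, backlog):
    """Split simultaneous clients into served, queued and stalled.

    Simplified model: each worker serves one client, up to `backlog`
    further clients wait in the listen queue, and the rest cannot be
    queued and eventually run into the client-side timeout.
    """
    served = min(clients, workers)
    queued = min(clients - served, backlog)
    stalled = clients - served - queued
    return served, queued, stalled


# Numbers from this thread: 300 smtpd processes, 100 policyd-weight
# processes, somaxconn at the default of 128.
print(connection_breakdown(300, 100, 128))   # (100, 128, 72)

# Same load with somaxconn raised to 1024.
print(connection_breakdown(300, 100, 1024))  # (100, 200, 0)
```

In this model, 72 of the 300 connections have nowhere to wait with a queue of
128, while a queue of 1024 holds all 200 waiting connections.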


 But that
 doesn't make sense, because shouldn't policyd-weight log a notification
 when it tried to start the 101st process which would have exceeded the
 maximum?

Yes. How many policyd-weight instances are up at this time?

 The only way the queue backlog should exceed 128 is if that many
 connections are made without policyd-weight doing an accept?

Or not being able to do a sane accept().


-- 
Robert Felber (PGP: 896CF30B)
Munich, Germany

