On 2010. 09. 15. 12:51, Willy Tarreau wrote:
On Wed, Sep 15, 2010 at 11:34:29AM +0200, Jozsef R.Nagy wrote:
Hey,
I think I've found what is causing this, after looking at the debug logs:
Serving requests just goes on for a while, then suddenly:
000003c0:my-webfarm.srvcls[0009:000a]
000003c0:my-webfarm.clicls[0009:000a]
000003c0:my-webfarm.closed[0009:000a]
[ALERT] 257/101918 (78231) : accept(): cannot set the socket in non blocking mode. Giving up
Good catch! It's the first time I've ever seen that error. What
annoys me most is that it does not look possible. The file descriptor
passed to fcntl() in session_accept() is the same as the one returned
by accept() in stream_sock_accept(). So what I'm suspecting now is
that either something corrupts the stack, or that someone closes the
same FD by error at one point. In either case, it's not funny at all :-(
Have you found a minimal way to reproduce this ? Also did you have the
tcp-request rules enabled in the conf causing this issue ?
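
To illustrate the second hypothesis above (the FD being closed by mistake before fcntl() runs), here is a minimal sketch in Python rather than HAProxy's own C. It is not HAProxy code: `set_nonblocking` and `demo` are hypothetical names, and a `socketpair()` stands in for the descriptor that accept() would return so the example runs without a network. It only shows that setting O_NONBLOCK on an already-closed descriptor fails with EBADF, which is one plausible way to end up with the alert quoted above.

```python
import errno
import fcntl
import os
import socket


def set_nonblocking(fd):
    """Try to put fd in non-blocking mode.

    Returns None on success, or the errno value if fcntl() fails.
    """
    try:
        fcntl.fcntl(fd, fcntl.F_SETFL, os.O_NONBLOCK)
    except OSError as exc:
        return exc.errno
    return None


def demo():
    # Assumption: a socketpair stands in for the FD returned by accept().
    a, b = socket.socketpair()
    fd = a.detach()  # take ownership of the raw descriptor

    before = set_nonblocking(fd)  # healthy descriptor: expected to succeed

    os.close(fd)  # simulate "someone closes the same FD by error"
    after = set_nonblocking(fd)  # same FD number, now invalid

    b.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before stray close:", before)
    print("after stray close: ", errno.errorcode.get(after, after))
```

On a Unix-like system the second call should report EBADF. In a real process the FD number could also be reused by another open() or accept() in the meantime, in which case fcntl() would succeed on the wrong descriptor instead of failing.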
No minimal way yet, the config is the 'full' one I sent over
previously, with 2 listens (and no frontend/backend blocks) and the mods
you've recommended:
### (d)dos protection ###
stick-table type ip size 1m expire 5m store gpc0,conn_rate(10s)
acl source_is_abuser src_get_gpc0 gt 0
tcp-request connection track-sc1 src if ! source_is_abuser
acl conn_rate_abuse sc1_conn_rate gt 40
acl mark_as_abuser sc1_inc_gpc0 gt 0
use_backend ease-up if source_is_abuser
use_backend ease-up if conn_rate_abuse mark_as_abuser
So yea tcp-request rules were enabled.
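
For anyone reading along, the rules above can be modelled roughly as follows. This is a simplified sketch in Python, not how HAProxy implements stick tables: it uses an exact sliding window where HAProxy uses its own rate counters, it assumes an entry expires 5 minutes after it was last tracked, and the names `StickTableModel`, `route` and the `"default"` backend are made up for illustration.

```python
from collections import deque

RATE_WINDOW = 10.0  # conn_rate(10s)
RATE_LIMIT = 40     # sc1_conn_rate gt 40
EXPIRE = 300.0      # expire 5m


class Entry:
    def __init__(self, now):
        self.gpc0 = 0            # general purpose counter (abuser flag)
        self.conns = deque()     # timestamps of tracked connections
        self.last_update = now   # last time the entry was tracked


class StickTableModel:
    """Toy model of the table keyed by source IP."""

    def __init__(self):
        self.entries = {}

    def _lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is not None and now - entry.last_update >= EXPIRE:
            del self.entries[ip]  # assumed expiry behaviour
            return None
        return entry

    def route(self, ip, now):
        entry = self._lookup(ip, now)

        # acl source_is_abuser src_get_gpc0 gt 0
        if entry is not None and entry.gpc0 > 0:
            # Flagged sources are not tracked, so the entry is not
            # refreshed and can expire once the source has been flagged
            # for EXPIRE seconds.
            return "ease-up"

        # tcp-request connection track-sc1 src if ! source_is_abuser
        if entry is None:
            entry = self.entries[ip] = Entry(now)
        entry.last_update = now
        entry.conns.append(now)
        while entry.conns and now - entry.conns[0] >= RATE_WINDOW:
            entry.conns.popleft()

        # use_backend ease-up if conn_rate_abuse mark_as_abuser
        # (the counter is only incremented when the rate test matched)
        if len(entry.conns) > RATE_LIMIT:
            entry.gpc0 += 1
            return "ease-up"

        return "default"  # hypothetical name for the normal path
```

Passing the clock in as `now` keeps the model deterministic: 41 calls to `route()` for the same address within 10 seconds flag it, and later calls for that address go to ease-up without being counted until the entry expires.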
Not sure how to narrow it down to a minimal reproduction, as it has only
happened 4 times on the production setup, and I can't really afford having
it dead a few more times atm :/
On the test instance I can't reproduce it yet, probably because there
simply isn't enough traffic or concurrency?
Thanks,
Willy
Thanks, Joe