Re: 1.5 badly dies after a few seconds

2010-09-21 Thread Cyril Bonté
On Tuesday 21 September 2010 07:38:53, Willy Tarreau wrote: Hi Cyril, On Tue, Sep 21, 2010 at 01:50:45AM +0200, Cyril Bonté wrote: Hi Willy and Jozsef, On Monday 20 September 2010 23:42:44, R.Nagy József wrote: (...) Very nice, now we know that the FD does not get corrupted,

Re: 1.5 badly dies after a few seconds

2010-09-21 Thread R.Nagy József
Hi Cyril, Just a big thanks for picking up the issue and being able to reproduce it! :) Cheers, Joe Quoting (Cyril Bonté cyril.bo...@free.fr): On Tuesday 21 September 2010 07:38:53, Willy Tarreau wrote: Hi Cyril, On Tue, Sep 21, 2010 at 01:50:45AM +0200, Cyril Bonté wrote: Hi Willy and

SOLVED: 1.5 badly dies after a few seconds

2010-09-21 Thread Willy Tarreau
József, Cyril, I have good news. Your diagnosis was right, Cyril: the call to EV_FD_CLR() was the root cause. I could simulate the bug by adding a close(cfd) just after the accept(). I won't paraphrase my explanation in the attached commit. I'm pretty sure it's OK now. The bad news is that we also

Re: SOLVED: 1.5 badly dies after a few seconds

2010-09-21 Thread Cyril Bonté
On Tuesday 21 September 2010 21:24:41, Willy Tarreau wrote: József, Cyril, I have good news. Your diagnosis was right, Cyril: the call to EV_FD_CLR() was the root cause. I could simulate the bug by adding a close(cfd) just after the accept(). I won't paraphrase my explanation in the attached

Re: SOLVED: 1.5 badly dies after a few seconds

2010-09-21 Thread Willy Tarreau
On Tue, Sep 21, 2010 at 09:44:58PM +0200, Cyril Bonté wrote: On Tuesday 21 September 2010 21:24:41, Willy Tarreau wrote: József, Cyril, I have good news. Your diagnosis was right, Cyril: the call to EV_FD_CLR() was the root cause. I could simulate the bug by adding a close(cfd) just after

Re: SOLVED: 1.5 badly dies after a few seconds

2010-09-21 Thread R.Nagy József
Great news! Thanks for fixing it quickly, I should be able to give it a try as soon as tomorrow. :) Cheers, Joe Quoting (Willy Tarreau w...@1wt.eu): On Tue, Sep 21, 2010 at 09:44:58PM +0200, Cyril Bonté wrote: On Tuesday 21 September 2010 21:24:41, Willy Tarreau wrote: József, Cyril, I have

Re: 1.5 badly dies after a few seconds

2010-09-20 Thread R.Nagy József
Quoting (Willy Tarreau w...@1wt.eu): Hi Joe, On Sun, Sep 19, 2010 at 12:38:48AM +0200, R.Nagy József wrote: Okay so here it is, let it die (after just 71 secs this time!) with modified sources and socket stats: fantastic! Relevant message from debug window: setsockopt: Connection reset by

Re: 1.5 badly dies after a few seconds

2010-09-20 Thread Cyril Bonté
Hi Willy and Jozsef, On Monday 20 September 2010 23:42:44, R.Nagy József wrote: (...) Very nice, now we know that the FD does not get corrupted, but when haproxy wants to use it, it's already closed on the other side. Probably a TCP rule causes a reject that closes the connection
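
The failure mode described in this message (the peer has already reset the connection by the time the proxy tries to configure the freshly accepted socket) can be illustrated with a small Python sketch. This is not the fix from the haproxy commit mentioned elsewhere in the thread, whose content is not shown here; `accept_one` and `configure` are hypothetical names used only for illustration. The idea is simply that a per-connection error drops that one connection and leaves the listener usable.

```python
import socket


def accept_one(listener, configure):
    """Accept one connection and apply `configure` to it.

    If configuring the new socket fails (for example because the peer
    already reset the connection), close only that socket and return
    None, so the caller can keep accepting on the same listener.
    """
    conn, _addr = listener.accept()
    try:
        configure(conn)  # e.g. conn.setblocking(False) or a setsockopt() call
    except OSError:
        # Per-connection failure: drop this client, do not give up on the listener.
        conn.close()
        return None
    return conn
```

A caller would loop on `accept_one(listener, lambda c: c.setblocking(False))` and skip `None` results.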

Re: 1.5 badly dies after a few seconds

2010-09-20 Thread Willy Tarreau
Hi Cyril, On Tue, Sep 21, 2010 at 01:50:45AM +0200, Cyril Bonté wrote: Hi Willy and Jozsef, On Monday 20 September 2010 23:42:44, R.Nagy József wrote: (...) Very nice, now we know that the FD does not get corrupted, but when haproxy wants to use it, it's already closed on the other

Re: 1.5 badly dies after a few seconds

2010-09-18 Thread Willy Tarreau
Hi Joe, On Thu, Sep 16, 2010 at 04:49:00PM +0200, R.Nagy József wrote: Some more details, let the production server suffer 2 more times to test a narrowed down config. The new config only worked as a rate limiter 1.5.dev haproxy instance, and had a running 1.3 instance in the background

Re: 1.5 badly dies after a few seconds

2010-09-18 Thread R.Nagy József
Hi Joe, On Thu, Sep 16, 2010 at 04:49:00PM +0200, R.Nagy József wrote: Some more details, let the production server suffer 2 more times to test a narrowed down config. The new config only worked as a rate limiter 1.5.dev haproxy instance, and had a running 1.3 instance in the background

Re: 1.5 badly dies after a few seconds

2010-09-18 Thread Willy Tarreau
On Sat, Sep 18, 2010 at 03:40:39PM +0200, R.Nagy József wrote: $ socat readline unix-connect:/tmp/haproxy.sock prompt show info show stat show sess show table show table mySite-webfarm I'm particularly interested in those outputs, they will make it easier to find if we're
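
The socat session quoted in this message could also be scripted. Below is a minimal Python sketch, assuming the stats socket path /tmp/haproxy.sock from the quoted command and assuming the socket is used without `prompt`, so that each connection carries a single command and the server closes it after replying. `query_stats_socket` is a hypothetical helper name, not something provided by haproxy.

```python
import socket


def query_stats_socket(path, command, timeout=5.0):
    """Send one command to a Unix stats socket and return the whole reply.

    Assumes one command per connection: the reply is read until the
    server closes the socket.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")
```

For example, `query_stats_socket("/tmp/haproxy.sock", "show info")` would be called once for each of the commands listed in the message (show info, show stat, show sess, show table).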

Re: 1.5 badly dies after a few seconds

2010-09-18 Thread R.Nagy József
Okay so here it is, let it die (after just 71 secs this time!) with modified sources and socket stats: Relevant message from debug window: setsockopt: Connection reset by peer [ALERT] 260/230253 (30302) : frontend_accept(): cannot set the socket 11 in non blocking mode. Giving up 2nd try:

Re: 1.5 badly dies after a few seconds

2010-09-18 Thread Willy Tarreau
Hi Joe, On Sun, Sep 19, 2010 at 12:38:48AM +0200, R.Nagy József wrote: Okay so here it is, let it die (after just 71 secs this time!) with modified sources and socket stats: fantastic! Relevant message from debug window: setsockopt: Connection reset by peer [ALERT] 260/230253 (30302) :

Re: 1.5 badly dies after a few seconds

2010-09-16 Thread R.Nagy József
Some more details, let the production server suffer 2 more times to test a narrowed down config. The new config only worked as a rate limiter 1.5.dev haproxy instance, and had a running 1.3 instance in the background doing the real backend game. So for the 1.5 rate limiter -still dying-

Re: 1.5 badly dies after a few seconds

2010-09-16 Thread Jozsef R.Nagy
On 2010. 09. 16. 18:14, Ross West wrote: RNJ Compiled from latest source, by make -f Makefile.bsd REGEX=pcre RNJ DEBUG= COPTS.generic=-Os -fomit-frame-pointer (no mgnu) Just getting caught up on this thread and the fact it's on Freebsd. :-) Since you're not using gmake, you have to watch

Re: 1.5 badly dies after a few seconds

2010-09-16 Thread Jozsef R.Nagy
Unfortunately the same issue :( Died after less than 10mins with: [ALERT] 258/182824 (70506) : accept(): cannot set the socket in non blocking mode. Giving up On 2010. 09. 16. 19:23, Jozsef R.Nagy wrote: On 2010. 09. 16. 18:14, Ross West wrote: RNJ Compiled from latest source, by make -f

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Willy Tarreau
On Wed, Sep 15, 2010 at 07:17:32AM +0200, R.Nagy József wrote: My bad, most likely. After killing haproxy process completely -instead of just config reloads-, and restarting it, problem can't be reproduced anymore without rate limiting config. OK, thanks for this clarification. So most

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread R.Nagy József
Thank you for the heads up, Managed to put it in a single listen block, and it worked! Temporarily! ;( Was fine on the testing environment, but after putting it into production, haproxy went wild after 40 mins, and then after 20 mins in the next round. 'Wild' being it returned Error instead of serving

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Willy Tarreau
On Wed, Sep 15, 2010 at 10:17:53AM +0200, R.Nagy József wrote: Thank you for the heads up, Managed to put it in a single listen block, and it worked! Temporarily! ;( Was fine on the testing environment, but after putting it into production, haproxy went wild after 40 mins, and then after 20 mins in

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Jozsef R.Nagy
You should simply disable the anti-dos protection to check the difference. Also, I can recommend you to enable the stats socket in the global config, so that you can inspect your tables or even delete entries : global stats socket /var/run/haproxy.sock level admin stats timeout

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Willy Tarreau
On Wed, Sep 15, 2010 at 10:45:12AM +0200, Jozsef R.Nagy wrote: Also, I think that what you're experiencing is that your block levels are too low and that once an IP is blocked, it remains blocked because the user continues to try to access the site. That's fairly impossible, why would static

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Jozsef R.Nagy
Hey, I think I found the reason causing this, after looking and logging debug: Serving requests just goes on for a while, then suddenly: 03c0:my-webfarm.srvcls[0009:000a] 03c0:my-webfarm.clicls[0009:000a] 03c0:my-webfarm.closed[0009:000a] [ALERT] 257/101918 (78231) : accept(): cannot

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Willy Tarreau
On Wed, Sep 15, 2010 at 11:34:29AM +0200, Jozsef R.Nagy wrote: Hey, I think I found the reason causing this, after looking and logging debug: Serving requests just goes on for a while, then suddenly: 03c0:my-webfarm.srvcls[0009:000a] 03c0:my-webfarm.clicls[0009:000a]

Re: 1.5 badly dies after a few seconds

2010-09-15 Thread Jozsef R.Nagy
On 2010. 09. 15. 15:08, Willy Tarreau wrote: On Wed, Sep 15, 2010 at 01:00:57PM +0200, Jozsef R.Nagy wrote: Have you found a minimal way to reproduce this ? Also did you have the tcp-request rules enabled in the conf causing this issue ? No minimal way yet, the config is the

1.5 badly dies after a few seconds

2010-09-14 Thread Jozsef R.Nagy
Hello guys, Just been testing 1.5dev2 (and most recent snapshot as well) on FreeBSD, evaluating it for its anti-DoS capabilities. The strange thing is... it starts up just fine, serves a few pages just fine, then it returns blank pages. After a minute or so it will deliver a few pages again and

Re: 1.5 badly dies after a few seconds

2010-09-14 Thread Andrew Azarov
I've also noticed it dies within some time with ddos protection, I've tested 1.5-dev1 On 14.09.2010 23:39, Jozsef R.Nagy wrote: Hello guys, Just been testing 1.5dev2 (and most recent snapshot as well) on freebsd, evaluating it for its anti-dos capabilities. The strange thing is..it starts

Re: 1.5 badly dies after a few seconds

2010-09-14 Thread Willy Tarreau
On Tue, Sep 14, 2010 at 11:39:05PM +0200, Jozsef R.Nagy wrote: Hello guys, Just been testing 1.5dev2 (and most recent snapshot as well) on freebsd, evaluating it for its anti-dos capabilities. The strange thing is..it starts up just fine, serves a few pages just fine then it returns blank

Re: 1.5 badly dies after a few seconds

2010-09-14 Thread Willy Tarreau
Hi, On Wed, Sep 15, 2010 at 12:45:28AM +0200, Andrew Azarov wrote: I've also noticed it dies within some time with ddos protection, I've tested 1.5-dev1 Did you manage to get a core or something ? Could you post your config so that we can check what is happening ? Thanks, Willy

Re: 1.5 badly dies after a few seconds

2010-09-14 Thread R.Nagy József
My bad, most likely. After killing haproxy process completely -instead of just config reloads-, and restarting it, problem can't be reproduced anymore without rate limiting config. So most likely it was simply rejecting the request where it seemed to be serving 'random' blank pages due to