Re: set weight bug?

2013-11-27 Thread Igor
Hi, Willy, after upgraded to haproxy-ss-20131122, enable and disable
servers via socket will crash haproxy, there's no this issue in
haproxy-ss-20131031.

Bests,
-Igor


On Thu, Nov 21, 2013 at 10:42 PM, Willy Tarreau w...@1wt.eu wrote:
 Hi Igor,

 On Thu, Nov 21, 2013 at 09:03:05PM +0800, Igor wrote:
 Thanks Willy, because I use snapshot haproxy in production, so I have
 no change to do more investigation, glad you  could reproduce the bug
 :)

 now you'll have something a bit more reliable to work with. I've just
 committed the two following fixes :

 e7b7348 BUG/MEDIUM: checks: fix slow start regression after fix attempt
 004e045 BUG/MAJOR: server: weight calculation fails for map-based algorithms

 The first one is a crash when using slowstart without checks. It's not your
 case but you'll probably want to fix it in case you happen to switch to
 slowstart. I encountered it while trying to reproduce your bug based on
 Lukas' findings.

 The second one fixes the issue you're facing which is also what Lukas
 noticed (wrong weight after a set weight on a map-based algorithm).
 I can confirm that it resulted in the total backend weight to be larger
 than the table, causing out of bounds accesses after a weight change
 as soon as the map indx went far enough (depending on your load and
 the total initial weights, it could take a few seconds to a few minutes).

 The fix is quite large because I wanted to get rid of all places where
 the computations were hazardous (and there were quite a few). From my
 opinion and my tests, it now correcty covers all situations.

 Thanks for reporting it!

 Willy




Re: set weight bug?

2013-11-27 Thread Willy Tarreau
Hi Igor,

On Wed, Nov 27, 2013 at 08:46:38PM +0800, Igor wrote:
 Hi, Willy, after upgraded to haproxy-ss-20131122, enable and disable
 servers via socket will crash haproxy, there's no this issue in
 haproxy-ss-20131031.

I don't understand, I cannot reproduce this with your config, at
least the config I used to reproduce and fix the previous bugs :-(

Maybe you have something different there ?

Thanks,
Willy




Re: set weight bug?

2013-11-27 Thread Willy Tarreau
On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote:
 It's almost the same except some servers with weight=0 in conf, script
 to disable/enable servers works fine with haproxy-ss-20131031, but
 with  haproxy-ss-20131122 or newer, it will crash haproxy after
 several disable/enable commands via socket. I can mail you the conf if
 need.

That would definitely help, as even by doing what you describe above
I still fail to reproduce the issue.

Thanks Igor,
Willy




Re: set weight bug?

2013-11-27 Thread Willy Tarreau
On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote:
 It's almost the same except some servers with weight=0 in conf, script
 to disable/enable servers works fine with haproxy-ss-20131031, but
 with  haproxy-ss-20131122 or newer, it will crash haproxy after
 several disable/enable commands via socket. I can mail you the conf if
 need.

OK thank you for your config, Igor. It definitely helped. The issue was
related to the track servers with the recent changes. It caused an
(almost) infinite recursion. It was easier to see here because I have
no server responding to your addresses, so the process crashes as soon
as the first server fails the first check.

I'm attaching the fix to apply on top of your snapshot, that I'm going
to merge into mainline.

Simon, I suggest you apply the attached patch on top of your development
branch in case you still work on the lb agent checks, because then you're
affected as well.

Thanks,
Willy

From 151cd51485f8ebe47aa3b2f2eedb604d4c2cfd0d Mon Sep 17 00:00:00 2001
From: Willy Tarreau w...@1wt.eu
Date: Wed, 27 Nov 2013 16:52:23 +0100
Subject: BUG/MAJOR: fix haproxy crash when using server tracking instead of
 checks

Igor at owind reported a very recent bug (just present in latest snapshot).
Commit 4a741432 MEDIUM: Paramatise functions over the check of a server
causes up/down to die with tracked servers due to a typo.

The following call in set_server_down causes the server to put itself
down recurseively because check is the current server's check, so once
fed to the function again, it will pass through the exact same path (note
we have the exact symmetry in set_server_up) :

for (srv = s-tracknext; srv; srv = srv-tracknext)
if (!(srv-state  SRV_MAINTAIN))
/* Only notify tracking servers that are not already in 
maintenance. */
set_server_down(check);

Instead we should stop the tracking server being visited in the loop :

for (srv = s-tracknext; srv; srv = srv-tracknext)
if (!(srv-state  SRV_MAINTAIN))
/* Only notify tracking servers that are not already in 
maintenance. */
set_server_down(srv-check);

But that's not exactly enough because srv-check-server is only set when
checks are enabled, so -server is NULL for tracking servers, still causing a
crash upon first iteration. The fix is easy and consists in always initializing
check-server when creating a new server, which is what was already done a few
patches later by 69d29f9 (MEDIUM: cfgparse: Factor out check initialisation).

With the fix above alone on top of current version or snapshot 20131122, the
problem disappears.

Thanks to Igor for testing and reporting the issue.
---
 src/checks.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/checks.c b/src/checks.c
index cfbb8a3..a722c12 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -445,10 +445,10 @@ void set_server_down(struct check *check)
s-counters.down_trans++;
 
if (s-state  SRV_CHECKED)
-   for(srv = s-tracknext; srv; srv = srv-tracknext)
-   if (! (srv-state  SRV_MAINTAIN))
+   for (srv = s-tracknext; srv; srv = srv-tracknext)
+   if (!(srv-state  SRV_MAINTAIN))
/* Only notify tracking servers that 
are not already in maintenance. */
-   set_server_down(check);
+   set_server_down(srv-check);
}
 
check-health = 0; /* failure */
@@ -520,10 +520,10 @@ void set_server_up(struct check *check) {
send_log(s-proxy, LOG_NOTICE, %s.\n, trash.str);
 
if (s-state  SRV_CHECKED)
-   for(srv = s-tracknext; srv; srv = srv-tracknext)
-   if (! (srv-state  SRV_MAINTAIN))
+   for (srv = s-tracknext; srv; srv = srv-tracknext)
+   if (!(srv-state  SRV_MAINTAIN))
/* Only notify tracking servers if 
they're not in maintenance. */
-   set_server_up(check);
+   set_server_up(srv-check);
}
 
if (check-health = s-rise)
-- 
1.7.12.1



Re: set weight bug?

2013-11-27 Thread Simon Horman
On Wed, Nov 27, 2013 at 05:09:47PM +0100, Willy Tarreau wrote:
 On Wed, Nov 27, 2013 at 10:41:12PM +0800, Igor wrote:
  It's almost the same except some servers with weight=0 in conf, script
  to disable/enable servers works fine with haproxy-ss-20131031, but
  with  haproxy-ss-20131122 or newer, it will crash haproxy after
  several disable/enable commands via socket. I can mail you the conf if
  need.
 
 OK thank you for your config, Igor. It definitely helped. The issue was
 related to the track servers with the recent changes. It caused an
 (almost) infinite recursion. It was easier to see here because I have
 no server responding to your addresses, so the process crashes as soon
 as the first server fails the first check.
 
 I'm attaching the fix to apply on top of your snapshot, that I'm going
 to merge into mainline.
 
 Simon, I suggest you apply the attached patch on top of your development
 branch in case you still work on the lb agent checks, because then you're
 affected as well.

Thanks and sorry for the bug.

 
 Thanks,
 Willy
 

 From 151cd51485f8ebe47aa3b2f2eedb604d4c2cfd0d Mon Sep 17 00:00:00 2001
 From: Willy Tarreau w...@1wt.eu
 Date: Wed, 27 Nov 2013 16:52:23 +0100
 Subject: BUG/MAJOR: fix haproxy crash when using server tracking instead of
  checks
 
 Igor at owind reported a very recent bug (just present in latest snapshot).
 Commit 4a741432 MEDIUM: Paramatise functions over the check of a server
 causes up/down to die with tracked servers due to a typo.
 
 The following call in set_server_down causes the server to put itself
 down recurseively because check is the current server's check, so once
 fed to the function again, it will pass through the exact same path (note
 we have the exact symmetry in set_server_up) :
 
   for (srv = s-tracknext; srv; srv = srv-tracknext)
   if (!(srv-state  SRV_MAINTAIN))
   /* Only notify tracking servers that are not already in 
 maintenance. */
   set_server_down(check);
 
 Instead we should stop the tracking server being visited in the loop :
 
   for (srv = s-tracknext; srv; srv = srv-tracknext)
   if (!(srv-state  SRV_MAINTAIN))
   /* Only notify tracking servers that are not already in 
 maintenance. */
   set_server_down(srv-check);
 
 But that's not exactly enough because srv-check-server is only set when
 checks are enabled, so -server is NULL for tracking servers, still causing a
 crash upon first iteration. The fix is easy and consists in always 
 initializing
 check-server when creating a new server, which is what was already done a few
 patches later by 69d29f9 (MEDIUM: cfgparse: Factor out check initialisation).
 
 With the fix above alone on top of current version or snapshot 20131122, the
 problem disappears.
 
 Thanks to Igor for testing and reporting the issue.
 ---
  src/checks.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)
 
 diff --git a/src/checks.c b/src/checks.c
 index cfbb8a3..a722c12 100644
 --- a/src/checks.c
 +++ b/src/checks.c
 @@ -445,10 +445,10 @@ void set_server_down(struct check *check)
   s-counters.down_trans++;
  
   if (s-state  SRV_CHECKED)
 - for(srv = s-tracknext; srv; srv = srv-tracknext)
 - if (! (srv-state  SRV_MAINTAIN))
 + for (srv = s-tracknext; srv; srv = srv-tracknext)
 + if (!(srv-state  SRV_MAINTAIN))
   /* Only notify tracking servers that 
 are not already in maintenance. */
 - set_server_down(check);
 + set_server_down(srv-check);
   }
  
   check-health = 0; /* failure */
 @@ -520,10 +520,10 @@ void set_server_up(struct check *check) {
   send_log(s-proxy, LOG_NOTICE, %s.\n, trash.str);
  
   if (s-state  SRV_CHECKED)
 - for(srv = s-tracknext; srv; srv = srv-tracknext)
 - if (! (srv-state  SRV_MAINTAIN))
 + for (srv = s-tracknext; srv; srv = srv-tracknext)
 + if (!(srv-state  SRV_MAINTAIN))
   /* Only notify tracking servers if 
 they're not in maintenance. */
 - set_server_up(check);
 + set_server_up(srv-check);
   }
  
   if (check-health = s-rise)
 -- 
 1.7.12.1
 




RE: set weight bug?

2013-11-06 Thread Lukas Tribus
Hi!


 Here is my config http://pastie.org/private/wf0dv30krqpasgmhtdnahw
 (Deleted some servers and two backends for clear config)

 I used script to handle servers weight since haproxy-ss-20131031, so I
 never tried previous versions.

Sorry, still can't reproduce.


For problem 1 (after setting weight the unix socket, authentication on the
http stats doesn't work anymore):
can you send us the output of the following curl test, when its working and
another output when its not working:

 curl -v http://admin:passw0rd@127.0.0.1:11190/ha-stats; /dev/null


For problem 2 (haproxy crashes after multiple fast set weight commands):
please generate a coredump, that will help the developers here understand
what happens:

http://www.mail-archive.com/haproxy@formilux.org/msg09472.html



Regards,

Lukas 


RE: set weight bug?

2013-11-05 Thread Lukas Tribus
Hi Igor,


 Using newest snapshot, when I do

 echo set weight s1/p1 100| socat stdio /tmp/haproxy

 to a server already has weight 100, then fresh haproxy's stat page, it
 requires password, and it doesn't accept the right password set in
 stats auth until I reload the haproxy.

 I have a script to set servers weight, I found sometimes set weight to
 servers rapidly, like multi echo set weight s(*)/p(*) 100| socat
 stdio /tmp/haproxy, will crash haproxy daemon.


I can't reproduce neither problem. Can you post your configurations so we
can try to reproduce?

If you know a certain snapshot/release where this worked fine for you,
please tell, this will help reducing the regression range (if its a bug).



Regards,

Lukas 


Re: set weight bug?

2013-11-05 Thread Igor
Here is my config http://pastie.org/private/wf0dv30krqpasgmhtdnahw
(Deleted some servers and two backends for clear config)

I used script to handle servers weight since haproxy-ss-20131031, so I
never tried previous versions.

Bests,
-Igor


On Wed, Nov 6, 2013 at 5:55 AM, Lukas Tribus luky...@hotmail.com wrote:
 Hi Igor,


 Using newest snapshot, when I do

 echo set weight s1/p1 100| socat stdio /tmp/haproxy

 to a server already has weight 100, then fresh haproxy's stat page, it
 requires password, and it doesn't accept the right password set in
 stats auth until I reload the haproxy.

 I have a script to set servers weight, I found sometimes set weight to
 servers rapidly, like multi echo set weight s(*)/p(*) 100| socat
 stdio /tmp/haproxy, will crash haproxy daemon.


 I can't reproduce neither problem. Can you post your configurations so we
 can try to reproduce?

 If you know a certain snapshot/release where this worked fine for you,
 please tell, this will help reducing the regression range (if its a bug).



 Regards,

 Lukas