Re: maintenance mode and server affinity

2011-08-02 Thread Willy Tarreau
Hi James,

On Mon, Aug 01, 2011 at 04:05:41PM -0400, James Bardin wrote:
> I have a number of instances using tcp mode, and a stick-table on src
> ip for affinity. When a server is in maintenance mode, clients with an
> existing affinity will still connect to the disabled server, and only
> be re-dispatched if the connection fails (and error responses from the
> backend are still successful tcp connections).

Are you sure your server was set in maintenance mode, did you not just
set its weight to zero ?

There is a big difference between zero weight and maintenance mode :
  - zero weight means the server is not selected in load balancing,
    which means it will not get any new visitor, but will still get
    existing visitors ;

  - maintenance means the server is offline and must not receive
    any traffic at all, except the admin's tests selected with
    force-persist rules.

So if this is not what you're observing, then it's a bug and we need
to see how to reproduce it in order to fix it.
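
To make the distinction concrete, the two states can be set from the
stats socket roughly as follows. This is only a sketch: it assumes a
stats socket declared with admin level, and bk_app/app1 is a
hypothetical backend/server pair, not one taken from this thread.

```
set weight bk_app/app1 0
disable server bk_app/app1
enable server bk_app/app1
```

The first command only removes the server from load balancing, so
clients with an existing affinity still reach it. The second puts it in
maintenance mode, and the third brings it back.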

> I've done a few things to stop this traffic when needed:
>  - drop the packets on the load balancer with a null route or iptables.
>  - block the packets with the firewall on the backend server, and let
>    the clients get re-dispatched.
>  - shut down the services that could respond from the backend, and
>    re-dispatch.
>
> Have I missed any configuration in haproxy that will completely stop
> traffic to a backend? I have no problem managing this as-is myself,
> but having fewer pieces involved makes delegating administration
> responsibilities easier.

I agree with you. The maintenance mode was done exactly for what you
need so I want to ensure it works.

> Willy, is a block server option (or maybe a drop table to get rid
> of affinity sessions), something that could be implemented?

I think the latter can be done on the stats socket using clear table,
because you can specify a rule to select which entries to clear, so you
can clear any entry matching your server's ID. But it's only in 1.5, not
in a stable release.
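
As a hedged illustration of that suggestion, the stats socket commands
in the 1.5 development branch would look something like the lines
below. The table name bk_app and the server ID 1 are placeholders, and
the exact filter syntax may vary between development snapshots.

```
show table bk_app
clear table bk_app data.server_id eq 1
```

The first command lists the current entries so the server ID can be
checked before anything is removed. The second drops only the entries
whose stored server ID matches.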

Regards,
Willy




Re: maintenance mode and server affinity

2011-08-02 Thread James Bardin
On Tue, Aug 2, 2011 at 2:52 AM, Willy Tarreau w...@1wt.eu wrote:


> Are you sure your server was set in maintenance mode, did you not just
> set its weight to zero ?


Yes. I've confirmed that when using a stick-table for persistence,
putting a server in maintenance mode does not block traffic from
existing sessions.

I'm using the latest stable 1.4.15, built on centos5.



> I think the latter can be done on the stats socket using clear table,
> because you can specify a rule to select which entries to clear, so you
> can clear any entry matching your server's ID. But it's only in 1.5, not
> in a stable release.

I saw the clear table in the dev version after I sent this. Since it
seems that I'm experiencing a bug in maintenance mode, the proper
behavior combined with clear table would be everything I need.


If you need any more info to help troubleshoot this, let me know.

-jim



Re: maintenance mode and server affinity

2011-08-02 Thread Willy Tarreau
On Tue, Aug 02, 2011 at 09:00:08AM -0400, James Bardin wrote:
> On Tue, Aug 2, 2011 at 2:52 AM, Willy Tarreau w...@1wt.eu wrote:
>
> > Are you sure your server was set in maintenance mode, did you not just
> > set its weight to zero ?
>
> Yes. I've confirmed that when using a stick-table for persistence,
> putting a server in maintenance mode does not block traffic from
> existing sessions.
>
> I'm using the latest stable 1.4.15, built on centos5.

OK thanks for confirming. Could you check if you have option persist
somewhere in your config ? From what I can tell from the code, this is
the only reason why a server set in maintenance mode would be selected :

        if ((srv->state & SRV_RUNNING) ||
            (px->options & PR_O_PERSIST) ||
            (s->flags & SN_FORCE_PRST)) {
                s->flags |= SN_DIRECT | SN_ASSIGNED;
                set_target_server(&s->target, srv);
        }

- the server does not have the SRV_RUNNING flag in maintenance mode
- the persist option on the backend might be one reason
- I'm assuming there is no force-persist rule

If you have option persist, you should definitely remove it, as it's
done exactly for the behaviour you're experiencing : force a persistent
connection to go to a server even if it's marked as dead, and only
redispatch in case of connection error.
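
The three-way condition above can be modelled in isolation. The sketch
below is not HAProxy code: the flag values are made up for illustration
(the real definitions live in the HAProxy sources), and
sticky_server_usable is a hypothetical helper name. It only shows why a
server without SRV_RUNNING is still chosen once option persist is set.

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only; the real
 * definitions are in the HAProxy sources and may differ. */
#define SRV_RUNNING    0x0001u  /* server is up (cleared in maintenance mode) */
#define PR_O_PERSIST   0x0001u  /* backend has "option persist" */
#define SN_FORCE_PRST  0x0001u  /* session matched a force-persist rule */

/* Returns true when a stick-table match may be sent straight to the
 * recorded server, mirroring the OR of the three tests quoted above. */
static bool sticky_server_usable(unsigned srv_state, unsigned px_options,
                                 unsigned s_flags)
{
    return (srv_state & SRV_RUNNING) ||
           (px_options & PR_O_PERSIST) ||
           (s_flags & SN_FORCE_PRST);
}
```

With these stand-in values, sticky_server_usable(0, 0, 0) is false (a
server in maintenance is skipped), while
sticky_server_usable(0, PR_O_PERSIST, 0) is true, which is the
behaviour reported in this thread.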

> > I think the latter can be done on the stats socket using clear table,
> > because you can specify a rule to select which entries to clear, so you
> > can clear any entry matching your server's ID. But it's only in 1.5, not
> > in a stable release.
>
> I saw the clear table in the dev version after I sent this. Since it
> seems that I'm experiencing a bug in maintenance mode, the proper
> behavior combined with clear table would be everything I need.
>
> If you need any more info to help troubleshoot this, let me know.

If you don't have option persist, please post your config (or send it
privately if you prefer). Anyway, *please* remove any possible password
or sensitive information from the config if you send it.

Willy




Re: maintenance mode and server affinity

2011-08-02 Thread James Bardin
On Tue, Aug 2, 2011 at 2:44 PM, Willy Tarreau w...@1wt.eu wrote:


> OK thanks for confirming. Could you check if you have option persist
> somewhere in your config ? From what I can tell from the code, this is
> the only reason why a server set in maintenance mode would be selected :
>
>         if ((srv->state & SRV_RUNNING) ||
>             (px->options & PR_O_PERSIST) ||
>             (s->flags & SN_FORCE_PRST)) {
>                 s->flags |= SN_DIRECT | SN_ASSIGNED;
>                 set_target_server(&s->target, srv);
>         }
>
> - the server does not have the SRV_RUNNING flag in maintenance mode
> - the persist option on the backend might be one reason
> - I'm assuming there is no force-persist rule


OK, that's it.

I didn't realize that was the same code path for manually disabled
servers. I had option persist in there to prevent a server that misses
a few healthchecks under load from dumping all its clients. Graceful
maintenance is more important than this edge case though, so I'll
remove it.


Thanks!
-jim



Re: maintenance mode and server affinity

2011-08-02 Thread Willy Tarreau
On Tue, Aug 02, 2011 at 03:08:32PM -0400, James Bardin wrote:
> > - the server does not have the SRV_RUNNING flag in maintenance mode
> > - the persist option on the backend might be one reason
> > - I'm assuming there is no force-persist rule
>
> OK, that's it.

Fine!

> I didn't realize that was the same code path for manually disabled
> servers. I had option persist in there to prevent a server that misses
> a few healthchecks under load from dumping all its clients. Graceful
> maintenance is more important than this edge case though, so I'll
> remove it.

If you want a server that misses a few health checks to remain up, then
simply increase its fall parameter :-)

Thanks for the quick reply !
Willy
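
As a rough sketch of the outcome of this exchange (option persist
dropped, fall raised instead), a 1.4-style backend might look like the
fragment below. The backend name, server names, addresses, table size
and timer values are placeholders, not taken from James's
configuration.

```
backend bk_app
    mode tcp
    balance roundrobin
    stick-table type ip size 200k expire 30m
    stick on src
    # no "option persist" here, so a server in maintenance mode
    # no longer receives clients with an existing affinity
    server app1 192.0.2.10:8080 check inter 2s fall 5 rise 2
    server app2 192.0.2.11:8080 check inter 2s fall 5 rise 2
```

With fall 5, a server is only marked down after five consecutive failed
checks, which covers the occasional missed check under load without
keeping traffic on a server that was disabled on purpose.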




maintenance mode and server affinity

2011-08-01 Thread James Bardin
I have a number of instances using tcp mode, and a stick-table on src
ip for affinity. When a server is in maintenance mode, clients with an
existing affinity will still connect to the disabled server, and only
be re-dispatched if the connection fails (and error responses from the
backend are still successful tcp connections).

I've done a few things to stop this traffic when needed:
 - drop the packets on the load balancer with a null route or iptables.
 - block the packets with the firewall on the backend server, and let
the clients get re-dispatched.
 - shut down the services that could respond from the backend, and re-dispatch.


Have I missed any configuration in haproxy that will completely stop
traffic to a backend? I have no problem managing this as-is myself,
but having fewer pieces involved makes delegating administration
responsibilities easier.

Willy, is a block server option (or maybe a drop table to get rid
of affinity sessions), something that could be implemented?


Thanks,
-jim