Baptiste <[email protected]> wrote:

> On Fri, Mar 27, 2015 at 11:04 PM, Michael Bayer
> <[email protected]> wrote:
>> Hi there -
>> 
>> We are evaluating different ways of using the "stick match" and "backup"
>> options in order to achieve certain behaviors for failover, involving a
>> Galera cluster. For reference, we're picking and choosing from this blog
>> post:
>> http://blog.haproxy.com/2014/01/17/emulating-activepassing-application-clustering-with-haproxy/.
>> 
>> We're basically looking to have all connections on only one server at all
>> times; as soon as a server is available, everyone should be kicked off and
>> blocked from any other servers in the backend. This is because while Galera
>> is a "multi-master" cluster, we're trying to have the application only use
>> one node as "master" at a time, at least for the moment.
>> 
>> It seems that using just the "stick" table alone, as in:
>> 
>>        backend db-vms-galera
>>            option httpchk
>>            stick-table type ip size 1
>>            stick on dst
>>            timeout server 90m
>>            server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>            server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>            server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>> 
>> is not enough; from my observations running the "show table" socket command
>> and watching the logs, when a node goes down, its entry still remains in the
>> stick table for some period of time, and new connections, assuming they
>> continue to come in fast, have no choice but to skip stick match altogether
>> (I can provide logs and samples that show this happening).
>> 
>> So we would then gather that the best configuration is this:
>> 
>>        backend db-vms-galera
>>            option httpchk
>>            stick-table type ip size 1
>>            stick on dst
>>            timeout server 90m
>>            server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>            server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>            server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>> 
>> This works a lot better: when I kill node1, all the connections go to node2
>> unambiguously. Then our Galera cluster, in an effort to bring node1 back up,
>> puts node2 into read-only mode, which bounces all connections to node3. At
>> this point node1 comes back, and the stick table no longer appears to be of
>> any use; most connections continue to go to node3, and querying the stick
>> table shows that it's set to node3, but a handful of requests also go back
>> to node1. So again this is not a pure "active/passive" setup; multiple
>> nodes get hit at the same time. I was wondering if this behavior could be
>> clarified, and how we can use the stick table as an absolute "gate" for all
>> requests, so that no connections fall through when a server goes down or
>> comes back up.
>> 
>> I had hopes for the unusual setup of just making all three servers a
>> "backup" server:
>> 
>>        backend db-vms-galera
>>            option httpchk
>>            timeout server 90m
>>            server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>            server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>            server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>> 
>> The idea being that no server is "official", so there's nothing to "fail
>> back" towards. This sort of worked, but HAProxy still seems to want to
>> "fail back" up from node3 to node2 to node1. This approach does do the
>> best job of keeping all connections on one server, though not perfectly,
>> since it doesn't bump off connections still talking to the server being
>> replaced.
>> 
>> Any insight on the usage of the stick table here would be appreciated!
> 
> 
> Hi Michael,
> 
> Can you add the 'nopurge' option on your stick-table statement and
> tell us if that fixes your issue?

Hi Baptiste -

I had tried the “nopurge” option, but not in conjunction with the two
“backup” servers. So yes, with “nopurge” set and two backup servers, even
when server 1 fails, its entry stays permanently in the stick table, so as
soon as it’s back up, requests go back to node 1 more aggressively. I’ll
experiment more with this setting - thanks!
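
For reference, the backend I'm testing now is the same as the second
configuration above, with only the stick-table line changed to add
"nopurge" (a sketch; everything else is unchanged):

       backend db-vms-galera
           option httpchk
           stick-table type ip size 1 nopurge
           stick on dst
           timeout server 90m
           server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
           server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
           server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup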

I would like to understand this better, though; it seems like a gray area
in HAProxy as to how the stick table interacts with non-backup servers that
are down and backup servers that are active. When “nopurge” is not set, the
proxy gets into a state where node3, a backup node, is the one logged in
the stick table, and requests go there. But then, as node1 comes back up,
HAProxy seems to route connections somewhat randomly to either node1 (the
non-backup server that’s up) or node3 (the backup server that nevertheless
matches in the “stick” table), specifically when it has to handle two
near-simultaneous connection requests.

Running “show table db-vms-galera” in a loop shows node3 is
persistently in the table:

# table: db-vms-galera, type: ip, size:1, used:1
0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3

# table: db-vms-galera, type: ip, size:1, used:1
0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3

# table: db-vms-galera, type: ip, size:1, used:1
0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3

… continues like this ...
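
For anyone reproducing this, here is a quick way to pull the entries out of
these dumps while polling (a Python sketch; the regex simply mirrors the
entry format shown above):

```python
import re

# One stick-table entry per line, e.g.:
#   0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3
ENTRY_RE = re.compile(
    r"key=(?P<key>\S+) use=(?P<use>\d+) exp=(?P<exp>\d+) server_id=(?P<sid>\d+)"
)

def parse_show_table(dump):
    """Extract the entries from one 'show table' dump."""
    entries = []
    for line in dump.splitlines():
        m = ENTRY_RE.search(line)
        if m:
            entries.append({
                "key": m.group("key"),
                "use": int(m.group("use")),
                "exp": int(m.group("exp")),
                "server_id": int(m.group("sid")),
            })
    return entries

dump = """\
# table: db-vms-galera, type: ip, size:1, used:1
0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3
"""
print(parse_show_table(dump))
# [{'key': '192.168.1.200', 'use': 0, 'exp': 0, 'server_id': 3}]
```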

And the logs make it clear connections are going to either node - note in
particular that it seems to occur when two requests come in at “exactly”
the same time (see 12:14:11.258 on ports 41795 and 41796, and 12:14:11.260
on 41797 and 41798), which looks a lot like a race condition:

Mar 28 12:14:16 localhost haproxy[30229]: 192.168.1.118:41795 [28/Mar/2015:12:14:11.258] vip-db db-vms-galera/rhos-node3 1/2/5250 182230 -- 5/5/5/3/0 0/0
Mar 28 12:14:16 localhost haproxy[30229]: 192.168.1.118:41799 [28/Mar/2015:12:14:11.261] vip-db db-vms-galera/rhos-node3 1/0/5346 142414 -- 4/4/4/2/0 0/0
Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41797 [28/Mar/2015:12:14:11.260] vip-db db-vms-galera/rhos-node3 1/0/5861 142312 -- 4/4/4/2/0 0/0
Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41798 [28/Mar/2015:12:14:11.260] vip-db db-vms-galera/rhos-node1 1/0/6053 142301 -- 4/4/4/1/0 0/0
Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41796 [28/Mar/2015:12:14:11.258] vip-db db-vms-galera/rhos-node1 1/2/6317 142180 -- 5/5/5/0/0 0/0
Mar 28 12:14:21 localhost haproxy[30229]: 192.168.1.118:41800 [28/Mar/2015:12:14:16.508] vip-db db-vms-galera/rhos-node3 1/0/5166 119038 -- 5/5/5/5/0 0/0
Mar 28 12:14:21 localhost haproxy[30229]: 192.168.1.118:41801 [28/Mar/2015:12:14:16.608] vip-db db-vms-galera/rhos-node3 1/0/5270 158273 -- 5/5/5/5/0 0/0
Mar 28 12:14:22 localhost haproxy[30229]: 192.168.1.118:41802 [28/Mar/2015:12:14:17.120] vip-db db-vms-galera/rhos-node3 1/0/5187 158359 -- 5/5/4/4/0 0/0
Mar 28 12:14:23 localhost haproxy[30229]: 192.168.1.118:41804 [28/Mar/2015:12:14:17.562] vip-db db-vms-galera/rhos-node3 1/0/5694 158702 CD 4/4/4/4/0 0/0
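
To quantify the split, the log lines can be tallied by backend server
(again just a Python sketch; it keys off the "db-vms-galera/<server>"
field visible in the log format above):

```python
import re
from collections import Counter

# The "backend/server" field as it appears in the haproxy logs above.
SERVER_RE = re.compile(r"db-vms-galera/(\S+)")

def connections_per_server(log_lines):
    """Count how many logged connections each backend server handled."""
    counts = Counter()
    for line in log_lines:
        m = SERVER_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# A few entries excerpted from the log output above
sample = [
    "vip-db db-vms-galera/rhos-node3 1/2/5250 182230",
    "vip-db db-vms-galera/rhos-node1 1/0/6053 142301",
    "vip-db db-vms-galera/rhos-node1 1/2/6317 142180",
    "vip-db db-vms-galera/rhos-node3 1/0/5166 119038",
]
print(connections_per_server(sample))
```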

I’m running HAProxy as a single process (no nbproc setting), so the
appearance of a race is surprising here, since I understand HAProxy uses a
single-threaded, event-driven model for internal concurrency.

Can the rules and behaviors of HAProxy in this area be clarified?

> 
> Baptiste
