Baptiste <[email protected]> wrote:
> On Fri, Mar 27, 2015 at 11:04 PM, Michael Bayer
> <[email protected]> wrote:
>> Hi there -
>>
>> We are evaluating different ways of using the "stick match" and "backup"
>> options in order to achieve certain behaviors for failover, involving a
>> Galera cluster. For reference, we're picking and choosing from this blog
>> post:
>> http://blog.haproxy.com/2014/01/17/emulating-activepassing-application-clustering-with-haproxy/.
>>
>> We're basically looking to have all connections on only one server at all
>> times; as soon as a server is available, everyone should be kicked off and
>> blocked from any other servers in the backend. This is because while Galera
>> is a "multi-master" cluster, we're trying to have the application only use
>> one node as "master" at a time, at least for the moment.
>>
>> It seems like using just the "stick" table alone, as in:
>>
>>     backend db-vms-galera
>>         option httpchk
>>         stick-table type ip size 1
>>         stick on dst
>>         timeout server 90m
>>         server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>         server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>         server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>
>> is not enough; from my observations running the "show table" socket command
>> and watching the logs, when a node goes down, its entry still remains in the
>> stick table for some period of time, and new connections, assuming they
>> continue to come in fast, have no choice but to skip the stick match
>> altogether (I can provide logs and samples that show this happening).
>>
>> So we would then gather that the best configuration is this:
>>
>>     backend db-vms-galera
>>         option httpchk
>>         stick-table type ip size 1
>>         stick on dst
>>         timeout server 90m
>>         server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
>>         server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>         server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>
>> This works a lot better; when I kill node1, all the connections go to node2
>> unambiguously. Then our Galera cluster, in an effort to bring node1 back up,
>> puts node2 into read-only mode, which bounces all connections to node3. At
>> this point node1 comes back, and the stick table does not appear to be of
>> any use; most connections continue to go to node3, and querying the stick
>> table shows that it's set to node3, but a handful of requests also go back
>> to node1. So again this is not a pure "active/passive" setup; multiple
>> nodes get hit at the same time. I was wondering if this behavior could be
>> clarified, and how we can use the stick table as an absolute "gate" for all
>> requests, where no connections will fall through if a server goes down or
>> comes back up.
>>
>> I had hopes for the unusual setup of just making all three servers a
>> "backup" server:
>>
>>     backend db-vms-galera
>>         option httpchk
>>         timeout server 90m
>>         server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>         server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>         server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
>>
>> The idea being, no server is "official", so there's nothing to "fail back"
>> towards.
>> This sort of seemed to work, but it still seems like it wants to "fail
>> back" up from node3 to node2 to node1. This approach does seem to do the
>> best job of making sure all connections are on one server, though; not
>> perfectly, because it doesn't bump off connections still talking to the
>> being-replaced server, but still fairly well.
>>
>> Any insight on the usage of the stick table here would be appreciated!
>
>
> Hi Michael,
>
> Can you add the 'nopurge' option on your stick-table statement and
> tell us if that fixes your issue?

Hi Baptiste -

I had tried the "nopurge" option, but not in conjunction with the two
"backup" servers. So yes: when we have "nopurge" on in conjunction with two
backup servers, even when server 1 fails, it stays permanently in the stick
table, so that as soon as it's back up the requests seem to go back to
node1 more aggressively. I'll experiment more with this setting - thanks!

I would like to understand this better, though; it seems like a gray area
in HAProxy as to how the stick table interacts with non-backup servers that
are down and backup servers that are active. When "nopurge" is not set, the
proxy gets into the state where node3, a backup node, is the one logged in
the stick table, and requests go there. But then as node1 comes back up,
HAProxy seems to route connections somewhat randomly to either node1 (the
non-backup server that's up) or node3 (the backup server that is
nevertheless matching in the "stick" table) - more specifically, when it
needs to handle two near-simultaneous connection requests.
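For reference, the variant now under test looks like this - it is the
two-"backup" configuration from earlier, unchanged except for "nopurge" on
the stick-table line:

    backend db-vms-galera
        option httpchk
        stick-table type ip size 1 nopurge
        stick on dst
        timeout server 90m
        server rhos-node1 rhel7-1:3306 check inter 1s port 9200 on-marked-down shutdown-sessions
        server rhos-node2 rhel7-2:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup
        server rhos-node3 rhel7-3:3306 check inter 1s port 9200 on-marked-down shutdown-sessions backup

The behavior described next is what I observe with this same configuration
minus the "nopurge" keyword.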
Running "show table db-vms-galera" in a loop shows node3 is persistently in
the table:

    # table: db-vms-galera, type: ip, size:1, used:1
    0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3
    # table: db-vms-galera, type: ip, size:1, used:1
    0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3
    # table: db-vms-galera, type: ip, size:1, used:1
    0x7fb5a6215e84: key=192.168.1.200 use=0 exp=0 server_id=3
    ... continues like this ...

and the logs make it clear that connections are going to either node. Note
in particular that it seems to occur when two requests come in at "exactly"
the same time (see 12:14:11.258 on ports 41795 and 41796, and 12:14:11.260
on 41797 and 41798), which looks a lot like a race condition:

    Mar 28 12:14:16 localhost haproxy[30229]: 192.168.1.118:41795 [28/Mar/2015:12:14:11.258] vip-db db-vms-galera/rhos-node3 1/2/5250 182230 -- 5/5/5/3/0 0/0
    Mar 28 12:14:16 localhost haproxy[30229]: 192.168.1.118:41799 [28/Mar/2015:12:14:11.261] vip-db db-vms-galera/rhos-node3 1/0/5346 142414 -- 4/4/4/2/0 0/0
    Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41797 [28/Mar/2015:12:14:11.260] vip-db db-vms-galera/rhos-node3 1/0/5861 142312 -- 4/4/4/2/0 0/0
    Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41798 [28/Mar/2015:12:14:11.260] vip-db db-vms-galera/rhos-node1 1/0/6053 142301 -- 4/4/4/1/0 0/0
    Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41796 [28/Mar/2015:12:14:11.258] vip-db db-vms-galera/rhos-node1 1/2/6317 142180 -- 5/5/5/0/0 0/0
    Mar 28 12:14:21 localhost haproxy[30229]: 192.168.1.118:41800 [28/Mar/2015:12:14:16.508] vip-db db-vms-galera/rhos-node3 1/0/5166 119038 -- 5/5/5/5/0 0/0
    Mar 28 12:14:21 localhost haproxy[30229]: 192.168.1.118:41801 [28/Mar/2015:12:14:16.608] vip-db db-vms-galera/rhos-node3 1/0/5270 158273 -- 5/5/5/5/0 0/0
    Mar 28 12:14:22 localhost haproxy[30229]: 192.168.1.118:41802 [28/Mar/2015:12:14:17.120] vip-db db-vms-galera/rhos-node3 1/0/5187 158359 -- 5/5/4/4/0 0/0
    Mar 28 12:14:23 localhost haproxy[30229]: 192.168.1.118:41804 [28/Mar/2015:12:14:17.562] vip-db db-vms-galera/rhos-node3 1/0/5694 158702 CD 4/4/4/4/0 0/0

I'm using only a single-process HAProxy (no nbproc setting), so the
appearance of a race is unusual here, as I know HAProxy uses an
event-driven model for internal concurrency. Can the rules and behaviors
of HAProxy in this area be clarified?

>
> Baptiste
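P.S. To make the near-simultaneous split easier to see, here is a short
Python sketch (just an illustration on my end, not part of the proxy setup)
that groups a few of the quoted log lines by their bracketed accept
timestamp and reports any timestamp whose connections were dispatched to
more than one server:

```python
import re
from collections import defaultdict

# A few of the log lines quoted above, trimmed to the interesting window.
LOG = """\
Mar 28 12:14:16 localhost haproxy[30229]: 192.168.1.118:41795 [28/Mar/2015:12:14:11.258] vip-db db-vms-galera/rhos-node3 1/2/5250 182230 -- 5/5/5/3/0 0/0
Mar 28 12:14:16 localhost haproxy[30229]: 192.168.1.118:41799 [28/Mar/2015:12:14:11.261] vip-db db-vms-galera/rhos-node3 1/0/5346 142414 -- 4/4/4/2/0 0/0
Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41797 [28/Mar/2015:12:14:11.260] vip-db db-vms-galera/rhos-node3 1/0/5861 142312 -- 4/4/4/2/0 0/0
Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41798 [28/Mar/2015:12:14:11.260] vip-db db-vms-galera/rhos-node1 1/0/6053 142301 -- 4/4/4/1/0 0/0
Mar 28 12:14:17 localhost haproxy[30229]: 192.168.1.118:41796 [28/Mar/2015:12:14:11.258] vip-db db-vms-galera/rhos-node1 1/2/6317 142180 -- 5/5/5/0/0 0/0
"""

# The bracketed accept timestamp and the frontend/backend-server fields
# sit at fixed positions in these log lines, so a simple regex suffices.
pat = re.compile(r"\[(?P<ts>[^\]]+)\] \S+ \S+/(?P<srv>\S+)")

by_ts = defaultdict(set)
for line in LOG.splitlines():
    m = pat.search(line)
    if m:
        by_ts[m.group("ts")].add(m.group("srv"))

# Accept timestamps whose connections were dispatched to more than one
# server -- i.e. the near-simultaneous requests that split across nodes.
split = {ts: sorted(srvs) for ts, srvs in by_ts.items() if len(srvs) > 1}
for ts, srvs in sorted(split.items()):
    print(ts, "->", ", ".join(srvs))
```

With the lines above, it reports the 12:14:11.258 and 12:14:11.260 accepts
as each having been split between rhos-node1 and rhos-node3, while the lone
12:14:11.261 accept stayed on one server.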

