On 13.10.2014 16:54, Baptiste wrote:
On Sun, Oct 12, 2014 at 6:47 PM, Benjamin Vetter <[email protected]> wrote:
Hi,

I'm using the example from
http://blog.haproxy.com/2014/01/17/emulating-activepassing-application-clustering-with-haproxy/
with haproxy 1.5.4 for a 3-node MySQL + Galera setup to implement
active/passive behavior.

global
   log 127.0.0.1 local0
   log 127.0.0.1 local1 notice
   maxconn 8192
   uid 99
   gid 99
   debug
   stats socket    /tmp/haproxy

defaults
   log global
   mode http
   option tcplog
   option dontlognull
   retries 3
   maxconn 8192
   timeout connect 5000
   timeout client 300000
   timeout server 300000

listen mysql-active-passive 0.0.0.0:3309
   stick-table type ip size 1
   stick on dst
   mode tcp
   balance roundrobin
   option httpchk
   server db01 192.168.0.11:3306 check port 9200 inter 12000 rise 3 fall 3 on-marked-down shutdown-sessions
   server db02 192.168.0.12:3306 check port 9200 inter 12000 rise 3 fall 3 on-marked-down shutdown-sessions backup
   server db03 192.168.0.13:3306 check port 9200 inter 12000 rise 3 fall 3 on-marked-down shutdown-sessions backup

I tested the stickiness via this tiny Ruby script, which simply connects and
asks the node for its stored IP address:

require "mysql2"

loop do
   begin
     mysql2 = Mysql2::Client.new(:port => 3309, :host => "192.168.0.10", :username => "username")
     puts mysql2.query("show variables like '%wsrep_sst_rec%'").to_a
     mysql2.close
   rescue
     # Nothing
   end
end

At first, everything's fine. On the first run, the stick table gets updated:

# table: mysql-active-passive, type: ip, size:1, used:1
0x1c90224: key=192.168.0.10 use=0 exp=0 server_id=1
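
(For reference, that dump comes from the stats socket configured in the global section; assuming socat is installed, something like

  echo "show table mysql-active-passive" | socat stdio unix-connect:/tmp/haproxy

should print it.)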

Then I shut down 192.168.0.11. Again, everything's fine, as the stick table
gets updated to:

# table: mysql-active-passive, type: ip, size:1, used:1
0x1c90224: key=192.168.0.10 use=0 exp=0 server_id=2

and all connections now go to db02.

Then I restart/repair 192.168.0.11, and the stick table stays as-is (fine),
so all connections should still go to db02.
However, the output of my script now starts to show:

...
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.12"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.12"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.11"}
{"Variable_name"=>"wsrep_sst_receive_address", "Value"=>"192.168.0.12"}
...

so that sometimes the connection goes to db01 and sometimes to db02.
Do you know what the problem is?

Thanks
   Benjamin




Hi Benjamin,

Could you remove the 'backup' keyword from your server lines and run
the same test?

Baptiste




OK, after more testing and digging into the HAProxy source, it's more or less clear that "size 1" is the problem, contrary to what the blog post says.

Every new client connection requires a slot in the stick table, regardless of whether the new session would match the already existing entry or not. So if the stick table is already full (very likely with "size 1"), HAProxy evicts the single existing entry. As a consequence, the "size" parameter needs to be at least as large as the number of client connections you expect.

This is IMHO a bit counter-intuitive, but with a larger "size" parameter it works as expected.
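
For illustration, the relevant section would then look something like this (the 10k is just an arbitrary example value, not a tuned recommendation):

listen mysql-active-passive 0.0.0.0:3309
   stick-table type ip size 10k
   stick on dst
   mode tcp
   balance roundrobin
   option httpchk
   server db01 192.168.0.11:3306 check port 9200 inter 12000 rise 3 fall 3 on-marked-down shutdown-sessions
   server db02 192.168.0.12:3306 check port 9200 inter 12000 rise 3 fall 3 on-marked-down shutdown-sessions backup
   server db03 192.168.0.13:3306 check port 9200 inter 12000 rise 3 fall 3 on-marked-down shutdown-sessions backup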



