Hi all,
We have been testing stick table replication and were wondering if we
could get some clarification on its operation, and possibly make a
feature request if what we think is happening is actually happening.
Our configuration is as follows:
global
daemon
stats socket /var/run/haproxy.stat mode 600 level admin
pidfile /var/run/haproxy.pid
maxconn 40000
ulimit-n 81000
defaults
mode http
balance roundrobin
timeout connect 4000
timeout client 42000
timeout server 43000
peers loadbalancer_replication
peer instance1 192.168.66.94:7778
peer instance2 192.168.66.95:7778
listen VIP_Name
bind 192.100.1.2:80
mode tcp
balance leastconn
server backup 127.0.0.1:9081 backup non-stick
stick-table type ip size 10240k expire 30m peers loadbalancer_replication
stick on src
option redispatch
option abortonclose
maxconn 40000
server RIP_Name 192.168.66.50 weight 1 check port 80 inter 2000 rise 2 fall 3 minconn 0 maxconn 0 on-marked-down shutdown-sessions
server RIP_Name-1 192.168.66.51:80 weight 1 check inter 2000 rise 2 fall 3 minconn 0 maxconn 0 on-marked-down shutdown-sessions
I have replication working between the devices; our issues come when
one of the nodes is lost and then brought back online.
For example -
We have two copies of haproxy running on two machines, called
instance 1 and instance 2.
Starting setup
instance 1's persistence table
entry 1
entry 2
entry 3
instance 2's persistence table
entry 1
entry 2
entry 3
instance 1 now fails and is no longer communicating with instance 2.
All the users are now connected to instance 2.
Now instance 1 is brought back online.
The users are still connecting to instance 2, but the persistence
table entries on instance 2 are only copied to instance 1 when a
connection is re-established (we see this as the entry's persistence
timeout counter resetting).
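For anyone wanting to reproduce this, the table contents on each
instance can be compared over the stats socket with the standard
"show table" command (requires socat; the sample output below is
only illustrative, entry addresses and expiry values will differ):

```shell
# On each machine, dump the stick table over the admin socket and
# diff the results:
#   echo "show table VIP_Name" | socat stdio /var/run/haproxy.stat
#
# Illustrative output (format as seen on our instances):
sample='# table: VIP_Name, type: ip, size:10485760, used:3
0x24a8c20: key=192.168.66.10 use=0 exp=1754012 server_id=1
0x24a9010: key=192.168.66.11 use=0 exp=1623400 server_id=2
0x24a9400: key=192.168.66.12 use=0 exp=1492817 server_id=1'

# Count the live entries (each data line contains "key="):
echo "$sample" | grep -c 'key='
```

Running that count on both instances after instance 1 rejoins is how
we noticed the tables had diverged.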
So you can end up with
instance 1's persistence table
entry 1
instance 2's persistence table
entry 1
entry 2
entry 3
If you were then to cause the connections to switch over from
instance 2 to instance 1, you would be missing two persistence entries.
Is this expected behaviour?
If it is, would it be possible to request a feature for a socket
command of some sort which, when run on a device, force-synchronises
its persistence table with the other peers? Whichever instance it is
run on would take its persistence table and push it to the other peers.
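As a possible interim workaround, here is only a sketch of the idea:
replay one instance's entries onto the other by converting "show
table" output into "set table" socket commands. This assumes a
haproxy version whose stats socket supports both commands, and in
particular it is an unverified assumption that server_id can be
written via "data.server_id" -- please check the management docs for
your version before relying on this.

```shell
#!/bin/sh
# Sketch: turn "show table" output into "set table" commands that
# could be fed to a peer's admin socket. Assumes a type ip table and
# that server_id is settable as a data type (unverified assumption).

# In real use this would come from the local socket:
#   show_output=$(echo "show table VIP_Name" | socat stdio /var/run/haproxy.stat)
show_output='# table: VIP_Name, type: ip, size:10485760, used:2
0x24a8c20: key=192.168.66.10 use=0 exp=1754012 server_id=1
0x24a9010: key=192.168.66.11 use=0 exp=1623400 server_id=2'

# One "set table" command per entry: extract the client IP from the
# "key=" field and the server id from the last field.
echo "$show_output" | awk '/key=/ {
    split($2, k, "=");            # k[2] = client IP
    split($NF, s, "=");           # s[2] = server_id
    printf "set table VIP_Name key %s data.server_id %s\n", k[2], s[2]
}'
# Each printed line would then be piped into the other instance's
# admin socket (e.g. via socat over SSH).
```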
Mark