Hi,

On Mon, Feb 16, 2015 at 06:07:26AM -0800, David Lang wrote:
> On Mon, 16 Feb 2015, Barry Haycock wrote:
> 
> >I am building a corosync/pacemaker/haproxy HA load balancer in
> >Active/Active mode using ClusterIP. As this is built on RHEL 6.5 I am
> >restricted to using PCS to configure the LB.
> >
> >One of the requirements is to maintain TCP state so that TCP based
> >syslog audit is not lost during a fail over.
> >
> >I have two questions:
> >
> >1) Is it possible, when using conntrackd to maintain TCP state, to
> >have a seamless transition to the remaining LB should one of the
> >servers be shut down? The work group in question cannot afford to
> >lose any messages once the connection has commenced. Some
> >machines will be using a reliable transmission method for syslog
> >such as RELP but others will be using raw TCP.
> >
> >My testing shows that when sending a large number of raw TCP
> >messages via a single connection, the syslog server will lose
> >messages when one of the LBs is shut down or put into standby. The
> >client machine will start ARPing for the MAC address assigned to
> >the VIP till a connection is established with the remaining LB.
> >This can cost us up to 3 seconds' worth of messages. In reality I
> >don't expect such a large amount of traffic to be generated via a
> >single connection. But the work group will not accept the solution
> >if we lose any messages.
> >
> >Will this be a matter of managing the expectations of the work
> >group, that during fail over, messages in transit will be lost
> >when using raw TCP?
> 
> Keep in mind that syncing the session state takes time, and so there
> will always be some window of time that the state exists on one
> machine and not the other. If you are unlucky enough, the failover
> will happen in the small timeframe where the connection data is just
> out of date enough to cause grief. If enough state has been synced
> to keep the connection from being broken, you will not lose any
> data. But there is always going to be a window where a new
> connection is established, and data sent over it, but the backup box
> doesn't know that the connection exists.
> 
> So you do have to set expectations that when things go wrong, there
> may be a small hiccup. I would try to leave the statement general
> rather than trying to specify exactly what conditions could cause
> problems.
> 
> The only way to prevent this would be for the conntrack update
> between machines to happen synchronously (including getting the ack
> from the updated machine that it has saved the data), and that would
> cripple your throughput.
> 
> Also remember that there are other failure conditions that can cause
> you to lose messages. If the receiving software restarts, the
> messages that are in flight will be lost (with plain TCP, everything
> sent but not written to non-volatile media is lost; with RELP, things
> received but not written are lost).
> 
> Really, the only way to not lose something is to have an
> application level acknowledgement that's only sent after the data is
> safe on redundant non-volatile media.
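
A minimal sketch of the application-level acknowledgement idea quoted
above, in Python. Everything in it is an assumption made for
illustration: the newline-delimited framing, the "ACK" reply, and the
names serve_one and send_reliably are hypothetical. It is not RELP and
not what any syslog daemon actually does, and it only covers a single
local disk, not redundant media.

```python
import os
import socket


def serve_one(conn, log_path):
    """Receiver side: acknowledge a message only after it is on disk."""
    with conn, open(log_path, "ab") as log, conn.makefile("rb") as lines:
        for line in lines:
            log.write(line)
            log.flush()
            os.fsync(log.fileno())  # force the data out of the page cache
            conn.sendall(b"ACK\n")  # hypothetical ack, sent after fsync only


def send_reliably(sock, messages):
    """Sender side: return the messages that were never acknowledged."""
    with sock.makefile("rb") as replies:
        for i, msg in enumerate(messages):
            sock.sendall(msg + b"\n")
            if replies.readline() != b"ACK\n":
                # Connection died before the ack arrived; the caller has
                # to resend these once it has reconnected.
                return messages[i:]
    return []
```

The sender keeps every message until its ack comes back, so a failover
costs a resend (and possibly a duplicate) rather than a lost message.
The price is one round trip plus one fsync per message, which is the
same kind of throughput hit as a synchronous conntrack update.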

Not an easy problem to solve. I don't have any experience with it
personally, but wasn't ocf:heartbeat:portblock supposed to help a
bit in such cases?

Thanks,

Dejan

> David Lang
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
