Re: [Linux-HA] Maintaining TCP State and configuring Conntrackd

2015-02-17 Thread Dejan Muhamedagic
Hi,

On Mon, Feb 16, 2015 at 06:07:26AM -0800, David Lang wrote:
 On Mon, 16 Feb 2015, Barry Haycock wrote:
 
 I am building a corosync/pacemaker/haproxy HA load balancer in
 Active/Active mode using ClusterIP. As this is built on RHEL 6.5 I am
 restricted to using PCS to configure the LB.
 
 One of the requirements is to maintain TCP state so that TCP based
 syslog audit is not lost during a fail over.
 
 I have two questions:
 
 1) Is it possible, when using conntrackd to maintain TCP state, to
 have a seamless transition to the remaining LB should one of the
 servers be shut down? The work group in question cannot afford to
 lose any messages once the connection has commenced. Some
 machines will be using a reliable transmission method for syslog
 such as RELP but others will be using raw TCP.
 
 My testing shows that when sending a large number of raw TCP
 messages via a single connection, the syslog server will lose
 messages when one of the LBs is shut down or put into standby. The
 client machine will start ARPing for the MAC address assigned to
 the VIP till a connection is established with the remaining LB.
 This can lose us up to 3 seconds' worth of messages. In reality I
 don't expect such a large amount of traffic to be generated via a
 single connection. But the work group will not accept the solution
 if we lose any messages.
 
 Will this be a matter of managing the expectations of the work
 group, that during fail over, messages in transit will be lost
 when using raw TCP?
 
 Keep in mind that syncing the session state takes time, and so there
 will always be some window of time that the state exists on one
 machine and not the other. If you are unlucky enough, the failover
 will happen in the small timeframe where the connection data is just
 out of date enough to cause grief. If enough state has been synced
 to keep the connection from being broken, you will not lose any
 data. But there is always going to be a window where a new
 connection is established, and data sent over it, but the backup box
 doesn't know that the connection exists.
 
 So you do have to set expectations that when things go wrong, there
 may be a small hiccup. I would try to leave the statement general
 rather than trying to specify exactly what conditions could cause
 problems.
 
 The only way to prevent this would be for the conntrack update
 between machines to happen synchronously (including getting the ack
 from the updated machine that it has saved the data), and that would
 cripple your throughput.
 
 Also remember that there are other failure conditions that can cause
 you to lose messages. If the receiving software restarts, the
 messages that are in flight will be lost (with plain TCP, everything
 sent but not written to non-volatile media is lost; with RELP, things
 received but not written are lost).
 
 Really, the only way to not lose something is to have an
 application-level acknowledgement that's only sent after the data is
 safe on redundant non-volatile media.

Not an easy problem to solve. I don't have any experience with it
personally, but wasn't ocf:heartbeat:portblock supposed to help a
bit in such cases?

Thanks,

Dejan

 David Lang
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Maintaining TCP State and configuring Conntrackd

2015-02-16 Thread David Lang

On Mon, 16 Feb 2015, Barry Haycock wrote:

I am building a corosync/pacemaker/haproxy HA load balancer in Active/Active 
mode using ClusterIP. As this is built on RHEL 6.5 I am restricted to using PCS
to configure the LB.


One of the requirements is to maintain TCP state so that TCP based syslog 
audit is not lost during a fail over.


I have two questions:

1) Is it possible, when using conntrackd to maintain TCP state, to have a
seamless transition to the remaining LB should one of the servers be shut down?
The work group in question cannot afford to lose any messages once the
connection has commenced. Some machines will be using a reliable transmission
method for syslog such as RELP but others will be using raw TCP.


My testing shows that when sending a large number of raw TCP messages via a
single connection, the syslog server will lose messages when one of the LBs is
shut down or put into standby. The client machine will start ARPing for the MAC
address assigned to the VIP till a connection is established with the
remaining LB. This can lose us up to 3 seconds' worth of messages. In reality
I don't expect such a large amount of traffic to be generated via a single
connection. But the work group will not accept the solution if we lose any
messages.


Will this be a matter of managing the expectations of the work group, that 
during fail over, messages in transit will be lost when using raw TCP?


Keep in mind that syncing the session state takes time, and so there will always 
be some window of time that the state exists on one machine and not the other. 
If you are unlucky enough, the failover will happen in the small timeframe where 
the connection data is just out of date enough to cause grief. If enough state 
has been synced to keep the connection from being broken, you will not lose any
data. But there is always going to be a window where a new connection is 
established, and data sent over it, but the backup box doesn't know that the 
connection exists.


So you do have to set expectations that when things go wrong, there may be a 
small hiccup. I would try to leave the statement general rather than trying to 
specify exactly what conditions could cause problems.


The only way to prevent this would be for the conntrack update between machines 
to happen synchronously (including getting the ack from the updated machine that 
it has saved the data), and that would cripple your throughput.


Also remember that there are other failure conditions that can cause you to
lose messages. If the receiving software restarts, the messages that are in
flight will be lost (with plain TCP, everything sent but not written to
non-volatile media is lost; with RELP, things received but not written are lost).


Really, the only way to not lose something is to have an application-level
acknowledgement that's only sent after the data is safe on redundant non-volatile
media.
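
A minimal Python sketch of that idea, offered as an illustration and not as
anything taken from this setup: the receiver writes each line to a log file,
calls fsync, and only then returns an acknowledgement, while the sender keeps
every message it has not yet seen acknowledged. The names serve_once and
send_reliably, the one-line "ACK" reply, and the newline framing are all
assumptions made for the example. RELP defines its own framing and commands,
and fsync only covers local stable storage, not redundant media.

```python
import os
import socket


def serve_once(listener, log_path):
    """Accept one connection; acknowledge each line only after fsync."""
    conn, _ = listener.accept()
    with conn, conn.makefile("rb") as reader, open(log_path, "ab") as log:
        for line in reader:
            log.write(line)
            log.flush()
            os.fsync(log.fileno())   # line is on local stable storage
            conn.sendall(b"ACK\n")   # only now confirm it to the sender


def send_reliably(addr, messages):
    """Send messages one by one; return those never acknowledged."""
    unacked = list(messages)
    try:
        with socket.create_connection(addr) as sock, \
                sock.makefile("rb") as reader:
            while unacked:
                sock.sendall(unacked[0] + b"\n")
                if reader.readline() != b"ACK\n":
                    break            # peer closed or sent something else
                unacked.pop(0)
    except OSError:
        pass                         # connection failed mid-transfer
    return unacked
```

Whatever send_reliably returns after a failover is known to be unconfirmed
and can be resent over a new connection, at the cost of possible duplicates
on the receiving side.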


David Lang


[Linux-HA] Maintaining TCP State and configuring Conntrackd

2015-02-16 Thread Barry Haycock


I am building a corosync/pacemaker/haproxy HA load balancer in Active/Active 
mode using ClusterIP. As this is built on RHEL 6.5 I am restricted to using PCS to
configure the LB.

One of the requirements is to maintain TCP state so that TCP based syslog audit 
is not lost during a fail over. 

I have two questions: 

1) Is it possible, when using conntrackd to maintain TCP state, to have a
seamless transition to the remaining LB should one of the servers be shut down?
The work group in question cannot afford to lose any messages once the
connection has commenced. Some machines will be using a reliable transmission
method for syslog such as RELP but others will be using raw TCP.

My testing shows that when sending a large number of raw TCP messages via a
single connection, the syslog server will lose messages when one of the LBs is
shut down or put into standby. The client machine will start ARPing for the MAC
address assigned to the VIP till a connection is established with the remaining
LB. This can lose us up to 3 seconds' worth of messages. In reality I don't
expect such a large amount of traffic to be generated via a single connection.
But the work group will not accept the solution if we lose any messages.

Will this be a matter of managing the expectations of the work group, that 
during fail over, messages in transit will be lost when using raw TCP?

2) I have been looking for instructions to implement conntrackd as a resource 
using PCS in order to maintain TCP state and haven't had any luck. All 
instructions I have found implement conntrackd using cman. 
If anyone has an example for implementing conntrackd via pcs it would be much 
appreciated.
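
For the failover test described under question 1, one way to measure exactly
what was lost is to number every test message and look for gaps in what the
syslog server stored. The Python sketch below assumes a hypothetical message
layout (a <134> priority, a "failover-test" tag and a "seq=N" field) and
hypothetical helper names; none of it comes from the setup described above.

```python
import re
import socket

SEQ_RE = re.compile(rb"seq=(\d+)")


def make_line(seq):
    """Build one newline-terminated test message carrying a sequence number."""
    # <134> is facility local0 (16 * 8) plus severity info (6).
    return b"<134>failover-test: seq=%d\n" % seq


def send_numbered(addr, first, last):
    """Send messages first..last over a single raw TCP connection."""
    with socket.create_connection(addr) as sock:
        for seq in range(first, last + 1):
            sock.sendall(make_line(seq))


def missing_sequences(log_lines, first, last):
    """Return the sequence numbers in first..last absent from log_lines."""
    seen = set()
    for line in log_lines:
        match = SEQ_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    return [n for n in range(first, last + 1) if n not in seen]
```

Running send_numbered against the VIP while one LB is put into standby, then
passing the stored log lines (as bytes) to missing_sequences, gives the exact
messages that did not arrive instead of an estimate in seconds.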

-- 

Barry

Banpen Fugyou - 10,000 Changes, No surprises


