On Fri, Feb 12, 2010 at 11:17 PM, Steven Dake <[email protected]> wrote:
> On Fri, 2010-02-12 at 12:51 -0700, hj lee wrote:
> > Hi,
> >
> > If there are only two nodes in cluster and their IP addresses are
> > known a priori, then isn't it better to use TCP as a transport layer?
> > With heartbeat, there is a way to configure nodes before starting
> > cluster. Is TCP ever considered in corosync?
> >
>
> Not sure why tcp would be a better transport layer for two nodes.
> TCP/IP's key driving design factor from Darpa was to remain operational
> and _mask faults_ (not detect faults) under nuclear attack where many
> network links and routing systems would be under considerable changing
> stress. As a result TCP/IP is very resilient to faulty networks and
> packet loss but does not provide suitable fault detection. Further
> there is not automatic node discovery in TCP/IP. In short, TCP/IP while
> highly versatile doesn't offer the best characteristics for cluster
> communication.
>
> Finally, Corosync is designed for nway redundant cluster configurations.
> The 2N model is a simplification of the nway redundant model and we
> don't provide special behaviors during 2N operation.
>
> Regards
> -steve

Actually, I changed the code to use TCP just for sending the token, and
it is working very well. "Very well" means I do not see token timeouts
any more. I know this is an ugly hack!

The main reason I want to use TCP is that I am seeing token lost
timeouts under heavy load, so the cluster is divided for a very short
time, which caused some problems in my application. If I run the system
overnight, I usually see one or two token lost timeouts. The easy fix
would be to increase the token timeout, but I have a strict requirement
on the timeout, so I couldn't increase it. So I tried TCP. After adding
a TCP transport just for token transmission, this timeout does not
happen any more.

The token is transmitted by unicast, so this change will work with more
than two nodes. And I think there will be cases or environments where
this TCP token transmission may be useful or work better.
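
For illustration only, here is a minimal Python sketch of the idea of
handing the token to the next node over a TCP unicast connection. It is
not the actual corosync change (that code is in C and is not shown in
this thread); the length-prefixed framing and the names send_token,
recv_token and connect_to_successor are assumptions made up for this
example.

```python
import socket
import struct

# Hypothetical wire format for this sketch only:
# a 4-byte big-endian length followed by the raw token bytes.
HEADER = struct.Struct("!I")


def connect_to_successor(host: str, port: int, timeout: float) -> socket.socket:
    """Open a TCP connection to the next node in the ring (hypothetical helper)."""
    sock = socket.create_connection((host, port), timeout=timeout)
    # Disable Nagle's algorithm so a small token is sent immediately.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def send_token(sock: socket.socket, token: bytes) -> None:
    """Send one length-prefixed token; TCP retransmits lost segments."""
    sock.sendall(HEADER.pack(len(token)) + token)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes the connection early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-token")
        buf.extend(chunk)
    return bytes(buf)


def recv_token(sock: socket.socket) -> bytes:
    """Receive one length-prefixed token from the predecessor."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

Because TCP hides packet loss instead of reporting it, the receiving
side would still need its own token timeout to notice a dead node; the
sketch only covers the transmit path.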

At least it solves my case.

Thanks
hj

--
Peakpoint Service
Cluster Setup, Troubleshooting & Development
[email protected]
(303) 997-2823
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
