Re: [Sequoia] controller hangs on broken network

Stefan Lischke Mon, 03 Sep 2007 01:07:50 -0700

Hi Don,

> Everything works fine under load, with JBoss happily hitting
> controller A, which in turn updates the database backend on A. Then I
> unplug the ethernet cable on server B, and everything hangs. JBoss
> stops, controller A stops, logging nothing. Controller B logs a
> warning that controller A has left the cluster. I wait for five
> minutes, nothing happens except a transaction timeout on the JBoss
> server. After ten minutes I plug the ethernet back in, and controller
> A logs this:
>
> 14:21:05,390 INFO           continuent.hedera.gms
> Member(address=/10.0.0.60:49573, uid=10.0.0.60:49573) failed in
>
> and comes back to life, as does JBoss. (However, the cluster remains
> broken -- neither controller sees the other any more.)
That's the expected behavior. After a cluster split, Sequoia waits 120
seconds for the lost node to rejoin. If the node does not rejoin within
these 120 seconds, the cluster is split. Try waiting just 60 seconds
before plugging the cable back in; you will see the cluster work
normally again.


There is no "pinging" logic in Sequoia that checks whether the lost node
is working again (after the 120 seconds). Such logic is very helpful,
but you have to implement it yourself on top of the Sequoia API.
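
A rough Java sketch of what such a check could look like, under stated
assumptions: it uses only the JDK, it treats "a TCP connect to the peer
succeeds" as "the peer is working again", and the class name
PeerRejoinWatcher and the onPeerBack callback are made up for
illustration. No Sequoia API call is shown; whatever has to happen to
bring the lost controller back into the cluster would go into the
callback.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical helper (not part of Sequoia): polls a peer and runs a
 * callback each time the peer goes from unreachable to reachable.
 * Intended to be started once the peer has been reported lost.
 */
class PeerRejoinWatcher {
    private final BooleanSupplier probe;
    private final Runnable onPeerBack;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    // Assumption: the peer is considered down when the watcher is created.
    private boolean lastReachable = false;

    PeerRejoinWatcher(BooleanSupplier probe, Runnable onPeerBack) {
        this.probe = probe;
        this.onPeerBack = onPeerBack;
    }

    /** True if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Builds a probe that checks a fixed host and port over TCP. */
    static BooleanSupplier tcpProbe(String host, int port, int timeoutMs) {
        return () -> isReachable(host, port, timeoutMs);
    }

    /** Runs one probe and fires the callback on a down-to-up transition. */
    synchronized void probeOnce() {
        boolean reachable = probe.getAsBoolean();
        if (reachable && !lastReachable) {
            try {
                onPeerBack.run();
            } catch (RuntimeException e) {
                // Swallow so that a failing callback does not cancel
                // the scheduled polling.
                System.err.println("onPeerBack failed: " + e);
            }
        }
        lastReachable = reachable;
    }

    /** Starts polling every periodSeconds seconds. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(
                this::probeOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Stops polling. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

It would be used roughly as
new PeerRejoinWatcher(PeerRejoinWatcher.tcpProbe(peerHost, peerPort, 2000), callback).start(10),
where peerHost and peerPort are placeholders for an address on which the
other controller accepts TCP connections. A TCP connect only shows that
the network path and the listening process are back, not that the
controller is in a consistent state, so the callback should do its own
checks before any rejoin is attempted.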

hth

Stefan


_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia
