Christopher Gorge A. Marges [15/10/07 14:12]:
...


When we tested running the program without inserting
the "large" files, there was no problem: we tried to
insert 2000 records (just the id field, for example)
and it was fine.  I suspect that the controller chokes
on the large volume of data coming from the other
machine.

Could there be a way to fix this?


Hi Christopher,

Thanks for the detailed report.

We experienced similar issues, though with a different group-communication package. Our conclusions were similar to yours: the controller gets overwhelmed when asked to insert many "large" binary values (we tested with inserts of 1 to 8 MB blobs).
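For reference, a minimal sketch of the kind of load test we ran: repeated large-blob inserts through the controller over plain JDBC. The driver class, JDBC URL, credentials, and table name below are assumptions for illustration; adjust them to your deployment.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BlobLoadTest {
    // Build an in-memory payload of the given size in megabytes.
    static byte[] makeBlob(int megabytes) {
        byte[] data = new byte[megabytes * 1024 * 1024];
        java.util.Arrays.fill(data, (byte) 0x5A); // arbitrary filler byte
        return data;
    }

    public static void main(String[] args) throws Exception {
        // Assumed Sequoia driver class and URL -- check your distribution.
        Class.forName("org.continuent.sequoia.driver.Driver");
        try (Connection c = DriverManager.getConnection(
                "jdbc:sequoia://controller-host/mydb", "user", "secret")) {
            c.setAutoCommit(false);
            // Hypothetical table: files(id INT, data <binary type>)
            try (PreparedStatement ps = c.prepareStatement(
                    "INSERT INTO files (id, data) VALUES (?, ?)")) {
                byte[] blob = makeBlob(8); // 8 MB, the upper end we tested
                for (int i = 0; i < 100; i++) {
                    ps.setInt(1, i);
                    ps.setBytes(2, blob);
                    ps.executeUpdate();
                }
            }
            c.commit();
        }
    }
}
```

With small rows (just an id) the same loop completes without trouble; it is the large payloads that slow the controller down enough to trip the partition detector.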

Either the group-comm subsystem threads (e.g. the group membership service in the controller's JVM), or the whole machine, including the backends and the recovery-log database (you are using collocated backends and an hsqldb recovery log), gets slow enough on network replies to cause a false positive in the network-partition detector. You then get a spurious split-brain.

We fixed one performance bottleneck somewhere in the controller, which made the issue less frequent. However, that apparently was not sufficient.

The usual workaround was to:
1/ not use hsqldb as the recovery log
2/ raise the timeouts on the network-partition detector
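For workaround 1/, a hypothetical sketch of what the virtual-database configuration change looks like: back the recovery log with a standalone database (PostgreSQL here) instead of the in-process hsqldb. The element and attribute names below are from memory and are assumptions; verify them against the DTD shipped with your Sequoia release.

```xml
<!-- Hypothetical fragment: point the recovery log at a standalone
     PostgreSQL instance instead of a collocated hsqldb.
     Element/attribute names are assumptions; check the sequoia DTD. -->
<RecoveryLog driver="org.postgresql.Driver"
             url="jdbc:postgresql://loghost/recoverydb"
             login="recovery"
             password="secret"/>
```

Moving the recovery log off the controller host keeps its disk and CPU load from competing with the group-communication threads under heavy blob traffic.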


Hope this helps.

A+O.
_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia
