On Mon, Mar 1, 2010 at 9:40 PM, Joel Becker <joel.bec...@oracle.com> wrote:
>
> Two nodes is a special and difficult case.  If node0 is still
> heartbeating, node1 thinks it is alive; by the lowest number rule, node1
> resets.  If node0 is not heartbeating (a full crash), node1 will stay
> alive.  As long as node0 is heartbeating, there is no way for node1 to
> know that node0 is having trouble.
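
To check my own understanding of the rules described in this thread (majority first, lowest surviving node number as the tie-breaker), here is a minimal Python sketch. It is not ocfs2 code: the function name has_quorum and its parameters are hypothetical, and the handling of an even split is my reading of the lowest number rule, so treat it as an illustration only.

```python
def has_quorum(me, heartbeating, reachable):
    """Decide whether node `me` keeps running (True) or resets (False).

    Hypothetical model, not the ocfs2 implementation.
    heartbeating: node numbers that `me` sees heartbeating on disk
    reachable:    node numbers that `me` can reach over the network
    """
    # Surviving nodes are the ones visible via heartbeat, plus ourselves.
    alive = set(heartbeating) | {me}
    # Only count network peers that are also alive; we always reach ourselves.
    connected = (set(reachable) | {me}) & alive

    # A strict majority of the surviving nodes always wins.
    if 2 * len(connected) > len(alive):
        return True
    # Strictly less than half of the surviving nodes always loses.
    if 2 * len(connected) < len(alive):
        return False
    # Exactly half (only possible with an even count): the side that
    # contains the lowest surviving node number stays up.
    return min(alive) in connected


if __name__ == "__main__":
    # Two nodes, network link lost, both still heartbeating.
    print("node0 survives:", has_quorum(0, {0, 1}, {0}))
    print("node1 survives:", has_quorum(1, {0, 1}, {1}))
    # Two nodes, node0 powered off (no heartbeat).
    print("node1 survives after node0 crash:", has_quorum(1, {1}, {1}))
    # Three nodes, node2 cut off from the network.
    print("node2 survives when isolated:", has_quorum(2, {0, 1, 2}, {2}))
```

In this model the two-node split always costs node1, while a full crash of node0 leaves node1 running, which matches the description quoted above. With three mounted nodes the isolated node is the one that resets, because the other two form a majority.
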
> If this case presents a significant problem, just add a third
> node.  Once there are three nodes, you always have a majority, which
> takes precedence over the lowest number.
>
>> What is the node with the lowest number? Does it have to be Node0? Or
>> does it mean connectivity to the lowest surviving node?
>
> Here it is specifically talking about surviving nodes; these are
> the nodes visible via heartbeat.  Any node not heartbeating is
> considered dead.  So if node0 is turned off, and node1 is heartbeating,
> node1 is considered the lowest surviving node.
>
>> I set up a test scenario with 4 nodes, 2 nodes mounting the filesystems
>> and 2 other nodes just participating as network members:
>
> For the purposes of ocfs2, nodes that are not mounted are
> invisible.  Only once they mount the filesystem and start heartbeating
> do they take part in quorum.
>
> For your scenario, you essentially have a two-node quorum as
> described above.  Nodes 3&4 don't participate.

Then I believe the quorum rules in the documentation/FAQ should be
updated with this info.

>> During my test (taking Node0 down cold turkey) Node1 hung pretty badly.
>> Is this something expected?
>
> What did you do to take it down?  Power off?  Node1 should take
> around 90 seconds to notice (depending on your heartbeat timeout
> settings), and then it should start recovery.

I flipped the power off; in almost every test Node1 crashes as well.

I don't understand why you don't have plans to add a reference IP
address to find out who's on the network and who isn't. You have a point
that adding a third node won't break the bank if we're already using
RAC/SAP, unless we're required to get a license for that node anyway.
Running a node in idle mode seems a little wasteful, but if that solves
the problem, good. I'll give it a shot today.

thanks,

esv.

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users