On Sat, Apr 18, 2009 at 07:49:12AM +0200, Dietmar Maurer wrote:
> > > like a 'merge' function? Seems the algorithm for checkpoint recovery
> > > always uses the state from the node with the lowest processor id?
> > >
> > Yes that is right.
> 
> So if I have the following cluster:
> 
> Part1: node2 node3 node4
> Part2: node1
> 
> Let's assume Part1 has been running for some time and has gathered some
> state in checkpoints. Part2 is just the newly started node1.
> 
> So when node1 starts up, the whole cluster uses the empty checkpoint from
> node1? (I guess I am confused somehow.)

It is *not* as simple as "the node with the lowest nodeid".  It is "the node
with the lowest nodeid among those where the state exists".  When selecting
the node that sends state to the others, you obviously need to select among
the nodes that have the state :-)

In the dlm_controld example I mentioned earlier, the function called
set_plock_ckpt_node() picks the node that will save state in the ckpt:

        list_for_each_entry(memb, &cg->members, list) {
                if (!(memb->start_flags & DLM_MFLG_HAVEPLOCK))
                        continue;

                if (!low || memb->nodeid < low)
                        low = memb->nodeid;
        }

Only nodes that have state will have the DLM_MFLG_HAVEPLOCK flag set; new
nodes just added by a confchg will not have that flag set.
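
To make the selection rule concrete for the Part1/Part2 case above, here is a
minimal standalone sketch of the same loop over a plain array.  The struct,
the function name pick_ckpt_node() and the numeric flag value are assumptions
made for illustration only; the real code walks cg->members with
list_for_each_entry() and uses the definitions in dlm_controld.

```c
#include <stddef.h>

/* Assumed flag value, for illustration only; see dlm_controld for the
 * real definition. */
#define DLM_MFLG_HAVEPLOCK 2

/* Hypothetical, simplified stand-in for a cluster member entry. */
struct member {
        int nodeid;
        unsigned int start_flags;
};

/* Return the lowest nodeid among members that already hold plock state,
 * or 0 if no member holds any (nodeids are assumed to be nonzero). */
static int pick_ckpt_node(const struct member *m, size_t count)
{
        int low = 0;
        size_t i;

        for (i = 0; i < count; i++) {
                /* Skip nodes without state, e.g. ones that just joined. */
                if (!(m[i].start_flags & DLM_MFLG_HAVEPLOCK))
                        continue;

                if (!low || m[i].nodeid < low)
                        low = m[i].nodeid;
        }
        return low;
}
```

With node1 carrying no flag and node2, node3 and node4 carrying
DLM_MFLG_HAVEPLOCK, this sketch returns 2, so the empty state on the newly
started node1 is never the one chosen.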

Dave

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
