Thank you Kanak and Kishore! I will try enforcing the per-partition constraint and let you know if somehow it does not work. I was looking at the throttling documentation, but somehow missed that a per-partition constraint was an option!
Regards,
Vlad

On Thu, Apr 3, 2014 at 5:42 PM, kishore g <[email protected]> wrote:

> Hi Vlad,
>
> You can try setting the transition priority order and a constraint that
> there should be only one transition per partition across the cluster.
>
> So the transition priority could be something like
>
> Slave->Master
> Offline->Bootstrap
> Bootstrap->Slave
> Slave->Master
>
> For the rest, I'm not sure the order matters.
>
> Also set the max transitions constraint to 1 per partition.
>
> The reason I put Slave->Master before Offline->Bootstrap is to ensure that
> availability is given more importance. For example, say you have 3 nodes:
> N1, N2, and N3. N1 is Master, N2 is Slave, and N3 is down. If N1 goes down
> and N3 comes up at the same time, we probably don't want to wait for N3 to
> bootstrap before promoting N2 to Master.
>
> I haven't tested this, but assuming the constraint enforcement works, this
> should do the trick.
>
> Does this make sense? Let me know if this does not work; we can add a test
> case.
>
> thanks,
> Kishore G
>
> On Thu, Apr 3, 2014 at 4:57 PM, [email protected] <[email protected]> wrote:
>
>> Dear all,
>>
>> I am trying to construct a state model with the following transition
>> diagram:
>>
>> OFFLINE -> BOOTSTRAPPING <---> SLAVE <-----> MASTER
>> <-----------------------------------
>>
>> That is, an offline node can go into a bootstrapping state, from the
>> bootstrap state it can go into a slave state, from slave it can go to
>> master, from master to slave, and from slave it can go offline.
>>
>> Assume that I have two nodes, pf1 and pf2, and a partition partition_0
>> with the following ideal state:
>>
>> partition_0: pf2: MASTER pf1: SLAVE,
>>
>> and that currently pf1 is serving as a master.
>> When pf2 boots, Helix will issue, almost simultaneously, two commands:
>> for pf1: transition from MASTER to SLAVE
>> for pf2: transition from BOOTSTRAPPING to SLAVE
>>
>> My understanding is that this happens because Helix is trying to execute
>> as many commands in parallel as possible, and because the last state has
>> pf2 as master. However, the transition from BOOTSTRAPPING to SLAVE for
>> pf2 involves a long data copy step, so I would like to keep pf1 as a
>> master in the meantime. I tried prioritizing the transition from
>> BOOTSTRAPPING to SLAVE over the transition from MASTER to SLAVE; however,
>> Helix still issues them in parallel (as it should).
>>
>> I was wondering what my options would be in order to keep the master up
>> while the future master is bootstrapping. Could throttling of the number
>> of transitions be enforced at the partition level? Could I somehow
>> specify that a state with a slave and a bootstrapping node is
>> undesirable?
>>
>> As a note, I have also looked at the RSync-replicated filesystem example.
>> The reason for not using the OfflineOnline or the MasterSlave model in my
>> application is that I would like the bootstrapping node to receive
>> updates from clients, i.e. be visible during the bootstrap. For this
>> reason, I am introducing the new BOOTSTRAPPING phase in between OFFLINE
>> and SLAVE.
>>
>> Regards,
>> Vlad
>>
>> PS: The state model definition is as follows:
>>
>> builder.addState(MASTER, 1);
>> builder.addState(SLAVE, 2);
>> builder.addState(BOOTSTRAP, 3);
>> builder.addState(OFFLINE);
>> builder.addState(DROPPED);
>>
>> // Set the initial state when the node starts
>> builder.initialState(OFFLINE);
>>
>> // Add transitions between the states.
>> builder.addTransition(OFFLINE, BOOTSTRAP, 4);
>> builder.addTransition(BOOTSTRAP, SLAVE, 5);
>> builder.addTransition(SLAVE, MASTER, 6);
>> builder.addTransition(MASTER, SLAVE, 3);
>> builder.addTransition(SLAVE, OFFLINE, 2);
>> builder.addTransition(OFFLINE, DROPPED, 1);
>>
>> // set constraints on states.
>> // static constraint
>> builder.upperBound(MASTER, 1);
>> // dynamic constraint, R means it should be derived based on the
>> // replication factor.
>> builder.dynamicUpperBound(SLAVE, "R");
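
To make the suggestion in this thread concrete, here is a minimal, self-contained sketch of the idea: sort the pending transitions by priority and let at most one transition per partition through in each round. It is an illustration only and does not use or reproduce Helix's controller code. The class PartitionThrottleSketch, the Transition type, the method selectForThisRound, and the numeric priorities (lower value means more urgent, with Master->Slave placed last) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of per-partition throttling; not Helix code. */
class PartitionThrottleSketch {

    /** A pending state transition for one replica of one partition. */
    static final class Transition {
        final String partition;
        final String node;
        final String from;
        final String to;
        final int priority; // assumption of this sketch: lower value = more urgent

        Transition(String partition, String node, String from, String to, int priority) {
            this.partition = partition;
            this.node = node;
            this.from = from;
            this.to = to;
            this.priority = priority;
        }

        @Override
        public String toString() {
            return node + ":" + from + "->" + to;
        }
    }

    /** Picks at most one transition per partition, most urgent first. */
    static List<Transition> selectForThisRound(List<Transition> pending) {
        List<Transition> sorted = new ArrayList<>(pending);
        // List.sort is stable, so equal priorities keep their submission order.
        sorted.sort((a, b) -> Integer.compare(a.priority, b.priority));

        Set<String> busyPartitions = new HashSet<>();
        List<Transition> selected = new ArrayList<>();
        for (Transition t : sorted) {
            // Set.add returns false if the partition already has a transition this round.
            if (busyPartitions.add(t.partition)) {
                selected.add(t);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // The scenario from the original question: pf2 has just booted.
        List<Transition> pending = new ArrayList<>();
        pending.add(new Transition("partition_0", "pf1", "MASTER", "SLAVE", 4));
        pending.add(new Transition("partition_0", "pf2", "BOOTSTRAP", "SLAVE", 3));

        // Only the bootstrap step is released; pf1 stays MASTER for now.
        System.out.println(selectForThisRound(pending)); // prints [pf2:BOOTSTRAP->SLAVE]
    }
}
```

In this sketch the MASTER-to-SLAVE demotion of pf1 is simply held back until a later round, after the long BOOTSTRAP-to-SLAVE copy on pf2 has finished, which is the behavior the original question asks for. Whether Helix's constraint enforcement behaves the same way is exactly what the thread notes has not been tested yet.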
