Yes, we can definitely help in reviewing the patch.

thanks,
Kishore G
On Sun, May 17, 2015 at 9:46 PM, Hang Qi <[email protected]> wrote:

> Hi Kishore,
>
> Thanks, should I go ahead and create a JIRA issue, add a test case, and
> propose a patch for the fix?
>
> Thanks
> Hang Qi
>
> On Sun, May 17, 2015 at 6:19 AM, kishore g <[email protected]> wrote:
>
>> Got it, that should be fixed. Would be great to get a patch to fix it.
>> Good find.
>>
>> Thanks
>> Kishore G
>>
>> On May 16, 2015 11:50 PM, "Hang Qi" <[email protected]> wrote:
>>
>>> Hi Kishore,
>>>
>>> Thanks for your reply.
>>>
>>> I am not saying I want Offline->Slave to have higher priority than
>>> Slave->Master. I agree with you: one master is more important than two
>>> slaves, and that ordering only applies within one partition. What I am
>>> saying is that while p0, p1, p2 are doing their Offline->Slave
>>> transitions on node A, I also want p3, p4, p5 to perform Offline->Slave
>>> transitions on node B at the same time, rather than waiting until p0,
>>> p1, p2 become Master on node A before any transitions start on node B;
>>> that waiting is wasted time.
>>>
>>> The reason for allowing only one transition per partition at a time is
>>> summarized in the following thread.
>>>
>>> http://mail-archives.apache.org/mod_mbox/helix-user/201503.mbox/%3CCAJ2%3DoXxBWF1VoCm%3DjjyhuFCWHuxw3wYPotGz8VRkEnzVhrmgwQ%40mail.gmail.com%3E
>>>
>>> Thanks
>>> Hang Qi
>>>
>>> On Sat, May 16, 2015 at 8:23 PM, kishore g <[email protected]> wrote:
>>>
>>>> Thanks Hang for the detailed explanation.
>>>>
>>>> Before the MessageSelectionStage, there is a stage that orders the
>>>> messages according to the state transition priority list. I think
>>>> Slave-Master always has higher priority than Offline-Slave, which makes
>>>> sense because in general having a master is probably more important
>>>> than having two slaves.
>>>>
>>>> Can you provide the state transition priority list in your state model
>>>> definition? If you think it is important to get node B to the Slave
>>>> state before promoting node A from Slave to Master, you can change the
>>>> priority order. Note: this can be changed dynamically and does not
>>>> require restarting the servers.
>>>>
>>>> Another question: what is the reason for constraint #2, i.e. only one
>>>> transition per partition at a time?
>>>>
>>>> thanks,
>>>> Kishore G
>>>>
>>>> On Sat, May 16, 2015 at 4:48 PM, Hang Qi <[email protected]> wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>> We found a very strange behavior in the controller's message
>>>>> throttling when there are multiple constraints. Here is our setup (we
>>>>> are using helix-0.6.4, only one resource):
>>>>>
>>>>> - constraint 1: per-node constraint, we only allow 3 state transitions
>>>>> to happen on one node concurrently.
>>>>> - constraint 2: per-partition constraint, we define the state
>>>>> transition priorities in the state model, and only allow one state
>>>>> transition to happen on a single partition concurrently.
>>>>>
>>>>> We are using the MasterSlave state model. Suppose we have two nodes A
>>>>> and B, each with 8 partitions (p0-p7), and initially both A and B are
>>>>> shut down; now we start them at the same time (say A is slightly
>>>>> earlier than B).
>>>>>
>>>>> The expected behavior might be:
>>>>>
>>>>> 1. p0, p1, p2 on A start from Offline -> Slave; p3, p4, p5 on B start
>>>>> from Offline -> Slave.
>>>>>
>>>>> But the real result is:
>>>>>
>>>>> 1. p0, p1, p2 on A start from Offline -> Slave; nothing happens on B.
>>>>> 2. Only once p0, p1, p2 have all transitioned to the Master state do
>>>>> p3, p4, p5 on A start from Offline -> Slave, and p0, p1, p2 on B start
>>>>> from Offline -> Slave.
>>>>>
>>>>> As the Offline -> Slave step might take a long time, this behavior
>>>>> results in a very long time to bring up these two nodes (long down
>>>>> time results in long catch-up time as well), though ideally we should
>>>>> not take both nodes down at the same time.
>>>>>
>>>>> Looking at the controller code, the stage- and pipeline-based
>>>>> implementation is well designed, very easy to understand and to reason
>>>>> about.
>>>>>
>>>>> The logic of MessageThrottleStage#throttle:
>>>>>
>>>>> 1. it goes through each message selected by MessageSelectionStage,
>>>>> 2. for each message, it goes through all matched constraints and
>>>>> decreases the quota of each constraint,
>>>>>    1. if any constraint's quota drops below 0, the message is marked
>>>>>    as throttled.
>>>>>
>>>>> I think there is something wrong here: the message takes the quota of
>>>>> the constraints even if it is not going to be sent out (throttled).
>>>>> That explains our case:
>>>>>
>>>>> - all the messages have been generated at the beginning: (p0, A,
>>>>> Offline->Slave), ..., (p7, A, Offline->Slave), (p0, B, Offline->Slave),
>>>>> ..., (p7, B, Offline->Slave)
>>>>> - in MessageThrottleStage#throttle:
>>>>>   - (p0, A, Offline->Slave), (p1, A, Offline->Slave), (p2, A,
>>>>>   Offline->Slave) are good, and constraint 1 on A reaches 0;
>>>>>   constraint 2 on p0, p1, p2 reaches 0 as well
>>>>>   - (p3, A, Offline->Slave), ..., (p7, A, Offline->Slave) are
>>>>>   throttled by constraint 1 on A, but also take the quota of
>>>>>   constraint 2 on those partitions
>>>>>   - (p0, B, Offline->Slave), ..., (p7, B, Offline->Slave) are
>>>>>   throttled by constraint 2
>>>>>   - thus only (p0, A, Offline->Slave), (p1, A, Offline->Slave),
>>>>>   (p2, A, Offline->Slave) are sent out by the controller.
>>>>>
>>>>> Does that make sense, or is there anything else you can think of that
>>>>> would result in this unexpected behavior? And is there any workaround
>>>>> for it? One thing that comes to mind is to update constraint 2 so that
>>>>> only one state transition is allowed per partition for certain state
>>>>> transitions.
>>>>>
>>>>> Thanks very much.
>>>>>
>>>>> Thanks
>>>>> Hang Qi
>>>>>
>>>
>>> --
>>> Qi hang
>>>
>
> --
> Qi hang
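A minimal sketch of how constraint 1 (at most 3 concurrent transitions per
node) and constraint 2 (at most 1 concurrent transition per partition) might
be registered through the helix-0.6.x admin API. The cluster name, ZooKeeper
address, and constraint ids are illustrative assumptions, and the attribute
names should be verified against your deployment:

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.ClusterConstraints.ConstraintType;
import org.apache.helix.model.builder.ConstraintItemBuilder;

public class ConstraintSetup {
  public static void main(String[] args) {
    // Placeholder ZK address and cluster name.
    HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
    String cluster = "MyCluster";

    // Constraint 1: at most 3 concurrent state transitions per instance.
    ConstraintItemBuilder perNode = new ConstraintItemBuilder();
    perNode.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION");
    perNode.addConstraintAttribute("INSTANCE", ".*");
    perNode.addConstraintAttribute("CONSTRAINT_VALUE", "3");
    admin.setConstraint(cluster, ConstraintType.MESSAGE_CONSTRAINT,
        "perNodeTransitionLimit", perNode.build());

    // Constraint 2: at most 1 concurrent state transition per partition.
    ConstraintItemBuilder perPartition = new ConstraintItemBuilder();
    perPartition.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION");
    perPartition.addConstraintAttribute("PARTITION", ".*");
    perPartition.addConstraintAttribute("CONSTRAINT_VALUE", "1");
    admin.setConstraint(cluster, ConstraintType.MESSAGE_CONSTRAINT,
        "perPartitionTransitionLimit", perPartition.build());
  }
}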
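On Kishore's point about the priority list: the Slave-Master vs. Offline-Slave
ordering comes from the transition priorities declared in the state model
definition, which can be updated without restarting the servers. A sketch of
how a MasterSlave-style definition might declare those priorities, built from
memory of the 0.6.x builder API and not necessarily identical to the built-in
definition:

import org.apache.helix.model.StateModelDefinition;

public class MasterSlavePriorities {
  public static StateModelDefinition build() {
    StateModelDefinition.Builder builder =
        new StateModelDefinition.Builder("MasterSlave");

    // State priorities: a lower number means a higher priority.
    builder.addState("MASTER", 1);
    builder.addState("SLAVE", 2);
    builder.addState("OFFLINE", 3);
    builder.addState("DROPPED", 4);
    builder.initialState("OFFLINE");

    // Transition priorities: SLAVE->MASTER is ordered ahead of
    // OFFLINE->SLAVE, which is why a pending master promotion is scheduled
    // before new OFFLINE->SLAVE transitions on other partitions.
    builder.addTransition("SLAVE", "MASTER", 1);
    builder.addTransition("OFFLINE", "SLAVE", 2);
    builder.addTransition("MASTER", "SLAVE", 3);
    builder.addTransition("SLAVE", "OFFLINE", 4);
    builder.addTransition("OFFLINE", "DROPPED", 5);

    // Placement bounds: one MASTER per partition, remaining replicas SLAVE.
    builder.upperBound("MASTER", 1);
    builder.dynamicUpperBound("SLAVE", "R");
    return builder.build();
  }
}

The resulting definition can then be registered (or re-registered with a
different transition ordering) through HelixAdmin#addStateModelDef.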
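The ordering problem Hang describes in MessageThrottleStage#throttle can be
shown with a simplified sketch (this is not the actual Helix source):
charging every matching constraint before deciding whether a message is
throttled lets throttled messages consume quota, whereas checking all
constraints first and charging only messages that will really be sent avoids
the cascade that starves node B.

import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for Helix's message and constraint model.
class Message {
  final String partition, instance, transition;
  Message(String partition, String instance, String transition) {
    this.partition = partition;
    this.instance = instance;
    this.transition = transition;
  }
}

interface Constraint {
  boolean matches(Message msg);   // does this constraint apply to the message?
  int remainingQuota();           // quota left for this constraint
  void charge(Message msg);       // consume one unit of quota
}

class ThrottleSketch {
  // Buggy shape (what the thread describes): quota is charged while scanning,
  // so a message throttled by one constraint (e.g. the per-node limit on A)
  // still consumes the per-partition quota and blocks that partition on B.
  static List<Message> throttleBuggy(List<Message> messages,
                                     List<Constraint> constraints) {
    List<Message> toSend = new ArrayList<Message>();
    for (Message msg : messages) {
      boolean throttled = false;
      for (Constraint c : constraints) {
        if (c.matches(msg)) {
          c.charge(msg);                 // quota taken unconditionally
          if (c.remainingQuota() < 0) {
            throttled = true;            // ...even though msg is then dropped
          }
        }
      }
      if (!throttled) {
        toSend.add(msg);
      }
    }
    return toSend;
  }

  // Fixed shape: check every matching constraint first; charge quota only for
  // messages that will actually be sent out.
  static List<Message> throttleFixed(List<Message> messages,
                                     List<Constraint> constraints) {
    List<Message> toSend = new ArrayList<Message>();
    for (Message msg : messages) {
      boolean throttled = false;
      for (Constraint c : constraints) {
        if (c.matches(msg) && c.remainingQuota() <= 0) {
          throttled = true;
          break;
        }
      }
      if (!throttled) {
        for (Constraint c : constraints) {
          if (c.matches(msg)) {
            c.charge(msg);
          }
        }
        toSend.add(msg);
      }
    }
    return toSend;
  }
}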
