Got it, that should be fixed. Would be great to get a patch to fix it. Good find.

Thanks
Kishore G
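(For concreteness, below is a rough toy sketch of the accounting change such a patch would presumably make: charge a constraint's quota only for messages that are actually admitted. This is simplified, illustrative code, not the actual MessageThrottleStage implementation; the types and map structures are made up.)

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    // Toy model of constraint-quota accounting in a message-throttling stage.
    // "matchedConstraints" maps each message to the names of the constraints it
    // matches; "quota" holds the remaining allowance of each constraint. Both
    // maps are assumed to contain entries for every message and constraint.
    public class ThrottleSketch {

      public static List<String> throttle(List<String> messages,
                                          Map<String, List<String>> matchedConstraints,
                                          Map<String, Integer> quota) {
        List<String> approved = new ArrayList<String>();
        for (String msg : messages) {
          List<String> matched = matchedConstraints.get(msg);

          // Check whether every matched constraint still has quota left.
          boolean hasQuota = true;
          for (String c : matched) {
            if (quota.get(c) <= 0) {
              hasQuota = false;
              break;
            }
          }

          if (!hasQuota) {
            // Throttled: charge nothing. The behavior described in the thread
            // below charges every message, so transitions throttled by the
            // per-node constraint on node A still consume the per-partition
            // quota that the corresponding transitions on node B need.
            continue;
          }

          // Admitted: only now decrement the quota of each matched constraint.
          for (String c : matched) {
            quota.put(c, quota.get(c) - 1);
          }
          approved.add(msg);
        }
        return approved;
      }
    }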
On May 16, 2015 11:50 PM, "Hang Qi" <[email protected]> wrote:

> Hi Kishore,
>
> Thanks for your reply.
>
> I am not saying that I want Offline->Slave to have higher priority than
> Slave->Master. I agree with you that one master is more important than two
> slaves, and that comparison only applies within a single partition anyway.
> What I am saying is that while p0, p1, p2 are doing the Offline->Slave
> transition on node A, I also want p3, p4, p5 to perform the Offline->Slave
> transition on node B at the same time, rather than waiting until p0, p1, p2
> have become Master on node A before any transitions start on node B; that is
> kind of wasteful.
>
> The reason to have one transition per partition at a time is summarized in
> the following thread:
>
> http://mail-archives.apache.org/mod_mbox/helix-user/201503.mbox/%3CCAJ2%3DoXxBWF1VoCm%3DjjyhuFCWHuxw3wYPotGz8VRkEnzVhrmgwQ%40mail.gmail.com%3E
>
> Thanks
> Hang Qi
>
> On Sat, May 16, 2015 at 8:23 PM, kishore g <[email protected]> wrote:
>
>> Thanks Hang for the detailed explanation.
>>
>> Before the MessageSelectionStage, there is a stage that orders the messages
>> according to the state transition priority list. I think Slave-Master is
>> always higher priority than Offline-Slave, which makes sense because in
>> general having a master is probably more important than having two slaves.
>>
>> Can you provide the state transition priority list in your state model
>> definition? If you think it is important to get node B to the Slave state
>> before promoting node A from Slave to Master, you can change the priority
>> order. Note: this can be changed dynamically and does not require restarting
>> the servers.
>>
>> Another question: what is the reason to have constraint #2, i.e. only one
>> transition per partition at a time?
>>
>> thanks,
>> Kishore G
>>
>> On Sat, May 16, 2015 at 4:48 PM, Hang Qi <[email protected]> wrote:
>>
>>> Hi folks,
>>>
>>> We found a very strange behavior in the controller's message throttling
>>> when there are multiple constraints. Here is our setup (we are using
>>> helix-0.6.4, with only one resource):
>>>
>>> - constraint 1: per-node constraint, we only allow 3 state transitions to
>>> happen on one node concurrently.
>>> - constraint 2: per-partition constraint, we define the state transition
>>> priorities in the state model, and only allow one state transition on a
>>> single partition at a time.
>>>
>>> We are using the MasterSlave state model. Suppose we have two nodes A and
>>> B, each with 8 partitions (p0-p7), and initially both A and B are shut
>>> down; now we start them at the same time (say A is slightly earlier
>>> than B).
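(Constraints like these are typically registered through the Helix admin API. The snippet below is a rough sketch of how constraints 1 and 2 above could be expressed with ConstraintItemBuilder; the ZooKeeper address, cluster name, and attribute values are illustrative guesses, not taken from this thread.)

    import org.apache.helix.HelixAdmin;
    import org.apache.helix.manager.zk.ZKHelixAdmin;
    import org.apache.helix.model.ClusterConstraints.ConstraintType;
    import org.apache.helix.model.builder.ConstraintItemBuilder;

    public class ConstraintSetup {
      public static void main(String[] args) {
        HelixAdmin admin = new ZKHelixAdmin("localhost:2181"); // illustrative ZK address
        String cluster = "MyCluster";                          // illustrative cluster name

        // Constraint 1: at most 3 concurrent state transitions per instance.
        ConstraintItemBuilder perNode = new ConstraintItemBuilder();
        perNode.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION");
        perNode.addConstraintAttribute("INSTANCE", ".*");
        perNode.addConstraintAttribute("CONSTRAINT_VALUE", "3");
        admin.setConstraint(cluster, ConstraintType.MESSAGE_CONSTRAINT,
            "constraint1", perNode.build());

        // Constraint 2: at most 1 concurrent state transition per partition.
        ConstraintItemBuilder perPartition = new ConstraintItemBuilder();
        perPartition.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION");
        perPartition.addConstraintAttribute("PARTITION", ".*");
        perPartition.addConstraintAttribute("CONSTRAINT_VALUE", "1");
        admin.setConstraint(cluster, ConstraintType.MESSAGE_CONSTRAINT,
            "constraint2", perPartition.build());
      }
    }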
>>> The expected behavior would be:
>>>
>>> 1. p0, p1, p2 on A start Offline -> Slave; p3, p4, p5 on B start
>>> Offline -> Slave.
>>>
>>> But the real result is:
>>>
>>> 1. p0, p1, p2 on A start Offline -> Slave; nothing happens on B.
>>> 2. only after p0, p1, p2 have all transitioned to the Master state do p3,
>>> p4, p5 on A start Offline -> Slave and p0, p1, p2 on B start
>>> Offline -> Slave.
>>>
>>> As the Offline -> Slave step can take a long time, this behavior results
>>> in a very long time to bring up these two nodes (long downtime results in
>>> a long catch-up time as well), though ideally we should not have let both
>>> nodes be down at the same time.
>>>
>>> Looking at the controller code, the stage- and pipeline-based
>>> implementation is well designed, very easy to understand and to reason
>>> about.
>>>
>>> The logic of MessageThrottleStage#throttle is:
>>>
>>> 1. it goes through each message selected by MessageSelectionStage,
>>> 2. for each message, it goes through all matched constraints and decreases
>>> the quota of each constraint;
>>>    1. if any constraint's quota drops below 0, the message is marked as
>>> throttled.
>>>
>>> I think there is something wrong here: the message consumes the quota of
>>> its constraints even if it is not going to be sent out (throttled). That
>>> explains our case:
>>>
>>> - all the messages have been generated at the beginning: (p0, A,
>>> Offline->Slave), ..., (p7, A, Offline->Slave), (p0, B, Offline->Slave),
>>> ..., (p7, B, Offline->Slave)
>>> - in MessageThrottleStage#throttle:
>>>    - (p0, A, Offline->Slave), (p1, A, Offline->Slave), (p2, A,
>>> Offline->Slave) are fine, and constraint 1 on A reaches 0, as does
>>> constraint 2 on p0, p1, p2
>>>    - (p3, A, Offline->Slave), ..., (p7, A, Offline->Slave) are throttled
>>> by constraint 1 on A, but still consume the quota of constraint 2 on those
>>> partitions
>>>    - (p0, B, Offline->Slave), ..., (p7, B, Offline->Slave) are throttled
>>> by constraint 2
>>>    - thus only (p0, A, Offline->Slave), (p1, A, Offline->Slave), (p2, A,
>>> Offline->Slave) have been sent out by the controller.
>>>
>>> Does that make sense, or is there anything else you can think of that
>>> could result in this unexpected behavior? And is there any workaround for
>>> it? One thing that comes to mind is to update constraint 2 so that the
>>> one-transition-per-partition limit applies only to certain state
>>> transitions.
>>>
>>> Thanks very much.
>>>
>>> Thanks
>>> Hang Qi
>>
>
> --
> Qi hang
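(On the workaround floated at the end: constraint 2 could be narrowed with a TRANSITION attribute so the one-transition-per-partition limit applies only to a specific transition, e.g. Slave->Master, and Offline->Slave messages for the same partition on different nodes are no longer serialized against each other. This is a sketch only; the attribute values and the exact TRANSITION string format are assumptions that would need checking against the Helix version in use.)

    import org.apache.helix.HelixAdmin;
    import org.apache.helix.manager.zk.ZKHelixAdmin;
    import org.apache.helix.model.ClusterConstraints.ConstraintType;
    import org.apache.helix.model.builder.ConstraintItemBuilder;

    public class NarrowPerPartitionConstraint {
      public static void main(String[] args) {
        HelixAdmin admin = new ZKHelixAdmin("localhost:2181"); // illustrative ZK address
        String cluster = "MyCluster";                          // illustrative cluster name

        // Scope the per-partition limit to a single transition type so that
        // Offline->Slave transitions for the same partition on different nodes
        // are not throttled against each other.
        ConstraintItemBuilder builder = new ConstraintItemBuilder();
        builder.addConstraintAttribute("MESSAGE_TYPE", "STATE_TRANSITION");
        builder.addConstraintAttribute("TRANSITION", "SLAVE-MASTER"); // assumed format
        builder.addConstraintAttribute("PARTITION", ".*");
        builder.addConstraintAttribute("CONSTRAINT_VALUE", "1");
        admin.setConstraint(cluster, ConstraintType.MESSAGE_CONSTRAINT,
            "constraint2", builder.build());
      }
    }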
