Great, thank you for the prompt reply.

Thanks,
Bo

On Tue, Jan 23, 2018 at 1:47 PM, kishore g <[email protected]> wrote:
>
> 1. Yes, you can set the max-transitions constraint at the per-partition,
> per-instance, or per-resource scope. There is a HelixAdmin API to set the
> constraint; I don't have it handy.
>
> 2. Yes, Helix will send OFFLINE->SLAVE transitions for all partitions that
> were on the host and are still present in the IdealState. If a partition
> has been removed from the IdealState, it will send an OFFLINE->DROPPED
> transition instead.
>
> 3. Right. Expiry is handled the same as a restart. The only difference is
> that with expiry, Helix calls the reset() method on the state model, where
> one can plug in custom behavior.
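
For reference, the constraint mentioned in point 1 above can be set through
HelixAdmin roughly as sketched below. This is an untested sketch: the
attribute names and the ".*" wildcard scoping should be verified against the
Helix release in use, and the ZK address, cluster name ("MyCluster"),
resource name ("MyDB"), and constraint id are placeholders.

    import org.apache.helix.HelixAdmin;
    import org.apache.helix.manager.zk.ZKHelixAdmin;
    import org.apache.helix.model.ClusterConstraints.ConstraintAttribute;
    import org.apache.helix.model.ClusterConstraints.ConstraintType;
    import org.apache.helix.model.builder.ConstraintItemBuilder;

    public class MaxTransitionConstraintSketch {
      public static void main(String[] args) {
        HelixAdmin admin = new ZKHelixAdmin("localhost:2181"); // placeholder ZK address

        // Allow at most one in-flight STATE_TRANSITION message per partition of
        // the resource "MyDB"; scoping by INSTANCE or RESOURCE works the same way.
        ConstraintItemBuilder builder = new ConstraintItemBuilder()
            .addConstraintAttribute(ConstraintAttribute.MESSAGE_TYPE.toString(), "STATE_TRANSITION")
            .addConstraintAttribute(ConstraintAttribute.RESOURCE.toString(), "MyDB")
            .addConstraintAttribute(ConstraintAttribute.PARTITION.toString(), ".*")
            .addConstraintAttribute(ConstraintAttribute.CONSTRAINT_VALUE.toString(), "1");

        admin.setConstraint("MyCluster", ConstraintType.MESSAGE_CONSTRAINT,
            "onePendingTransitionPerPartition", builder.build());
      }
    }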

> On Tue, Jan 23, 2018 at 11:57 AM, Bo Liu <[email protected]> wrote:
>
>> Thanks Kishore & Lei!
>>
>> It's a good point to rely on the data in a local partition to decide if a
>> bootstrap is needed or catching up is good enough.
>>
>> A few more questions.
>>
>> 1. Is there a way to allow at most one transition for a partition at a
>> time? During a state transition, a participant needs to set up proper
>> replication upstream for itself (in the case where it is transitioning to
>> Slave) or for other replicas (in the case where it is transitioning to
>> Master). So the participant needs to learn the ip:port of the other
>> replicas in the cluster. Allowing no concurrent transitions for a
>> partition would make this much easier.
>>
>> 2. When a participant restarts, I assume it will connect to ZK with a new
>> session id. With DelayedAutoRebalancer, Helix will not move replicas away
>> from the participant, but it will promote some Slave replicas on other
>> hosts to be the new Masters. Once the restarted host is back, will Helix
>> send "OFFLINE -> SLAVE" transition requests to it for all the partitions
>> that were on this participant before the restart?
>>
>> 3. When the ZK session is expired on a participant (no restart), Helix
>> will behave the same way, i.e., send "OFFLINE->SLAVE" for all partitions
>> to the participant once it reconnects to ZK, right?
>>
>> Thanks,
>> Bo
>>
>> On Tue, Jan 23, 2018 at 10:39 AM, kishore g <[email protected]> wrote:
>>
>>> Relying on Helix reusing the same StateModel instance might make the
>>> model too rigid and tied to the current implementation in Helix. Let's
>>> not expose that to the clients.
>>>
>>> Helix internally carries over the previous partition assignment during
>>> startup but sets the state to the initial state (OFFLINE in this case)
>>> by default. If the client really needs to know what the previous state
>>> was, we can provide a hook for the client to compute the initial state.
>>> In any case, let's hear more from Bo before making any changes.
>>>
>>> On Tue, Jan 23, 2018 at 9:19 AM, Lei Xia <[email protected]> wrote:
>>>
>>>> Hi, Bo
>>>>
>>>> As Kishore commented, your OFFLINE->SLAVE state transition callback
>>>> needs some logic to determine whether a bootstrap or a catch-up is
>>>> needed to bring a replica to Slave. A common way is to persist the data
>>>> version of a local partition somewhere and, during OFFLINE->SLAVE,
>>>> compare the local version (if there is one) with the current Master's
>>>> version to determine whether a bootstrap (if the local version is null
>>>> or too old) or a catch-up is needed.
>>>>
>>>> There is one more difference in how Helix handles a participant restart
>>>> vs. a ZK session loss. When a participant starts (or restarts), it
>>>> creates a new StateModel for each partition (by calling
>>>> createNewStateModel() in your StateModelFactory). However, if a
>>>> participant loses its ZK session and comes back (with a new session),
>>>> it will reuse the StateModels for the partitions that were already
>>>> there instead of creating new ones. You may leverage this to tell
>>>> whether a participant has been restarted or has just re-established its
>>>> ZK connection.
>>>>
>>>> In addition, the delayed feature in DelayedAutoRebalancer is a little
>>>> different from what you may expect. When you lose a participant (e.g.,
>>>> it crashed or is down for maintenance), you lose one replica of some
>>>> partitions. In this situation, Helix would usually bring up a new
>>>> replica on some other live node immediately to maintain the required
>>>> replica count. However, this may hurt performance, since bringing up a
>>>> new replica can require a data bootstrap on the new node. If you expect
>>>> the original participant to come back online soon, and you can tolerate
>>>> losing one or more replicas in the short term, then you can set a delay
>>>> time; Helix will not bring up a new replica before that time has
>>>> passed. Hope that makes it clearer.
>>>>
>>>> Thanks
>>>> Lei
>>>>
>>>> Lei Xia
>>>> Data Infra/Helix
>>>> [email protected]
>>>> www.linkedin.com/in/lxia1
>>>> ------------------------------
>>>> From: Bo Liu <[email protected]>
>>>> Sent: Monday, January 22, 2018 11:12:48 PM
>>>> To: [email protected]
>>>> Subject: differentiate between bootstrap and a soft failure
>>>>
>>>> Hi There,
>>>>
>>>> I am using FULL_AUTO with MasterSlave and DelayedAutoRebalancer. How
>>>> can a participant differentiate between these two cases:
>>>>
>>>> 1) When a participant first joins a cluster, it will be asked to
>>>> transition from OFFLINE to SLAVE. Since the participant doesn't have
>>>> any data for the partition, it needs to bootstrap and download data
>>>> from another participant or a data source.
>>>>
>>>> 2) When a participant loses its ZK session, the controller will
>>>> automatically mark the participant as OFFLINE in ZK. If the participant
>>>> manages to establish a new session to ZK before the delay threshold,
>>>> the controller will send it a request to switch from OFFLINE to SLAVE.
>>>> In this case, the participant already has the data for the partition,
>>>> so it doesn't need to bootstrap from other data sources.
>>>>
>>>> --
>>>> Best regards,
>>>> Bo
>>
>> --
>> Best regards,
>> Bo

--
Best regards,
Bo
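
Pulling Lei's and Kishore's suggestions together, an OFFLINE->SLAVE callback
that tells a fresh bootstrap apart from a simple catch-up might look roughly
like the sketch below. The version-tracking and replication helpers
(readLocalVersion, fetchMasterVersion, bootstrap, catchUp) and the lag
threshold are hypothetical application code, not Helix APIs, and the
remaining MasterSlave transitions are omitted for brevity.

    import org.apache.helix.NotificationContext;
    import org.apache.helix.model.Message;
    import org.apache.helix.participant.statemachine.StateModel;
    import org.apache.helix.participant.statemachine.StateModelInfo;
    import org.apache.helix.participant.statemachine.Transition;

    @StateModelInfo(initialState = "OFFLINE", states = {"MASTER", "SLAVE", "OFFLINE"})
    public class VersionAwareMasterSlaveStateModel extends StateModel {
      private static final long MAX_CATCHUP_LAG = 10_000; // placeholder threshold

      private final String partitionName;

      public VersionAwareMasterSlaveStateModel(String partitionName) {
        this.partitionName = partitionName;
      }

      @Transition(from = "OFFLINE", to = "SLAVE")
      public void onBecomeSlaveFromOffline(Message message, NotificationContext context) {
        // Helix does not track data versions; these helpers are application code.
        Long localVersion = readLocalVersion(partitionName);
        long masterVersion = fetchMasterVersion(partitionName);
        if (localVersion == null || masterVersion - localVersion > MAX_CATCHUP_LAG) {
          bootstrap(partitionName);             // no usable local data: download a full copy
        } else {
          catchUp(partitionName, localVersion); // local data present: replay from the Master
        }
      }

      // SLAVE->MASTER, MASTER->SLAVE, SLAVE->OFFLINE, OFFLINE->DROPPED callbacks
      // would go here in a real state model.

      private Long readLocalVersion(String partition) { return null; }  // stub
      private long fetchMasterVersion(String partition) { return 0L; }  // stub
      private void bootstrap(String partition) { /* full download */ }
      private void catchUp(String partition, long fromVersion) { /* incremental replay */ }
    }

The factory that hands these instances to Helix is also where the
restart-vs-reconnect difference Lei describes shows up: a restarted
participant gets fresh StateModel objects through the factory's
createNewStateModel() call, while a participant that only re-established its
ZK session keeps the existing instances.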

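The delay Lei describes is a rebalancer setting rather than anything the
participant code does. Assuming a recent Helix release, the cluster-level
knob might be set roughly as below; the ConfigAccessor constructor and the
setter names (setRebalanceDelayTime, setDelayRebalaceEnabled, spelled as in
the Helix source) are recalled from memory and should be verified, and the
ZK address and cluster name are placeholders.

    import org.apache.helix.ConfigAccessor;
    import org.apache.helix.model.ClusterConfig;

    public class DelayedRebalanceConfigSketch {
      public static void main(String[] args) {
        ConfigAccessor configAccessor = new ConfigAccessor("localhost:2181");      // placeholder ZK address
        ClusterConfig clusterConfig = configAccessor.getClusterConfig("MyCluster"); // placeholder cluster

        // Give a lost participant up to 5 minutes to come back before Helix
        // brings up replacement replicas on other live instances.
        clusterConfig.setDelayRebalaceEnabled(true);
        clusterConfig.setRebalanceDelayTime(5 * 60 * 1000L);

        configAccessor.setClusterConfig("MyCluster", clusterConfig);
      }
    }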