Hi,
Double assignment when a participant is unable to connect to the
ZooKeeper quorum
The setup is as follows.
Version(s):
Helix: 0.7.1
ZooKeeper: 3.3.4
- State model: OnlineOffline
- Controller (leader elected from one of the cluster nodes)
- Single resource with partitions
- Full-auto rebalancer
- ZooKeeper quorum (3 nodes)
When one participant loses its ZooKeeper connection (it cannot connect to any
of the ZooKeeper servers; a typical cause in our case was a switch failure on
that rack):
----> The partition (P1) for which this participant (say node N1) is online
is still held as online by N1.
Meanwhile, because N1's ephemeral node in ZooKeeper is lost, the rebalancer is
triggered and reassigns the partition (P1) to another participant (say node
N2), which becomes online at time T1.
----> After this, both N1 and N2 are acting as online for the same
partition (P1).
As soon as the participant on node N1 re-establishes its ZooKeeper connection
at time T2:
----> Reset gets called on the partition in that participant (node N1).
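To make the timeline concrete, here is a minimal Java sketch of the sequence described above. It is a toy model, not Helix code: the class DoubleAssignmentTimeline and all of its methods are hypothetical names made up for illustration. It assumes N1 keeps serving P1 locally until reset is called on it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the P1 timeline. Hypothetical names; does not use any Helix API. */
final class DoubleAssignmentTimeline {

    /** Nodes the controller considers online for P1 (based on live ephemeral nodes). */
    private final Set<String> controllerView = new LinkedHashSet<>();

    /** Nodes whose local process is actually still serving P1. */
    private final Set<String> actuallyServing = new LinkedHashSet<>();

    DoubleAssignmentTimeline() {
        // Initial state: N1 is online for P1 and the controller agrees.
        controllerView.add("N1");
        actuallyServing.add("N1");
    }

    /** The node's ephemeral node disappears; the node itself keeps serving (assumption). */
    void sessionExpired(String node) {
        controllerView.remove(node);
        // actuallyServing is deliberately left unchanged.
    }

    /** T1: the rebalancer assigns P1 to another node, which goes online. */
    void rebalanceTo(String node) {
        controllerView.add(node);
        actuallyServing.add(node);
    }

    /** T2: the node reconnects and reset is called on its partition, so it stops serving. */
    void reconnectAndReset(String node) {
        actuallyServing.remove(node);
    }

    List<String> serving() {
        return new ArrayList<>(actuallyServing);
    }

    List<String> controllerOnline() {
        return new ArrayList<>(controllerView);
    }

    public static void main(String[] args) {
        DoubleAssignmentTimeline t = new DoubleAssignmentTimeline();
        t.sessionExpired("N1");
        t.rebalanceTo("N2");
        System.out.println("Between T1 and T2, serving P1: " + t.serving());
        System.out.println("Controller view of P1:         " + t.controllerOnline());
        t.reconnectAndReset("N1");
        System.out.println("After T2, serving P1:          " + t.serving());
    }
}
```

In this model the controller's view never contains more than one node for P1, yet two processes serve it between T1 and T2, which is the window the question below is about.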
Double assignment:
The question is: is it expected behavior that both nodes N1 and N2 can be
online for the same partition (P1) between T1 and T2? Any responses would be
appreciated.
Thanks & Regards,
Subramanian.