is just sitting)
>>>>
>>>> Then drop N1:
>>>> - N2 becomes LEADER
>>>> - Nothing happens to N3
>>>>
>>>> Naively, I would have expected N3 to transition from Offline to
>>>> Standby, but that doesn't happen.
>>>>
>>>> I can force redistribution from GenericHelixController#onLiveInstanceChange
>>>> by
>>>> - dropping non-live instances from the cluster
>>>> - calling rebalance
>>>>
>>>> The instance dropping seems pretty unsafe! Is there a better way?
>>>>
>>>
>>>
>>
>
--
Lei Xia
> partition and 1 replica. Each resource would reside on exactly 1 node, and
> there is no limit on the number of resources per node.
>
> On Wed, Oct 19, 2016 at 9:23 PM, Lei Xia wrote:
>
>> Hi, Michael
>>
>> Could you be more specific on the issue you see? Specificall
o the cluster. For example, with 2 nodes + 1 resource
>>>>>>> (1
>>>>>>> replica, 1 partition) + OnlineOffline: https://gist.github.com/mkscrg/628ab964995c0be914d44654d26ae561/99348c870e9f028048c1d1cfdd15976325f293f9
>>>>>>>
>>>>>>> However, this seems to be fixed at the current master branch on
>>>>>>> GitHub: https://gist.github.com/mkscrg/628ab964995c0be914d44654d26ae561/ec26a64a74b50c8c125ccd1f9bde1d8aa848a0b5
>>>>>>>
>>>>>>> Will this fix be released in an 0.6.x version?
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>
--
Lei Xia
Senior Software Engineer
Data Infra/Nuage & Helix
LinkedIn
l...@linkedin.com
www.linkedin.com/in/lxia1
The Apache Helix Team is pleased to announce the 9th release,
0.6.6, of the Apache Helix project.
Apache Helix is a generic cluster management framework that makes it easy
to build partitioned, fault tolerant, and scalable distributed systems.
The full release notes are available
here: http://heli
if you
have any suggestions.
Thanks
Lei
On Mon, Nov 21, 2016 at 10:28 PM, kishore g wrote:
> I like the overall idea. One concern is that it might be hard to maintain
> backward compatibility with both 0.6 and 0.7.
>
> On Mon, Nov 21, 2016 at 10:17 PM, Lei Xia wrote:
&
The Apache Helix Team is pleased to announce the 10th release, 0.6.7, of
the Apache Helix project.
Apache Helix is a generic cluster management framework that makes it easy
to build partitioned, fault tolerant, and scalable distributed systems.
The full release notes are available here:
http://he
Hi, Subramanian
Helix actually allows you to dynamically change the number of partitions
in a resource. If you are using your own customized rebalancer, i.e., the
rebalance mode set in the resource's IdealState is CUSTOMIZED, what you can do
is manipulate the IdealState's MapFields when adding
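As a rough sketch of what that manipulation looks like (a toy model of the IdealState map-field layout, partition -> {instance -> state}, not the Helix API itself; all partition and instance names below are made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CustomizedMapFields {
    // Toy model of an IdealState's MapFields: partition -> {instance -> state}.
    // Under CUSTOMIZED rebalance mode, the application owns this mapping.
    static void assign(Map<String, Map<String, String>> mapFields,
                       String partition, String instance, String state) {
        mapFields.computeIfAbsent(partition, k -> new LinkedHashMap<>())
                 .put(instance, state);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> mapFields = new LinkedHashMap<>();
        assign(mapFields, "myDB_0", "localhost_12000", "ONLINE");
        assign(mapFields, "myDB_1", "localhost_12001", "ONLINE");
        // Growing the partition count is just adding one more map-field entry:
        assign(mapFields, "myDB_2", "localhost_12002", "ONLINE");
        System.out.println(mapFields.size()); // prints 3
    }
}
```

In the real API the same structure lives in the IdealState's ZNRecord, which the application rewrites and persists via the admin interface.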
Hi, Leela
The Master/Slave model does not support that because there is no way
Helix can differentiate two slave replicas unless your application has
customized logic to perform the check. However, for your case, you can
create your own state model instead of using the default MasterSlave model. Yo
cipant will be back online soon and
you can also tolerate losing one or more replicas in the short term, then you
can set a delay time here, during which Helix will not bring up a new replica.
Hope that makes it clearer.
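For instance (a sketch only; the exact key names vary between Helix releases, and the delay value here is invented for illustration), a ClusterConfig enabling a 5-minute delay might carry simple fields like:

```
{
  "simpleFields": {
    "DELAY_REBALANCE_DISABLE": "false",
    "DELAY_REBALANCE_TIME": "300000"
  }
}
```

With something like this in place, a replica lost to a short participant bounce would not be re-created elsewhere until the delay window has elapsed.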
Thanks
Lei
Lei Xia
Data Infra/Helix
l...@linkedin.c
at a
>>>> time? During a state transition, a participant needs to set up proper
>>>> replication upstream for itself (in the case where it is transitioning to
>>>> Slave) or for other replicas (in the case it is transitioning to Master). So the
>>
helix-ui is written in node.js and it does not publish any Jar or other
artifact along with our release, that is why we did not find this issue in
our release process. Our release script did not bump the version in
helix-ui submodule pom file. Let us fix the script and regenerate our
release cand
By helix-ui I meant helix-front.
Lei
On Wed, Jan 31, 2018 at 8:49 AM Lei Xia wrote:
> helix-ui is written in node.js and it does not publish any Jar or other
> artifact along with our release, that is why we did not find this issue in
> our release process. Our release script did
n use
> the admin features like adding a cluster etc.
>
> On Jan 31, 2018 08:51, "Lei Xia" wrote:
>
>> By helix-ui I meant helix-front.
>>
>>
>> Lei
>>
>> On Wed, Jan 31, 2018 at 8:49 AM Lei Xia wrote:
>>
>>> helix-ui is writt
Hi, Bo
Please add "TOPOLOGY_AWARE_ENABLED" : "true" to your clusterConfig and
try again?
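For reference, the relevant ClusterConfig simple fields look roughly like this (the TOPOLOGY path and FAULT_ZONE_TYPE values below are illustrative placeholders, not taken from this thread):

```
{
  "simpleFields": {
    "TOPOLOGY_AWARE_ENABLED": "true",
    "TOPOLOGY": "/zone/instance",
    "FAULT_ZONE_TYPE": "zone"
  }
}
```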
Thanks
Lei
On Tue, Feb 13, 2018 at 2:48 PM, Bo Liu wrote:
> Hi Helix Team,
>
> I am doing some test for the Helix topology
>
> The cluster configuration is:
> DELAY_REBALANCE_DISABLE : "false"
> DELAY_REB
et hosted on the resource.
>> We tried to disable and enable the resource. It doesn't change the states of
>> any partitions. So I guess it only disables Helix management for the
>> resource?
>>
>> --
>> Best regards,
>> Bo
>>
>>
>
--
Lei Xia
upport throttling state transition at partition level?
>>>> I only find cluster, resource and instance level throttling as below:
>>>>
>>>> public enum ThrottleScope {
>>>> CLUSTER,
>>>> RESOURCE,
>>>> INSTANCE
>>>> }
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Bo
>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Bo
>>>
>>>
>>
>>
>> --
>> Best regards,
>> Bo
>>
>>
>
--
Lei Xia
We tried to disable it through the Helix UI and the REST API.
>
> Yes, I think it's not caused by the delay feature, because the disabled
> resource stayed in the ONLINE state forever.
>
> On Feb 16, 2018 08:48, "Lei Xia" wrote:
>
>> Hi, Bo
>>
>> Disable a res
ed By
> ==
> lock-group_0 localhost_12000
> lock-group_1 localhost_12001
> lock-group_10 localhost_12002
> lock-group_11 localhost_12000
> lock-group_2 localhost_12001
> lock-group_3 localhost_12002
> lock-group_4 localhost_12000
> lock-group_5 localhost_12001
> lock-group_6 localhost_12002
> lock-group_7 localhost_12000
> lock-group_8 localhost_12001
> lock-group_9 localhost_12002
>
--
Lei Xia
Sorry, I totally missed this email thread.
Yes, we do have such a feature in 0.8 to protect the cluster in case of
disaster. A new config option "MAX_OFFLINE_INSTANCES_ALLOWED"
can be set in ClusterConfig. If it is set, and the number of offline
instances reaches the set limit in the c
ve to call HelixAdmin.enableMaintenanceMode() manually to exit
the maintenance mode. Support for auto-exiting maintenance mode is on our
road-map.
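For example (a sketch; the threshold value here is made up), the ClusterConfig entry would look like:

```
{
  "simpleFields": {
    "MAX_OFFLINE_INSTANCES_ALLOWED": "3"
  }
}
```

With this set, once more than the allowed number of instances go offline, the cluster enters maintenance mode rather than continuing to rebalance replicas onto the survivors.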
Lei
Lei Xia
Data Infra/Helix
l...@linkedin.com<mailto:l...@linkedin.com>
www.linkedin.com/in/lxia1<http://www.li
tting started for the first time. Will it
>> get enabled only after min nodes are started?
>>
>> thanks
>>
>> On Mon, Mar 19, 2018 at 6:42 PM, Lei Xia wrote:
>>
>>> Actually we already supported maintenance mode in 0.8.0. My bad.
>>>
>>&g
issue before?
>
> Thanks,
> --
> Best regards,
> Bo
>
> --
Lei Xia
Hi, Bo
A Helix participant creates a thread pool to handle state transitions
by default, and the application can supply its own thread pool for specific
state transitions too. The default thread-pool size is 40, which is
configurable.
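Conceptually (this is a toy model, not Helix's actual executor; the method and class names are invented), the default behavior amounts to a fixed-size pool draining transition tasks:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TransitionPool {
    // Toy model of a participant's state-transition executor.
    // 40 matches the Helix default pool size mentioned in the thread.
    static final int DEFAULT_POOL_SIZE = 40;

    // Runs `count` dummy transition tasks and returns how many completed.
    static int runTransitions(int count) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(DEFAULT_POOL_SIZE);
        CountDownLatch done = new CountDownLatch(count);
        for (int i = 0; i < count; i++) {
            // Each task stands in for one state-transition callback.
            pool.submit(done::countDown);
        }
        done.await();
        pool.shutdown();
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTransitions(100)); // prints 100
    }
}
```

A fixed pool (unlike a cached one) bounds the number of transitions a participant executes concurrently, which is why the pool size matters for long-running callbacks.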
Lei
On Mon, Jul 16, 2018 at 11:26 AM, Bo Liu wrote:
> H
s a little suspicious, as it is a cached
> thread pool, which could terminate and create threads on the fly. However,
> I am not sure if it is used for state transitioning for OnlineOffline model?
>
>
>
> On Mon, Jul 16, 2018 at 12:33 PM Lei Xia wrote:
>
>> Hi, Bo
>>
ure requests are routed to the correct node, in this case
> a node that is the master of that particular partition?
>
> Regards,
>
> Rob
>
--
Lei Xia
list = routingTableProvider.getInstances("data2", "data2_0", "MASTER");
On Fri, Oct 5, 2018 at 4:07 PM Rob McKinnon wrote:
> Lei - I am using version 0.8.2
>
> On Fri, Oct 5, 2018 at 7:02 PM Lei Xia wrote:
>
>> Hi, Rob
>>
>>Which
tate(Conf.CLUSTER_NAME, Conf.RESOURCE_NAME,
> idealState);
>
>
> admin.rebalance(Conf.CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);
> }
>
> ==
> I was expecting that when calling the "admin.rebalance" method, it would
> invoke "MyRebalance" code but when I run it "MyRebalance" code was not
> invoked.
>
>
> Thanks,
>
> Rob
>
--
Lei Xia
covery-for-rocksplicator-f1f8fd35c833
* Airbnb’s Change Data Capture system:
https://medium.com/airbnb-engineering/capturing-data-evolution-in-a-service-oriented-architecture-72f7c643ee6f
Lei Xia
Data Infra/Helix
l...@linkedin.com<mailto:l...@linkedin.com>
www.linkedin.c
tps://github.com/apache/helix/wiki/Weight-aware-Globally-Evenly-distributed-Rebalancer>
Lei Xia
Data Infra/Helix
l...@linkedin.com<mailto:l...@linkedin.com>
www.linkedin.com/in/
partition (P) is also started on the new node (N2).
>>>>>
>>>>> 3. N1 can be put out of service only when all running jobs (J) on it
>>>>> are over, at this point only N2 will serve P request.
>>>>>
>>>>> Questions :
>>>>> 1. Can drain process be modeled using helix?
>>>>> 2. If yes, Is there any recipe / pointers for a helix state model?
>>>>> 3. Is there any custom way to trigger state transitions? From the
>>>>> documentation, I gather that the Helix controller in full-auto mode triggers
>>>>> state transitions only when the number of partitions changes or the cluster
>>>>> changes (node addition or deletion).
>>>>> 4. I guess a spectator will be needed to customize routing logic in such
>>>>> cases; any pointers for the same?
>>>>>
>>>>> Thank You
>>>>> Santosh
>>>>>
>>>>
--
Lei Xia
's a
>>> central storage for the state of the cluster which I can use for my routing logic.
>>> 3. A job could be running for hours and thus drain can happen for a long
>>> time.
>>>
>>>
>>> " How long you would expect OFFLINE->UP take here, if i
void offlineToSlave(Message message, NotificationContext context) {
> // don't return while the long-running job is still running
> }
>
> On Wed, May 13, 2020 at 10:40 PM Lei Xia wrote:
>
>> Hi, Santosh
>>
>> Thanks for explaining your case in detail. In this cas
Updating the state on the model just updates the local variable and doesn't
> notify the controller.
>
> Any pointers or examples would be appreciated.
>
> Thanks,
> Imran
>
--
Lei Xia