You might find that some classes have been moved to a separate module. Rest
assured, most are backward compatible, and the only difference should be a
change in the package name. If you run into any other specific questions that
you cannot resolve on your own, you can reach out to the community for help.
Depending on the complexity of your implementation, the upgrade shouldn't take
more than a day or two.

Hunter

On Mon, Feb 8, 2021 at 4:08 PM Phong X. Nguyen <[email protected]>
wrote:

> We're definitely going to give WAGED a try.
>
> Are there any constraints for upgrading from Helix 0.8.4 to 1.0.1? We were
> on 0.6 for the longest time and knew we had to upgrade first to 0.8.X.
>
> Thanks,
> - Phong X. Nguyen
>
> On Mon, Feb 8, 2021 at 4:03 PM Wang Jiajun <[email protected]> wrote:
>
>> Hi Phong,
>>
>> The WAGED rebalancer respects the MAX_PARTITIONS_PER_INSTANCE setting
>> automatically, so you probably don't need any extra configuration for it.
>> However, you do need to be on the new version to use the WAGED rebalancer.
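>>
>> A rough, untested sketch of what enabling that looks like with the Helix
>> Java API on 1.0.x (the ZooKeeper address, the cluster/resource names, and
>> the limit of 3 below are placeholders to adapt to your deployment):
>>
>> import org.apache.helix.HelixAdmin;
>> import org.apache.helix.manager.zk.ZKHelixAdmin;
>> import org.apache.helix.model.IdealState;
>>
>> public class EnableWagedSketch {
>>   public static void main(String[] args) {
>>     HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
>>     IdealState idealState =
>>         admin.getResourceIdealState("MyCluster", "MyResource");
>>     idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
>>     // Hand the resource over to the WAGED rebalancer.
>>     idealState.setRebalancerClassName(
>>         "org.apache.helix.controller.rebalancer.waged.WagedRebalancer");
>>     // Cap how many partitions one instance may host; WAGED respects this.
>>     idealState.setMaxPartitionsPerInstance(3);
>>     admin.setResourceIdealState("MyCluster", "MyResource", idealState);
>>     admin.close();
>>   }
>> }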
>>
>> Also, to confirm what you said: I believe the consistent-hashing-based
>> strategies (CRUSH and CrushEd) do not respect MAX_PARTITIONS_PER_INSTANCE.
>> I guess there was some design concern there.
>>
>> Anyway, using WAGED is the current recommendation : ) Could you please give
>> it a try and let us know if it is a good fit?
>>
>> Best Regards,
>> Jiajun
>>
>>
>> On Mon, Feb 8, 2021 at 3:55 PM Xue Junkai <[email protected]> wrote:
>>
>>> CRUSHED tries its best to distribute the replicas evenly. So you don't
>>> need identical assignments for each of the instances?
>>> If that's the case, I would suggest migrating to the WAGED rebalancer
>>> with constraints set up. For more details, you can refer to:
>>> https://github.com/apache/helix/wiki/Weight-aware-Globally-Evenly-distributed-Rebalancer
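>>>
>>> To make that concrete, here is a rough, untested sketch of the
>>> capacity-style constraints the wiki describes (the capacity key
>>> "PARTITION_COUNT", the instance name, and the limit of 3 are
>>> placeholders; this assumes the ZK-address ConfigAccessor constructor):
>>>
>>> import java.util.Collections;
>>> import org.apache.helix.ConfigAccessor;
>>> import org.apache.helix.model.ClusterConfig;
>>> import org.apache.helix.model.InstanceConfig;
>>>
>>> public class WagedCapacitySketch {
>>>   public static void main(String[] args) {
>>>     ConfigAccessor accessor = new ConfigAccessor("localhost:2181");
>>>
>>>     // Declare the dimension WAGED should constrain on, and the default
>>>     // weight each partition consumes in that dimension.
>>>     ClusterConfig clusterConfig = accessor.getClusterConfig("MyCluster");
>>>     clusterConfig.setInstanceCapacityKeys(
>>>         Collections.singletonList("PARTITION_COUNT"));
>>>     clusterConfig.setDefaultPartitionWeightMap(
>>>         Collections.singletonMap("PARTITION_COUNT", 1));
>>>     accessor.setClusterConfig("MyCluster", clusterConfig);
>>>
>>>     // Per-instance capacity: at most 3 partitions' worth of weight here.
>>>     InstanceConfig instanceConfig =
>>>         accessor.getInstanceConfig("MyCluster", "server05_12000");
>>>     instanceConfig.setInstanceCapacityMap(
>>>         Collections.singletonMap("PARTITION_COUNT", 3));
>>>     accessor.setInstanceConfig("MyCluster", "server05_12000", instanceConfig);
>>>   }
>>> }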
>>>
>>> Best,
>>>
>>> Junkai
>>>
>>>
>>> On Mon, Feb 8, 2021 at 3:28 PM Phong X. Nguyen <
>>> [email protected]> wrote:
>>>
>>>> I believe it's #2, but perhaps I should explain:
>>>>
>>>> Here's a simplified view of mapFields:
>>>>   "mapFields" : {
>>>>     "partition_11" : {
>>>>       "server05.verizonmedia.com" : "ONLINE"
>>>>     },
>>>>     "partition_22" : {
>>>>       "server05.verizonmedia.com" : "ONLINE"
>>>>     },
>>>>   },
>>>>
>>>> Server 5 has partitions (replicas?) 11 and 22 assigned to it, and
>>>> that's currently fine. We could, for example, have partition_17 also
>>>> assigned, which would be fine, but if a fourth one were to be assigned,
>>>> then we stand a high likelihood of crashing.
>>>>
>>>> Bootstrapping replicas is also expensive, so we'd like to minimize that
>>>> as well.
>>>>
>>>> On Mon, Feb 8, 2021 at 3:14 PM Xue Junkai <[email protected]> wrote:
>>>>
>>>>> Thanks, Phong. Can you clarify which of these you are looking for?
>>>>> 1. The number of parallel state transitions for bootstrapping replicas.
>>>>> 2. A limit on the number of replicas an instance can hold.
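>>>>>
>>>>> For context, a rough, untested sketch of where knob #1 lives (the
>>>>> cluster name and the limit of 2 are placeholders); #2 would instead be
>>>>> a per-instance cap such as MAX_PARTITIONS_PER_INSTANCE:
>>>>>
>>>>> import java.util.Collections;
>>>>> import org.apache.helix.ConfigAccessor;
>>>>> import org.apache.helix.api.config.StateTransitionThrottleConfig;
>>>>> import org.apache.helix.model.ClusterConfig;
>>>>>
>>>>> public class ThrottleSketch {
>>>>>   public static void main(String[] args) {
>>>>>     ConfigAccessor accessor = new ConfigAccessor("localhost:2181");
>>>>>     ClusterConfig clusterConfig = accessor.getClusterConfig("MyCluster");
>>>>>     // Allow at most 2 load-balance state transitions in flight per
>>>>>     // instance at any time.
>>>>>     clusterConfig.setStateTransitionThrottleConfigs(
>>>>>         Collections.singletonList(new StateTransitionThrottleConfig(
>>>>>             StateTransitionThrottleConfig.RebalanceType.LOAD_BALANCE,
>>>>>             StateTransitionThrottleConfig.ThrottleScope.INSTANCE,
>>>>>             2)));
>>>>>     accessor.setClusterConfig("MyCluster", clusterConfig);
>>>>>   }
>>>>> }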
>>>>>
>>>>> Best,
>>>>>
>>>>> Junkai
>>>>>
>>>>>
>>>>> On Mon, Feb 8, 2021 at 3:06 PM Phong X. Nguyen <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> I'm currently on a project that uses Apache Helix 0.8.4 (with a
>>>>>> pending upgrade to Helix 1.0.1) to distribute partitions across a
>>>>>> number of hosts (currently 32 partitions, 16 hosts). Once a partition
>>>>>> is allocated to a host, a series of expensive initialization steps
>>>>>> occurs, and the system then performs computations for the partition on
>>>>>> a scheduled interval. We seek to minimize initializations when possible.
>>>>>>
>>>>>> If a host goes down (due to either maintenance or failure), the
>>>>>> partitions get reshuffled. Currently we are using the
>>>>>> CrushEdRebalanceStrategy in the hope of minimizing partition movements.
>>>>>> However, we noticed that unlike the earlier AutoRebalancer scheme, the
>>>>>> CrushEdRebalanceStrategy does not limit the number of partitions per
>>>>>> node. In our case, this can cause severe out-of-memory issues, which
>>>>>> then cascade as node after node picks up more and more partitions than
>>>>>> it can handle. We have on rare occasion seen our entire cluster fail as
>>>>>> a result, and then our production engineers must manually (and
>>>>>> carefully) bring the system back online. This is undesirable.
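>>>>>>
>>>>>> For reference, our resources are configured roughly along these lines
>>>>>> (a simplified, untested sketch; the ZooKeeper address and the
>>>>>> cluster/resource names are placeholders):
>>>>>>
>>>>>> import org.apache.helix.HelixAdmin;
>>>>>> import org.apache.helix.manager.zk.ZKHelixAdmin;
>>>>>> import org.apache.helix.model.IdealState;
>>>>>>
>>>>>> public class CurrentSetupSketch {
>>>>>>   public static void main(String[] args) {
>>>>>>     HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
>>>>>>     IdealState idealState =
>>>>>>         admin.getResourceIdealState("MyCluster", "MyResource");
>>>>>>     idealState.setRebalanceMode(IdealState.RebalanceMode.FULL_AUTO);
>>>>>>     // Chosen to minimize partition movement when nodes come and go.
>>>>>>     idealState.setRebalanceStrategy(
>>>>>>         "org.apache.helix.controller.rebalancer.strategy.CrushEdRebalanceStrategy");
>>>>>>     admin.setResourceIdealState("MyCluster", "MyResource", idealState);
>>>>>>     admin.close();
>>>>>>   }
>>>>>> }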
>>>>>>
>>>>>> Does Helix have a rebalancing strategy that minimizes partition
>>>>>> movement yet also permits enforcement of maximum partitions per node?
>>>>>>
>>>>>> Thanks,
>>>>>> - Phong X. Nguyen
>>>>>>
>>>>>
