[ 
https://issues.apache.org/jira/browse/IGNITE-23776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-23776:
---------------------------------
    Description: 
h3. Motivation

When an event that triggers rebalancing is processed, we need to ensure that 
this process is completed. We could use {{pendingChangeTriggerKey}} and track 
when it changes, but there are cases when the update of this key is skipped, 
for example if the assignments already match.

We need to update {{pendingChangeTriggerKey}} with a more up-to-date value any 
time we try to process a new rebalance event.
h3. Implementation notes.

There are several places where we change {{pendingChangeTriggerKey}}: 
{{disaster.GroupUpdateRequest#prepareMsInvokeClosure}}, 
{{RebalanceUtil#updatePendingAssignmentsKeys}} and 
{{RebalanceUtilEx#handleReduceChanged}}.

In all of these cases the invoke condition must be improved so that we do not 
skip the {{pendingChangeTriggerKey}} update when the assignments have the same 
value.

Note that if the general condition for 
{{pendingChangeTriggerKey}}
{noformat}
value(changeTriggerKey) < revision
{noformat}
does not hold, we must not update this key with a stale value.

For example, an improved condition for 
{{RebalanceUtil#updatePendingAssignmentsKeys}} could look like this:
{noformat}
        //    if empty(partition.change.trigger.revision) || partition.change.trigger.revision < event.revision:
        //        if empty(partition.assignments.pending)
        //              && ((isNewAssignments && empty(partition.assignments.stable))
        //                  || (partition.assignments.stable != calcPartAssignments() && !empty(partition.assignments.stable))):
        //            partition.assignments.pending = calcPartAssignments()
        //            partition.change.trigger.revision = event.revision
        //        else:
        //            if partition.assignments.pending != calcPartAssignments() && !empty(partition.assignments.pending)
        //                partition.assignments.planned = calcPartAssignments()
        //                partition.change.trigger.revision = event.revision
        //            else if partition.assignments.pending == calcPartAssignments()
        //                remove(partition.assignments.planned)
        //                partition.change.trigger.revision = event.revision
        //                message after the metastorage invoke:
        //                "Remove planned key because current pending key has the same value."
        //            else if empty(partition.assignments.pending)
        //                remove(partition.assignments.planned)
        //                partition.change.trigger.revision = event.revision
        //                message after the metastorage invoke:
        //                "Remove planned key because pending is empty and calculated assignments are equal to current assignments."
        //    else:
        //        skip
{noformat}
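The pseudocode above can be read as a pure decision function. The following is only a minimal sketch of that reading, not the actual Ignite code: the class {{PendingTriggerSketch}}, the {{Action}} enum, the {{decide}} method and the use of {{Set<String>}} for assignments are all hypothetical names chosen for illustration, and {{null}} is assumed to stand for an absent metastorage key.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical, simplified model of the improved condition; not the real RebalanceUtil. */
final class PendingTriggerSketch {
    /** What to do with the assignment keys, and whether the trigger revision is bumped. */
    enum Action {
        SET_PENDING(true), SET_PLANNED(true), REMOVE_PLANNED(true), SKIP(false);

        final boolean updatesTrigger;

        Action(boolean updatesTrigger) {
            this.updatesTrigger = updatesTrigger;
        }
    }

    /** A null argument models an empty (absent) key; this is an assumption of the sketch. */
    static Action decide(Long storedTriggerRevision, long eventRevision,
            Set<String> stable, Set<String> pending, Set<String> calculated,
            boolean isNewAssignments) {
        // General guard: never overwrite the trigger key with a stale or equal revision.
        if (storedTriggerRevision != null && storedTriggerRevision >= eventRevision) {
            return Action.SKIP;
        }

        boolean stableEmpty = stable == null;
        boolean pendingEmpty = pending == null;

        // No pending rebalance, and the calculated assignments differ from the stable ones
        // (or there are no stable assignments yet and these are new assignments).
        if (pendingEmpty
                && ((isNewAssignments && stableEmpty)
                    || (!stableEmpty && !stable.equals(calculated)))) {
            return Action.SET_PENDING;
        }

        // A different rebalance is already pending: queue the new assignments as planned.
        if (!pendingEmpty && !pending.equals(calculated)) {
            return Action.SET_PLANNED;
        }

        // Remaining cases: pending equals the calculated assignments, or pending is empty.
        // The planned key is removed, and the trigger revision is still updated.
        return Action.REMOVE_PLANNED;
    }

    public static void main(String[] args) {
        Set<String> a = Collections.singleton("A");
        // Assignments already match the stable ones, yet the trigger key is still updated.
        Action action = decide(3L, 5L, a, null, a, false);
        System.out.println(action + " " + action.updatesTrigger); // prints "REMOVE_PLANNED true"
    }
}
```

In this sketch only the {{SKIP}} branch leaves the trigger revision untouched, which is the property the issue asks for: every event that passes the revision guard advances {{pendingChangeTriggerKey}}, even when the assignments themselves do not change.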
Note that all interactions with {{ZoneRebalanceUtil#pendingChangeTriggerKey}} 
must be improved as well.

Also, as a refactoring, we could change the representation of 
{{pendingChangeTriggerKey}} so that it conforms to the general pattern of 
metastorage keys, "prefix + unique identifier", not vice versa.
{code:java}
    public static ByteArray pendingChangeTriggerKey(TablePartitionId partId) {
        return new ByteArray("pending.change.trigger" + partId);
    }
{code}
h3. Definition of done
 * {{RebalanceUtil.pendingChangeTriggerKey}} is updated with a more up-to-date 
value any time we try to process a new rebalance event.
 * The same must be applied to {{ZoneRebalanceUtil.pendingChangeTriggerKey}}.

  was:
h3. Motivation

When an event that triggers rebalancing is processed, we need to ensure that 
this process is completed. We could use {{pendingChangeTriggerKey}} and track 
when it changes, but there are cases when the update of this key is skipped, 
for example if the assignments already match.

We need to update {{pendingChangeTriggerKey}} with a more up-to-date value any 
time we try to process a new rebalance event.

h3. Implementation notes.

There are several places where we change {{pendingChangeTriggerKey}}: 
{{disaster.GroupUpdateRequest#prepareMsInvokeClosure}}, 
{{RebalanceUtil#updatePendingAssignmentsKeys}} and 
{{RebalanceUtilEx#handleReduceChanged}}.

In all of these cases the invoke condition must be improved so that we do not 
skip the {{pendingChangeTriggerKey}} update when the assignments have the same 
value.

Note that if the general condition for 
{{pendingChangeTriggerKey}}

{noformat}
value(changeTriggerKey) < revision
{noformat}

does not hold, we must not update this key with a stale value.

For example, an improved condition for 
{{RebalanceUtil#updatePendingAssignmentsKeys}} could look like this:


{noformat}
        //    if empty(partition.change.trigger.revision) || partition.change.trigger.revision < event.revision:
        //        if empty(partition.assignments.pending)
        //              && ((isNewAssignments && empty(partition.assignments.stable))
        //                  || (partition.assignments.stable != calcPartAssignments() && !empty(partition.assignments.stable))):
        //            partition.assignments.pending = calcPartAssignments()
        //            partition.change.trigger.revision = event.revision
        //        else:
        //            if partition.assignments.pending != calcPartAssignments() && !empty(partition.assignments.pending)
        //                partition.assignments.planned = calcPartAssignments()
        //                partition.change.trigger.revision = event.revision
        //            else if partition.assignments.pending == calcPartAssignments()
        //                remove(partition.assignments.planned)
        //                partition.change.trigger.revision = event.revision
        //                message after the metastorage invoke:
        //                "Remove planned key because current pending key has the same value."
        //            else if empty(partition.assignments.pending)
        //                remove(partition.assignments.planned)
        //                partition.change.trigger.revision = event.revision
        //                message after the metastorage invoke:
        //                "Remove planned key because pending is empty and calculated assignments are equal to current assignments."
        //            else:
        //                partition.change.trigger.revision = event.revision
        //    else:
        //        skip
{noformat}

Note that all interactions with {{ZoneRebalanceUtil#pendingChangeTriggerKey}} 
must be improved as well.

Also, as a refactoring, we could change the representation of 
{{pendingChangeTriggerKey}} so that it conforms to the general pattern of 
metastorage keys, "prefix + unique identifier", not vice versa.

{code:java}
    public static ByteArray pendingChangeTriggerKey(TablePartitionId partId) {
        return new ByteArray("pending.change.trigger" + partId);
    }
{code}

h3. Definition of done
* {{RebalanceUtil.pendingChangeTriggerKey}} is updated with a more up-to-date 
value any time we try to process a new rebalance event.
* The same must be applied to {{ZoneRebalanceUtil.pendingChangeTriggerKey}}.



> Any time rebalance is scheduled, pendingChangeTriggerKey must be updated
> ------------------------------------------------------------------------
>
>                 Key: IGNITE-23776
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23776
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mirza Aliev
>            Assignee: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
