[ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992697#comment-16992697
 ] 

Alexei Scherbakov commented on IGNITE-9913:
-------------------------------------------

[~avinogradov]

I've reviewed the PR. Overall, the idea and the implementation look valid.

My questions are:

1. You've introduced a _rebalanced_ flag indicating that the previous exchange 
future completed after everything was rebalanced.
The flag seems unnecessary. The _rebalanced_ state can be inferred from the 
following conditions:
a) this exchange was triggered by a CacheAffinityChangeMessage 
b) for this exchange, forceAffReassignment=true and 
GridDhtPartitionsFullMessage#idealAffinityDiff().isEmpty()

Can we get rid of the flag?
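To make the suggestion concrete, the inference could look roughly like the sketch below. This is a minimal self-contained model, not actual Ignite code: the parameter names mirror the internals mentioned above (CacheAffinityChangeMessage, forceAffReassignment, GridDhtPartitionsFullMessage#idealAffinityDiff()), but the method and class here are hypothetical stand-ins.

```java
// Hypothetical model of the condition that could replace the dedicated
// "rebalanced" flag. In Ignite the three inputs would come from the
// exchange trigger type, the exchange future's forceAffReassignment flag,
// and GridDhtPartitionsFullMessage#idealAffinityDiff().
public class RebalancedCheck {

    /**
     * The previous exchange is considered "rebalanced" iff:
     * a) it was triggered by a CacheAffinityChangeMessage, and
     * b) affinity reassignment was forced and the reported diff
     *    from the ideal assignment is empty.
     */
    static boolean rebalanced(boolean triggeredByAffChangeMsg,
                              boolean forceAffReassignment,
                              boolean idealAffinityDiffEmpty) {
        return triggeredByAffChangeMsg
            && forceAffReassignment
            && idealAffinityDiffEmpty;
    }

    public static void main(String[] args) {
        // Switched to the ideal state: no flag needed to detect it.
        System.out.println(rebalanced(true, true, true));
        // Diff not yet empty: still not rebalanced.
        System.out.println(rebalanced(true, true, false));
    }
}
```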

2. It seems CacheAffinityChangeMessage no longer contains any useful 
assignments when it is triggered by the switch from the late to the ideal 
state.
Can we avoid sending any assignments for protocol v3?

Also, could you add a test where all owners of a partition leave one by one 
under load, and verify that updates to other partitions work as expected 
without a PME, using different partition loss policy modes and backup counts?

> Prevent data updates blocking in case of backup BLT server node leave
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Anton Vinogradov
>            Priority: Major
>         Attachments: 9913_yardstick.png, master_yardstick.png
>
>          Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Ignite cluster performs a distributed partition map exchange (PME) when any 
> server node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all 
> partitions are assigned according to the baseline topology and a server node 
> leaves, there's no actual need to perform a distributed PME: every cluster 
> node is able to recalculate the new affinity assignments and partition 
> states locally. If we implement such a lightweight PME and handle mapping 
> and lock requests on the new topology version correctly, updates won't be 
> stopped (except updates of partitions that lost their primary copy).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)