[ https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947584#comment-16947584 ]

Alexey Goncharuk commented on IGNITE-9913:
------------------------------------------

[~NSAmelchev], [~avinogradov], a few comments on the PR:
 * The {{localRecoveryNeeded}} check does not seem right - you take the list of 
partitions from the affinity assignment cache. There may be a case when a node 
still owns a partition but is no longer an assigned backup for it (this happens 
right after a late affinity assignment change, when the affinity cache is 
switched to the ideal assignment but the node has not yet moved the partition 
to RENTING). In this case the partition will not be reported in the list, and 
recovery will be skipped. See the sketch after this list.
 * Do I understand correctly that *all* new transactions will still wait for 
this optimized PME to complete? If so, what is the actual time boost that this 
change gives? Do you have any benchmark numbers? If not, how do you order 
transactions on a new primary node relative to the backup transactions on the 
same node that have not finished recovery yet?
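
To make the first point concrete, here is a rough sketch (not the actual PR 
code; {{grp}} and {{partsToRecover}} are placeholders) of driving the check 
from the node's own partition topology instead of the affinity assignment 
cache:

{code:java}
// Sketch only: collect partitions to recover from what the node actually
// owns, so a partition that is still OWNING but no longer among the assigned
// backups (not yet RENTING after late affinity assignment) is not skipped.
for (GridDhtLocalPartition part : grp.topology().localPartitions()) {
    if (part.state() == GridDhtPartitionState.OWNING)
        partsToRecover.add(part.id());
}
{code}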

> Prevent data updates blocking in case of backup BLT server node leave
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Anton Vinogradov
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: 9913_yardstick.png, master_yardstick.png
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> An Ignite cluster performs a distributed partition map exchange when any 
> server node leaves or joins the topology.
> A distributed PME blocks all updates and may take a long time. If all 
> partitions are assigned according to the baseline topology and a server node 
> leaves, there is no actual need to perform a distributed PME: every cluster 
> node is able to recalculate the new affinity assignments and partition states 
> locally. If we implement such a lightweight PME and handle mapping and lock 
> requests on the new topology version correctly, updates won't be stopped 
> (except updates of partitions that lost their primary copy).
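
For illustration, a hypothetical sketch of the local decision the description 
implies (none of these names come from the PR; {{allAssignmentsFollowBaseline}} 
is an assumed helper):

{code:java}
// Hypothetical sketch: a node can keep serving updates without a distributed
// PME when a baseline server node left and the remaining assignments can be
// recomputed locally from the unchanged baseline topology.
boolean distributedPmeNeeded(DiscoveryEvent evt) {
    if (evt.type() != EventType.EVT_NODE_LEFT || evt.eventNode().isClient())
        return true; // joins and client-node events keep the usual path

    // The baseline did not change, so every node derives the same new affinity
    // locally; only partitions that lost their primary copy need blocking.
    return !allAssignmentsFollowBaseline();
}
{code}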



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
