[ 
https://issues.apache.org/jira/browse/IGNITE-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16703044#comment-16703044
 ] 

Sergey Chugunov commented on IGNITE-10058:
------------------------------------------

[~xtern],

I left a comment in GitHub PR, could you please take a look?

Overall patch looks reasonable for me: as I understand, changes in 
AffinityManager enable to apply affinity changes when new node joins and LOST 
partitions are presented in the grid, and changes in PartitionTolopogyImpl 
trigger logic to evict extra partitions after resetLostPartitions is called.

[~ilantukh], [~agoncharuk], what do you guys think about this logic? Maybe 
there is a better way to avoid the situation with extra partitions after 
resetLost is called?

> resetLostPartitions() leaves an additional copy of a partition in the cluster
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-10058
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10058
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Stanislav Lukyanov
>            Assignee: Pavel Pereslegin
>            Priority: Major
>
> If there are several copies of a LOST partition, resetLostPartitions() will 
> leave all of them in the cluster as OWNING.
> Scenario:
> 1) Start 4 nodes, a cache with backups=0 and READ_WRITE_SAFE, fill the cache
> 2) Stop one node - some partitions are recreated on the remaining nodes as 
> LOST
> 3) Start one node - the LOST partitions are being rebalanced to the new node 
> from the existing ones
> 4) Wait for rebalance to complete
> 5) Call resetLostPartitions()
> After that the partitions that were LOST become OWNING on all nodes that had 
> them. Eviction of these partitions doesn't start.
> Need to correctly evict additional copies of LOST partitions either after 
> rebalance on step 4 or after resetLostPartitions() call on step 5.
> Current resetLostPartitions() implementation does call checkEvictions(), but 
> the ready affinity assignment contains several nodes per partition for some 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to