[ 
https://issues.apache.org/jira/browse/IGNITE-26952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18036778#comment-18036778
 ] 

Roman Puchkovskiy commented on IGNITE-26952:
--------------------------------------------

The patch looks good to me

> Fix race between starting and cancelling IncomingSnapshotCopier
> ---------------------------------------------------------------
>
>                 Key: IGNITE-26952
>                 URL: https://issues.apache.org/jira/browse/IGNITE-26952
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Filipp Shergalis
>            Assignee: Filipp Shergalis
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When cancelling an incoming snapshot, we cancel the rebalance future:
>  
> {code:java}
> busyLock.block();
> LOG.info("Copier is canceled for partition [{}]", createPartitionInfo());
> // Cancel all futures that might be upstream wrt joinFuture.
> List<CompletableFuture<?>> futuresToCancel = Stream.of(snapshotMetaFuture, rebalanceFuture)
>         .filter(Objects::nonNull)
>         .collect(toList());
> futuresToCancel.forEach(future -> future.cancel(false)); {code}
> completeRebalance is the next stage after rebalanceFuture:
>  
> {code:java}
> return rebalanceFuture
>         .handleAsync((v, throwable) -> completeRebalance(snapshotContext, throwable), executor) {code}
> So when the future is cancelled, handleAsync starts completeRebalance
> immediately, without waiting for the logic behind rebalanceFuture to
> actually finish. If the cancellation happens while startRebalance is still
> executing, startRebalance and completeRebalance race with each other.
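
The race described above can be reproduced outside Ignite with plain java.util.concurrent. The sketch below is an illustration only, not the IncomingSnapshotCopier code: the class name CancelRaceSketch, the method name, and the latches are hypothetical, and the task body merely stands in for startRebalance. It shows that a stage attached with handleAsync runs as soon as the upstream CompletableFuture is cancelled, even though the work that was supposed to complete that future is still in progress.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-alone sketch; names do not come from the Ignite code base.
class CancelRaceSketch {
    /** Returns true if the dependent stage ran while the upstream work was still in progress. */
    static boolean dependentRanWhileWorkInProgress() {
        // Two threads: one stays busy with the "rebalance" work, one runs the dependent stage.
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            CountDownLatch workStarted = new CountDownLatch(1);
            CountDownLatch releaseWork = new CountDownLatch(1);
            AtomicBoolean workFinished = new AtomicBoolean();

            // Stand-in for rebalanceFuture; the task body plays the role of startRebalance.
            CompletableFuture<Void> rebalanceFuture = CompletableFuture.runAsync(() -> {
                workStarted.countDown();
                try {
                    releaseWork.await(); // simulate long-running work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                workFinished.set(true);
            }, executor);

            // Stand-in for completeRebalance: records whether the work was still unfinished.
            CompletableFuture<Boolean> completion =
                    rebalanceFuture.handleAsync((v, throwable) -> !workFinished.get(), executor);

            workStarted.await();           // the work is now definitely in progress
            rebalanceFuture.cancel(false); // completes the future, does not stop the task

            boolean overlapped = completion.get(5, TimeUnit.SECONDS);
            releaseWork.countDown();       // only now let the work finish
            return overlapped;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(dependentRanWhileWorkInProgress());
    }
}
```

CompletableFuture.cancel only completes the future with a CancellationException; its mayInterruptIfRunning argument has no effect, so the running task is neither interrupted nor waited for. Any stage chained on the cancelled future therefore cannot assume that the upstream work has stopped.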



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
