[ 
https://issues.apache.org/jira/browse/IGNITE-20310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776121#comment-17776121
 ] 

Mirza Aliev edited comment on IGNITE-20310 at 12/5/23 7:54 AM:
---------------------------------------------------------------

As long as this ticket is blocked by 
https://issues.apache.org/jira/browse/IGNITE-20477, we could implement the easier 
fix with ms.invoke().get().


was (Author: maliev):
as long as this ticket is blocked by 
https://issues.apache.org/jira/browse/IGNITE-20477, we could implement easier 
fix with ms.invoke().join()

> Meta storage invokes are not completed  when DZM start is completed
> -------------------------------------------------------------------
>
>                 Key: IGNITE-20310
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20310
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Uttsel
>            Assignee: Mirza Aliev
>            Priority: Major
>              Labels: dzm-reviewed, ignite-3
>
> h3. *Motivation*
> There are meta storage invokes in the DistributionZoneManager start. Currently 
> they are performed in 
> DistributionZoneManager#createOrRestoreZoneState:
> # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to init 
> the default zone.
> # DistributionZoneManager#restoreTimers, in case a filter update was 
> handled before the DZM stopped but did not update the data nodes.
> The futures of these invokes are ignored, so when the start method returns, 
> not all start actions have actually completed. This can lead to the following 
> situation: 
> * Initialisation of the default zone hangs for some reason, even after 
> a full restart of the cluster.
> * That means that none of the data nodes related keys in the meta storage 
> have been initialised.
> * For example, if a user adds a new node and the scale up timer is immediate, 
> which leads to an immediate data nodes recalculation, this recalculation won't 
> happen, because the data nodes keys have not been initialised. 
> h3. *Possible solutions*
> h4. Easier
> We just need to wait for all async logic to be completed within the 
> {{DistributionZoneManager#start}} with {{ms.invoke().get()}}
> h4. Harder
> We can enhance {{IgniteComponent#start}} so that it returns a 
> {{CompletableFuture}}, and then change the flow of starting components so that 
> a node is not ready to work until all {{IgniteComponent#start}} futures are 
> completed. For example, we can chain our futures in 
> {{IgniteImpl#recoverComponentsStateOnStart}}, so that the components' futures 
> are completed before {{metaStorageMgr.deployWatches()}}.
>  In {{DistributionZoneManager#start}} we can return 
> {{CompletableFuture.allOf}} over the futures that need to be completed in 
> {{DistributionZoneManager#start}}.
> h3. *Definition of done*
> All asynchronous logic in the {{DistributionZoneManager#start}} is done 
> before a node is ready to work, in particular, ready to interact with zones.
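
For illustration only, the two options from the description could look roughly like the sketch below. It does not use the real Ignite classes: ZoneManagerStartSketch and its invoke(String) method are hypothetical stand-ins for DistributionZoneManager and ms.invoke(), the key names are made up, and the 10 second timeout on get() is an assumption that the ticket does not mention.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a component whose start issues asynchronous meta storage invokes. */
class ZoneManagerStartSketch {
    /** Hypothetical replacement for ms.invoke(): completes asynchronously with the invoke result. */
    static CompletableFuture<Boolean> invoke(String key) {
        // A real invoke would run a conditional update against the meta storage.
        return CompletableFuture.supplyAsync(() -> !key.isEmpty());
    }

    /** "Easier" option: block inside start until every invoke has completed. */
    static void startBlocking() {
        try {
            // The timeout is an assumption; the ticket only mentions get().
            invoke("initDataNodes").get(10, TimeUnit.SECONDS);
            invoke("restoreTimers").get(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while starting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Start did not complete", e);
        }
    }

    /** "Harder" option: return one future that completes when all invokes have completed. */
    static CompletableFuture<Void> startAsync() {
        CompletableFuture<Boolean> init = invoke("initDataNodes");
        CompletableFuture<Boolean> timers = invoke("restoreTimers");
        return CompletableFuture.allOf(init, timers);
    }

    public static void main(String[] args) {
        // Easier: start returns only after both invokes are done.
        startBlocking();
        // Harder: the caller decides when to wait, e.g. before deploying watches.
        startAsync().thenRun(() -> System.out.println("node is ready")).join();
    }
}
```

With the blocking variant the caller needs no changes, but the starting thread is held until the meta storage responds. The allOf variant leaves it to the node start-up code to chain the returned future ahead of whatever step must not run early.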



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
