[ https://issues.apache.org/jira/browse/IGNITE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16819593#comment-16819593 ]
Ivan Rakov commented on IGNITE-10344:
-------------------------------------

[~Denis Chudov], some comments:
1. FilePageStoreManager: we usually place inner classes at the bottom.
2. FilePageStoreManager: let's mention in the javadoc which problem we are solving with LongOperationAsyncExecutor. Also, let's mention the contract: the next developer should be able to find out that modifications of idxCacheStores must be protected with the async executor.
3. What exactly does CleanupRestoredCachesSlowTest#testCleanupSlow test? I anticipate that the test would pass without your fix. Maybe we should try a more definitive scenario:
- SlowFileIO close hangs on a count down latch
- We start a non-baseline node with a non-empty LFS
- We check that the join exchange completes successfully and cache.put() succeeds while the latch is still not released
What do you think?

> Speed up cleanupRestoredCaches
> ------------------------------
>
>                 Key: IGNITE-10344
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10344
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Pavel Voronkin
>            Assignee: Denis Chudov
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> if (!cctx.kernalContext().clientNode() && !isLocalNodeInBaseline()) {
>     // Stop all recovered caches and groups.
>     cctx.cache().onKernalStopCaches(true);
>     cctx.cache().stopCaches(true);
>     cctx.database().cleanupRestoredCaches();
>     // Set initial node started marker.
>     cctx.database().nodeStart(null);
> }
> If we have 100 cache groups, we spend a lot of time (about 36 sec) in cleanupRestoredCaches().
> We need to speed up this phase and add metrics on this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)