[
https://issues.apache.org/jira/browse/IGNITE-12557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020862#comment-17020862
]
Aleksey Plekhanov commented on IGNITE-12557:
--------------------------------------------
[~ascherbakov], ok, thank you.
My PR was created only to check a naive approach; it's not a final fix
(failover should also be implemented for the case where a node crashes while
a cache is being stopped).
Waiting for your contribution.
Feel free to assign the ticket to yourself.
> Destroy of big cache which is not only cache in cache group causes IgniteOOME
> -----------------------------------------------------------------------------
>
> Key: IGNITE-12557
> URL: https://issues.apache.org/jira/browse/IGNITE-12557
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Reporter: Aleksey Plekhanov
> Assignee: Aleksey Plekhanov
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When {{destroyCache()}} is invoked, {{checkpointReadLock}} is held by the
> exchange thread for the entire time the cache entries are being cleaned.
> Meanwhile, {{db-checkpoint-thread}} can't acquire the checkpoint write lock
> and can't start a checkpoint. After some time the whole page memory fills
> with dirty pages, and an attempt to acquire a new page throws
> IgniteOutOfMemoryException:
> {noformat}
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Failed to find a page for eviction [segmentCapacity=40485, loaded=15881, maxDirtyPages=11910, dirtyPages=15881, cpPages=0, pinnedInSegment=0, failedToPrepare=15881]
>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.tryToFindSequentially(PageMemoryImpl.java:2420)
>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.removePageForReplacement(PageMemoryImpl.java:2314)
>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:743)
>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:679)
>     at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:158)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.acquirePage(BPlusTree.java:5872)
>     at org.apache.ignite.internal.processors.cache.tree.CacheDataTree.compareKeys(CacheDataTree.java:435)
>     at org.apache.ignite.internal.processors.cache.tree.CacheDataTree.compare(CacheDataTree.java:384)
>     at org.apache.ignite.internal.processors.cache.tree.CacheDataTree.compare(CacheDataTree.java:63)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.compare(BPlusTree.java:5214)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findInsertionPoint(BPlusTree.java:5134)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Search.run0(BPlusTree.java:298)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:5723)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Search.run(BPlusTree.java:278)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:5709)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.readPage(PageHandler.java:169)
>     at org.apache.ignite.internal.processors.cache.persistence.DataStructure.read(DataStructure.java:364)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.read(BPlusTree.java:5910)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.removeDown(BPlusTree.java:2077)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:2007)
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.removex(BPlusTree.java:1838)
>     at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.clear(IgniteCacheOffheapManagerImpl.java:2963)
>     at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.clear(GridCacheOffheapManager.java:2611)
>     at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.removeCacheData(IgniteCacheOffheapManagerImpl.java:296)
>     at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.stopCache(IgniteCacheOffheapManagerImpl.java:258)
>     at org.apache.ignite.internal.processors.cache.CacheGroupContext.stopCache(CacheGroupContext.java:825)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCache(GridCacheProcessor.java:1070)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStop(GridCacheProcessor.java:2617)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStop(GridCacheProcessor.java:2596)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$processCacheStopRequestOnExchangeDone$629e8679$1(GridCacheProcessor.java:2796)
>     at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11173)
>     at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11075)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCacheStopRequestOnExchangeDone(GridCacheProcessor.java:2761)
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onExchangeDone(GridCacheProcessor.java:2918)
> {noformat}
> Reproducer:
> {code:java}
> @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>     IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);
>
>     cfg.setDataStorageConfiguration(new DataStorageConfiguration()
>         .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
>             .setPersistenceEnabled(true)
>             .setMaxSize(256L * 1024 * 1024)
>         ));
>
>     cfg.setCacheConfiguration(
>         new CacheConfiguration(DEFAULT_CACHE_NAME).setGroupName("grp"),
>         new CacheConfiguration("another_cache").setGroupName("grp")
>     );
>
>     return cfg;
> }
>
> @Test
> public void testDestroyCache() throws Exception {
>     IgniteEx ignite = startGrid(0);
>
>     ignite.cluster().active(true);
>
>     try (IgniteDataStreamer<Object, Object> streamer2 = ignite.dataStreamer(DEFAULT_CACHE_NAME)) {
>         PageMemoryEx pageMemory =
>             (PageMemoryEx)ignite.cachex(DEFAULT_CACHE_NAME).context().dataRegion().pageMemory();
>
>         long totalPages = pageMemory.totalPages();
>
>         for (int i = 0; i <= totalPages; i++)
>             streamer2.addData(i, new byte[pageMemory.pageSize() / 2]);
>     }
>
>     ignite.destroyCache(DEFAULT_CACHE_NAME);
> }
> {code}
> The checkpoint read lock is acquired in the exchange thread before the call
> to {{GridCacheProcessor#prepareCacheStop(java.lang.String, boolean)}} and
> inside the {{GridCacheOffheapManager.GridCacheDataStore#clear()}} method.
>
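The lock contention described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions: a plain ReentrantReadWriteLock stands in for the checkpoint lock, a counter stands in for dirty pages in page memory, and a single-thread executor stands in for db-checkpoint-thread. The class CheckpointLockSketch and all of its methods are hypothetical and are not Ignite API. Releasing the read lock between batches is shown only as one possible direction, not as the fix adopted for this ticket.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model only: none of these names exist in Ignite. */
class CheckpointLockSketch {
    /** Stand-in for the checkpoint read/write lock. */
    private final ReentrantReadWriteLock cpLock = new ReentrantReadWriteLock();

    /** Stand-in for db-checkpoint-thread. */
    private final ExecutorService checkpointer = Executors.newSingleThreadExecutor();

    /** Limit on dirty pages before a checkpoint is required (assumed model parameter). */
    private final int maxDirtyPages;

    /** Dirty page counter; touched only by the clearing thread. */
    private int dirtyPages;

    CheckpointLockSketch(int maxDirtyPages) {
        this.maxDirtyPages = maxDirtyPages;
    }

    /** The checkpointer thread tries to take the write lock without waiting. */
    private boolean tryCheckpoint() {
        try {
            return checkpointer.submit(() -> {
                if (!cpLock.writeLock().tryLock())
                    return false; // A reader still holds the lock.

                try {
                    return true; // A real checkpoint would mark pages for flushing here.
                }
                finally {
                    cpLock.writeLock().unlock();
                }
            }).get();
        }
        catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Removes entries, holding the read lock for batchSize removals at a time. */
    void clear(int entries, int batchSize) {
        int removed = 0;

        while (removed < entries) {
            cpLock.readLock().lock();

            try {
                int end = Math.min(entries, removed + batchSize);

                for (; removed < end; removed++) {
                    // No clean page left: a checkpoint is needed, but it cannot
                    // start while this thread still holds the read lock.
                    if (dirtyPages >= maxDirtyPages && !tryCheckpoint())
                        throw new IllegalStateException("No page for eviction, dirtyPages=" + dirtyPages);

                    dirtyPages++; // Simplification: each removal dirties one page.
                }
            }
            finally {
                cpLock.readLock().unlock();
            }

            // Read lock released: the checkpointer can now take the write lock.
            if (tryCheckpoint())
                dirtyPages = 0;
        }
    }

    /** Runs one clear and reports whether it finished without running out of pages. */
    static boolean clearSucceeds(int entries, int batchSize, int maxDirtyPages) {
        CheckpointLockSketch sketch = new CheckpointLockSketch(maxDirtyPages);

        try {
            sketch.clear(entries, batchSize);

            return true;
        }
        catch (IllegalStateException e) {
            return false;
        }
        finally {
            sketch.checkpointer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("lock held for the whole clear: " + clearSucceeds(100, 100, 10));
        System.out.println("lock released every 10 entries: " + clearSucceeds(100, 10, 10));
    }
}
```

In this model the first call fails because the write lock can never be taken while the single long read-locked section is running, which mirrors the dirtyPages=15881, cpPages=0 state in the exception above. The second call completes because the checkpointer gets a window between batches.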