[
https://issues.apache.org/jira/browse/IGNITE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472351#comment-16472351
]
ASF GitHub Bot commented on IGNITE-8320:
----------------------------------------
GitHub user Jokser opened a pull request:
https://github.com/apache/ignite/pull/3985
IGNITE-8320 Corrupted indexes fix
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gridgain/apache-ignite ignite-8320-reproduce
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/3985.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3985
----
commit ffb362929decc431b325bccc8c612a049f85063f
Author: Pavel Kovalenko <jokserfn@...>
Date: 2018-05-11T15:32:32Z
IGNITE-8320 Reproducer.
commit 37765277286a18198255bcbc2286073706ef6048
Author: Pavel Kovalenko <jokserfn@...>
Date: 2018-05-11T15:33:42Z
IGNITE-8320 Reproducer.
commit d1d265ae98ab79d6d80a667e4a844ea86f724e32
Author: Pavel Kovalenko <jokserfn@...>
Date: 2018-05-11T15:34:47Z
IGNITE-8320 Docs fix.
commit 234b1f8fcf24d849227e5e73e26fb81e0768cf21
Author: Pavel Kovalenko <jokserfn@...>
Date: 2018-05-11T15:36:01Z
IGNITE-8320 Docs fix.
commit 951d67e93677358470416a5faabe238b6e2bb21a
Author: Pavel Kovalenko <jokserfn@...>
Date: 2018-05-11T16:46:15Z
IGNITE-8320 Fix WIP.
commit a1acab629dfce81e904bdc6fac92458b60a7ac48
Author: Pavel Kovalenko <jokserfn@...>
Date: 2018-05-11T17:51:00Z
IGNITE-8320 Fix WIP.
----
> Page corruption during cache rebalancing.
> -----------------------------------------
>
> Key: IGNITE-8320
> URL: https://issues.apache.org/jira/browse/IGNITE-8320
> Project: Ignite
> Issue Type: Bug
> Components: persistence
> Affects Versions: 2.4
> Reporter: Vyacheslav Koptilin
> Assignee: Pavel Kovalenko
> Priority: Major
> Fix For: 2.6
>
>
> Cache rebalance may result in page memory corruption.
> {noformat}
> [2018-04-18T14:33:23,260][ERROR][sys-#54][GridCacheIoManager] Failed
> processing message [senderId=95f06c25-e6bb-48f7-a3e5-4c05fc1c49be,
> msg=GridDhtPartitionSupplyMessage [rebalanceId=37,
> topVer=AffinityTopologyVersion [topVer=53, minorTopVer=1], missed=null,
> clean=null, msgSize=525350, estimatedKeysCnt=1690216, size=2, parts=[1, 2],
> super=GridCacheGroupIdMessage [grpId=-1831596270]]]
> org.apache.ignite.IgniteException: Runtime failure on row: Row@33b6805c[
> key: xxxx [idHash=773709078, hash=-630455542, ...], val: xxxx
> [idHash=1309051286, hash=-1321165334, ver: GridCacheVersion
> [topVer=135435024, order=1523963943331, nodeOrder=4] ]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2102)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putx(BPlusTree.java:2049)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:247)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:454)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:653)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:1866)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:407)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishUpdate(IgniteCacheOffheapManagerImpl.java:1391)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1255)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1451)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:352)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:3527)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.initialValue(GridCacheMapEntry.java:2735)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.preloadEntry(GridDhtPartitionDemander.java:823)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.handleSupplyMessage(GridDhtPartitionDemander.java:704)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleSupplyMessage(GridDhtPreloader.java:347)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:365)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:355)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1054)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:99)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1603)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:126)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2751)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1515)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:126)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.managers.communication.GridIoManager$10.run(GridIoManager.java:1484)
> [ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_151]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_151]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
> Caused by: java.lang.IllegalStateException: Failed to get page IO instance
> (page content is corrupted)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forVersion(IOVersions.java:83)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forPage(IOVersions.java:95)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:148)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2RowFactory.getRow(H2RowFactory.java:61)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.createRowFromLink(H2Tree.java:149)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.io.H2LeafIO.getLookupRow(H2LeafIO.java:67)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.io.H2LeafIO.getLookupRow(H2LeafIO.java:33)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(H2Tree.java:167)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.getRow(H2Tree.java:46)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.getRow(BPlusTree.java:4436)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.compare(H2Tree.java:209)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.query.h2.database.H2Tree.compare(H2Tree.java:46)
> ~[ignite-indexing-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.compare(BPlusTree.java:4423)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findInsertionPoint(BPlusTree.java:4343)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$1500(BPlusTree.java:82)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Search.run0(BPlusTree.java:270)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:4770)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:4755)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.readPage(PageHandler.java:158)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.read(DataStructure.java:320)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2317)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2329)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2329)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2069)
> ~[ignite-core-2.4.4.b1.jar:2.4.4.b1]
> ... 30 more
> {noformat}
> Possible cause and reproducer:
> 1) Start partition eviction.
> 2) Force-kill the node (kill -9) after the partition file is truncated.
> 3) Start the node again and iterate over the index.
> The main problem is that file truncation is not synchronized with the
> checkpoint. As a result, after crash recovery the index tree can contain
> links to data pages that were already removed by the file truncation.
> One possible solution is to mark such partition files for deletion
> and truncate them safely on the next checkpoint.
> This mechanism can be resurrected from the ignite-2.0.2.b1 branch.
> See
> {noformat}
> org/gridgain/grid/internal/processors/cache/database/GridCacheDatabaseSharedManager.java:3059
> org.gridgain.grid.cache.db.GridCacheOffheapManager#destroyCacheDataStore
> {noformat}
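
The failure mode and the proposed fix described above can be illustrated with
a small toy model. This is only a sketch under simplifying assumptions, not
the code from the pull request: every class, field, and method name below is
hypothetical and none of them is an Ignite API. It assumes that partition
file truncation becomes durable immediately, while index changes become
durable only at a checkpoint.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: every name here is hypothetical, none is an Ignite class. */
public class DeferredTruncationSketch {
    // Durable state: survives kill -9.
    private final Map<Integer, Set<Long>> partFiles = new HashMap<>(); // partition -> data page ids
    private Set<Long> indexOnDisk = new HashSet<>();                   // index links as of last checkpoint

    // Volatile state: lost on crash.
    private Set<Long> indexInMemory = new HashSet<>();
    private final Set<Integer> markedForDeletion = new HashSet<>();

    private final boolean deferTruncation;

    public DeferredTruncationSketch(boolean deferTruncation) {
        this.deferTruncation = deferTruncation;
    }

    /** Stores a data page and indexes it (simplification: the page is durable at once). */
    public void put(int part, long pageId) {
        partFiles.computeIfAbsent(part, p -> new HashSet<>()).add(pageId);
        indexInMemory.add(pageId);
    }

    /** Evicts a partition: index links go away in memory; the file is truncated now or later. */
    public void evict(int part) {
        Set<Long> pages = partFiles.get(part);
        if (pages == null)
            return;
        indexInMemory.removeAll(pages);
        if (deferTruncation)
            markedForDeletion.add(part); // truncate on the next checkpoint
        else
            partFiles.remove(part);      // truncate immediately, not tied to a checkpoint
    }

    /** Persists the index first, then truncates the files that were marked for deletion. */
    public void checkpoint() {
        indexOnDisk = new HashSet<>(indexInMemory);
        partFiles.keySet().removeAll(markedForDeletion);
        markedForDeletion.clear();
    }

    /** Simulates kill -9 followed by a restart: only durable state is kept. */
    public void crashAndRecover() {
        indexInMemory = new HashSet<>(indexOnDisk);
        markedForDeletion.clear();
    }

    /** Returns index links that point to data pages which no longer exist. */
    public Set<Long> danglingLinks() {
        Set<Long> dangling = new TreeSet<>(indexInMemory);
        for (Set<Long> pages : partFiles.values())
            dangling.removeAll(pages);
        return dangling;
    }

    /** Runs the reproducer steps: evict, crash before the next checkpoint, recover, scan index. */
    public static Set<Long> simulate(boolean deferTruncation) {
        DeferredTruncationSketch store = new DeferredTruncationSketch(deferTruncation);
        store.put(1, 100L);
        store.put(1, 101L);
        store.put(2, 200L);
        store.checkpoint();
        store.evict(1);
        store.crashAndRecover();
        return store.danglingLinks();
    }

    public static void main(String[] args) {
        System.out.println("immediate truncation, dangling links: " + simulate(false));
        System.out.println("deferred truncation, dangling links:  " + simulate(true));
    }
}
```

In this model, immediate truncation leaves the recovered index pointing at
pages 100 and 101, which no longer exist. With deferred truncation the
recovered index and the partition file still agree, and the eviction simply
has to be retried after restart. A real implementation would also need the
deletion mark itself to survive a crash between persisting the index and
truncating the file; the sketch ignores that.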