[
https://issues.apache.org/jira/browse/IGNITE-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Daschinsky updated IGNITE-18935:
-------------------------------------
Description:
Step to reproduce
1. Reduce wal history size and wal segment size to 16MB and 8MB respectively,
set checkpoint frequency to 10000
2. Perform heavy load with a lot of entries with TTL 5000 and with eager ttl
enabled
3. Perform deactivation of cluster, stop grid and restart, provided that an
expiration process is active during the process of restart.
{code}
[15:11:58,022][SEVERE][ttl-cleanup-worker-#52%None%][] Critical system error
detected. Will be handled accordingly to configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is
corrupted [groupId=-459905951, pageIds=[], msg=Runtime failure on bounds:
[lower=null, upper=PendingRow []]]]]
class
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
B+Tree is corrupted [groupId=-459905951, pageIds=[], msg=Runtime failure on
bounds: [lower=null, upper=PendingRow []]]
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:6434)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1294)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1249)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1237)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1232)
at
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:3061)
at
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:3010)
at
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:1213)
at
org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:246)
at
org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.lambda$body$0(GridCacheSharedTtlCleanupManager.java:199)
at
java.util.concurrent.ConcurrentHashMap.computeIfPresent(ConcurrentHashMap.java:1769)
at
org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:198)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at java.lang.Thread.run(Thread.java:750)
Caused by:
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
java.lang.IllegalStateException: Item not found: 24
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1216)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1276)
... 12 more
Caused by:
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
java.lang.IllegalStateException: Item not found: 24
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.doInitFromLink(CacheDataRowAdapter.java:345)
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:165)
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:136)
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:123)
at
org.apache.ignite.internal.processors.cache.tree.PendingRow.initKey(PendingRow.java:73)
at
org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:128)
at
org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:32)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer0(BPlusTree.java:6115)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.fillFromBuffer(BPlusTree.java:5864)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.init(BPlusTree.java:5790)
at
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1205)
... 13 more
Caused by: java.lang.IllegalStateException: Item not found: 24
at
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:488)
at
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.getDataOffset(AbstractDataPageIO.java:596)
at
org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.readPayload(AbstractDataPageIO.java:638)
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.readIncomplete(CacheDataRowAdapter.java:380)
at
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.doInitFromLink(CacheDataRowAdapter.java:316)
... 23 more
{code}
was:
Step to reproduce
1. Reduce wal history size and wal segment size to 16MB and 8MB respectively,
set checkpoint frequency to 10000
2. Perform heavy load with a lot of entries with TTL 5000 and with eager ttl
enabled
3. Perform deactivation of cluster, stop grid and restart, provided that an
expiration process is active during the process of restart.
{code}
{code}
> Late stopping of TTL workers during deactivation leads to corrupted PDS
> -----------------------------------------------------------------------
>
> Key: IGNITE-18935
> URL: https://issues.apache.org/jira/browse/IGNITE-18935
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.14
> Reporter: Ivan Daschinsky
> Assignee: Ivan Daschinsky
> Priority: Major
> Labels: ise
> Fix For: 2.15
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Step to reproduce
> 1. Reduce wal history size and wal segment size to 16MB and 8MB respectively,
> set checkpoint frequency to 10000
> 2. Perform heavy load with a lot of entries with TTL 5000 and with eager ttl
> enabled
> 3. Perform deactivation of cluster, stop grid and restart, provided that an
> expiration process is active during the process of restart.
> {code}
> [15:11:58,022][SEVERE][ttl-cleanup-worker-#52%None%][] Critical system error
> detected. Will be handled accordingly to configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
> o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is
> corrupted [groupId=-459905951, pageIds=[], msg=Runtime failure on bounds:
> [lower=null, upper=PendingRow []]]]]
> class
> org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
> B+Tree is corrupted [groupId=-459905951, pageIds=[], msg=Runtime failure on
> bounds: [lower=null, upper=PendingRow []]]
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:6434)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1294)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1249)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1237)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1232)
> at
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:3061)
> at
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:3010)
> at
> org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:1213)
> at
> org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:246)
> at
> org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.lambda$body$0(GridCacheSharedTtlCleanupManager.java:199)
> at
> java.util.concurrent.ConcurrentHashMap.computeIfPresent(ConcurrentHashMap.java:1769)
> at
> org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:198)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
> at java.lang.Thread.run(Thread.java:750)
> Caused by:
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
>
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
> java.lang.IllegalStateException: Item not found: 24
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1216)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1276)
> ... 12 more
> Caused by:
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
> java.lang.IllegalStateException: Item not found: 24
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.doInitFromLink(CacheDataRowAdapter.java:345)
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:165)
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:136)
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:123)
> at
> org.apache.ignite.internal.processors.cache.tree.PendingRow.initKey(PendingRow.java:73)
> at
> org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:128)
> at
> org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:32)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer0(BPlusTree.java:6115)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.fillFromBuffer(BPlusTree.java:5864)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$AbstractForwardCursor.init(BPlusTree.java:5790)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1205)
> ... 13 more
> Caused by: java.lang.IllegalStateException: Item not found: 24
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.findIndirectItemIndex(AbstractDataPageIO.java:488)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.getDataOffset(AbstractDataPageIO.java:596)
> at
> org.apache.ignite.internal.processors.cache.persistence.tree.io.AbstractDataPageIO.readPayload(AbstractDataPageIO.java:638)
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.readIncomplete(CacheDataRowAdapter.java:380)
> at
> org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.doInitFromLink(CacheDataRowAdapter.java:316)
> ... 23 more
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)