Vladimir Steshin created IGNITE-19904:
-----------------------------------------
Summary: Assertion in defragmentation
Key: IGNITE-19904
URL: https://issues.apache.org/jira/browse/IGNITE-19904
Project: Ignite
Issue Type: Bug
Reporter: Vladimir Steshin
Defragmentation fails with:
{code:java}
java.lang.AssertionError: Invalid state. Type is 0! pageId = 0001000d00024cbf
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.copyPageForCheckpoint(PageMemoryImpl.java:1359) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1277) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
{code}
It is difficult to write a test, and I can't reproduce the failure on my own
machines :(. It appears intermittently on a server (4 core x 4 cpu) with 100G
of test cache data and over a million pages to checkpoint during
defragmentation. It occurs more often with pageSize 1024, which produces more
pages.
Based on my diagnostic build, I suspect that a fresh, empty page gets caught in
defragmentation. Here is a page dump taken with a test-extended PAGE_OVERHEAD
(=64), where the same error is raised slightly earlier, before
copyPageForCheckpoint():
{code:java}
org.apache.ignite.IgniteException: Wrong page type in checkpointWritePage1.
Page: Data region = 'defragPartitionsDataRegion'.
FullPageId [pageId=281878703760205, effectivePageId=403727049549,
grpId=-1368047378].
PageDump = page_id: 281878703760205, rel_id: 48603, cache_id: -1368047378,
pin: 0, lock: 65536, tmp_buf: 72057594037927935, test_val: 1. data_hex:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1240) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
{code}
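The dump above prints both pageId and effectivePageId, and the first stack
trace prints a pageId in hex. As a minimal sketch for reading these values, the
code below splits a page ID into fields. It assumes the usual Ignite page ID
layout (low 32 bits = page index, next 16 bits = partition ID, next 8 bits =
page flags); that layout is an assumption, the report does not state it. The
class and method names are illustrative only and are not Ignite APIs.

```java
/** Illustrative decoder for the page IDs quoted in this report. */
final class PageIdSketch {
    // Assumed field widths (not taken from the report).
    private static final int PAGE_IDX_BITS = 32;
    private static final int PART_ID_BITS = 16;
    private static final int FLAG_BITS = 8;

    /** Low 48 bits: partition ID plus page index, with the flags stripped. */
    static long effectivePageId(long pageId) {
        return pageId & ((1L << (PAGE_IDX_BITS + PART_ID_BITS)) - 1);
    }

    /** Low 32 bits of the page ID. */
    static long pageIndex(long pageId) {
        return pageId & 0xFFFFFFFFL;
    }

    /** 16 bits above the page index. */
    static int partitionId(long pageId) {
        return (int) ((pageId >>> PAGE_IDX_BITS) & ((1 << PART_ID_BITS) - 1));
    }

    /** 8 bits above the partition ID. */
    static int flags(long pageId) {
        return (int) ((pageId >>> (PAGE_IDX_BITS + PART_ID_BITS)) & ((1 << FLAG_BITS) - 1));
    }

    static String describe(long pageId) {
        return String.format("pageId=%016x effective=%d flags=%d part=%d idx=%d",
            pageId, effectivePageId(pageId), flags(pageId), partitionId(pageId), pageIndex(pageId));
    }

    public static void main(String[] args) {
        // pageId from the page dump (decimal) and from the AssertionError (hex).
        System.out.println(describe(281878703760205L));
        System.out.println(describe(0x0001000d00024cbfL));
    }
}
```

For the dumped page, the low 48 bits computed this way equal 403727049549,
which matches the effectivePageId printed in the dump, so the assumed layout is
at least consistent with the report for those bits.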
'test_val' is my diagnostic extension of the page prefix. A different number is
written to it at each place where tmp_buf is assigned. The value '1' comes from
PageHeader::initNew(long absPtr, long relative):
{code:java}
// PageHeader
public static void initNew(long absPtr, long relative) {
    relative(absPtr, relative);
    tempBufferPointer(absPtr, PageMemoryImpl.INVALID_REL_PTR);
    GridUnsafe.putLong(absPtr, PAGE_MARKER);
    GridUnsafe.putInt(absPtr + PAGE_PIN_CNT_OFFSET, 0);

    // For diagnostic purposes.
    PageUtils.setTestTmpValue(absPtr, 1);
}

// PageUtils (diagnostic helper)
public static void setTestTmpValue(long absPtr, int val) {
    // Stored in the last 4 bytes of the extended page overhead.
    putInt(absPtr, PAGE_OVERHEAD - 4, val);
}

// PageMemoryImpl (value extended for the diagnostic build)
public static final int PAGE_OVERHEAD = 64;
{code}