[
https://issues.apache.org/jira/browse/IGNITE-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Steshin reassigned IGNITE-19904:
-----------------------------------------
Assignee: Vladimir Steshin
> Assertion in defragmentation
> ----------------------------
>
> Key: IGNITE-19904
> URL: https://issues.apache.org/jira/browse/IGNITE-19904
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.12
> Reporter: Vladimir Steshin
> Assignee: Vladimir Steshin
> Priority: Major
> Labels: ise
> Attachments: default-config.xml, failure2.16_with_thread_dump.log,
> failure_with_root_npe_cause.log, ignite.log, jvm.opts
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Defragmentation fails with:
> {code:java}
> java.lang.AssertionError: Invalid state. Type is 0! pageId = 0001000d00024cbf
> at
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.copyPageForCheckpoint(PageMemoryImpl.java:1359)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> at
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1277)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> at
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> at
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> {code}
> It is difficult to write a test, and I can't reproduce this on my computers :(.
> It appears flakily on a server (4 core x 4 cpu) with 100G of test cache data
> and a million+ pages to checkpoint during defragmentation. It occurs more
> often with pageSize 1024 (which produces more pages).
> Based on my diagnostic build, I suppose that a fresh, empty page is caught
> in defragmentation. Here is a page dump with a test-extended PAGE_OVERHEAD
> (=64) and the same error raised a bit before copyPageForCheckpoint():
> {code:java}
> org.apache.ignite.IgniteException: Wrong page type in checkpointWritePage1.
> Page: Data region = 'defragPartitionsDataRegion'.
> FullPageId [pageId=281878703760205, effectivePageId=403727049549,
> grpId=-1368047378].
> PageDump = page_id: 281878703760205, rel_id: 48603, cache_id: -1368047378,
> pin: 0, lock: 65536, tmp_buf: 72057594037927935, test_val: 1. data_hex:
> 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
> at
> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1240)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> at
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> at
> org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150)
> ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
> {code}
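The two page IDs in the traces above can be split into their fields to see which partition and page index are involved. The sketch below is a minimal, standalone decoder and not Ignite code: the class name `PageIdDecode` is hypothetical, and the bit layout (8-bit offset, 8-bit flag, 16-bit partition ID, 32-bit page index) is an assumption about how Ignite 2.x packs page IDs. The one relation the dump itself confirms is that effectivePageId is pageId with the upper 16 bits cleared.

```java
public final class PageIdDecode {
    // Assumed field widths (not stated in the report).
    static final int PAGE_IDX_SIZE = 32;
    static final int PART_ID_SIZE = 16;

    /** Lower 32 bits: index of the page within its partition. */
    static long pageIndex(long pageId) {
        return pageId & 0xFFFFFFFFL;
    }

    /** Next 16 bits: partition ID. */
    static int partId(long pageId) {
        return (int)((pageId >>> PAGE_IDX_SIZE) & 0xFFFF);
    }

    /** Next 8 bits: page flag. */
    static int flag(long pageId) {
        return (int)((pageId >>> (PAGE_IDX_SIZE + PART_ID_SIZE)) & 0xFF);
    }

    /** Page ID with the upper 16 bits (offset and flag) cleared. */
    static long effectivePageId(long pageId) {
        return pageId & ((1L << (PAGE_IDX_SIZE + PART_ID_SIZE)) - 1);
    }

    static String describe(long pageId) {
        return String.format("pageId=%016x flag=%d partId=%d pageIdx=%d effectivePageId=%d",
            pageId, flag(pageId), partId(pageId), pageIndex(pageId), effectivePageId(pageId));
    }

    public static void main(String[] args) {
        // Page ID from the AssertionError message.
        System.out.println(describe(0x0001000d00024cbfL));
        // Page ID from the diagnostic page dump.
        System.out.println(describe(281878703760205L));
    }
}
```

Under this assumed layout both IDs carry flag 1 and differ in partition and page index, so they refer to two different pages from separate failures.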
> 'test_val' is my diagnostic page prefix extension. A different number is
> written to it at each place where tmp_buf is assigned (by
> `PageHeader::tempBufferPointer()`). The value '1' comes from
> PageHeader::initNew(long absPtr, long relative):
> {code:java}
> // PageHeader
> public static void initNew(long absPtr, long relative) {
>     relative(absPtr, relative);
>     tempBufferPointer(absPtr, PageMemoryImpl.INVALID_REL_PTR);
>     GridUnsafe.putLong(absPtr, PAGE_MARKER);
>     GridUnsafe.putInt(absPtr + PAGE_PIN_CNT_OFFSET, 0);
>     // For diagnostic purposes.
>     PageUtils.setTestTmpValue(absPtr, 1);
> }
>
> // PageUtils
> public static void setTestTmpValue(long absPtr, int val) {
>     // Stored in the last 4 bytes of the page overhead.
>     putInt(absPtr, PAGE_OVERHEAD - 4, val);
> }
>
> // PageMemoryImpl (diagnostic build only)
> public static final int PAGE_OVERHEAD = 64;
> {code}
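The dump fields can be cross-checked against what initNew() writes. The sketch below models the diagnostic header in a plain `ByteBuffer` rather than off-heap memory. The class `PageHeaderSketch`, the `TMP_BUF_OFFSET` value, and the helper `looksFreshlyInitialized` are hypothetical; `INVALID_REL_PTR` is assumed to be 2^56 - 1, which is the tmp_buf value printed in the dump (72057594037927935). Only PAGE_OVERHEAD = 64 and the `PAGE_OVERHEAD - 4` position of the test value come from the report.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class PageHeaderSketch {
    /** Diagnostic overhead size, taken from the report. */
    static final int PAGE_OVERHEAD = 64;

    /** Assumption: the invalid relative pointer is 2^56 - 1, as printed for tmp_buf. */
    static final long INVALID_REL_PTR = 0x00FFFFFFFFFFFFFFL;

    /** Hypothetical offset of the temp buffer pointer inside the header. */
    static final int TMP_BUF_OFFSET = 40;

    /** The test value occupies the last 4 bytes of the overhead. */
    static final int TEST_VAL_OFFSET = PAGE_OVERHEAD - 4;

    /** Mirrors the two writes of initNew() that are visible in the dump. */
    static void initNew(ByteBuffer hdr) {
        hdr.putLong(TMP_BUF_OFFSET, INVALID_REL_PTR);
        hdr.putInt(TEST_VAL_OFFSET, 1);
    }

    /** True if the header still holds exactly what initNew() wrote. */
    static boolean looksFreshlyInitialized(ByteBuffer hdr) {
        return hdr.getLong(TMP_BUF_OFFSET) == INVALID_REL_PTR
            && hdr.getInt(TEST_VAL_OFFSET) == 1;
    }

    static ByteBuffer newHeader() {
        return ByteBuffer.allocate(PAGE_OVERHEAD).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        ByteBuffer hdr = newHeader();
        initNew(hdr);
        System.out.println("test value offset: " + TEST_VAL_OFFSET);
        System.out.println("fresh: " + looksFreshlyInitialized(hdr));
    }
}
```

If these assumptions hold, tmp_buf = 72057594037927935 together with test_val = 1 and an all-zero data_hex would fit the hypothesis that the page was only initialized by initNew() and never filled before the checkpoint picked it up.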
--
This message was sent by Atlassian Jira
(v8.20.10#820010)