[
https://issues.apache.org/jira/browse/IGNITE-11367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Rakov updated IGNITE-11367:
--------------------------------
Description:
I've discovered some issues in PageMemoryTracker while debugging IGNITE-10873:
1) Mock page memory doesn't implement PageMemoryImpl#pageBuffer. As a result,
some delta records (those that apply changes via the buffer) can't be applied. Example:
{code:java}
Caused by: java.lang.NullPointerException
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.getLastSnapshotTag0(TrackingPageIO.java:235)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.getLastSnapshotTag(TrackingPageIO.java:227)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.validateSnapshotTag(TrackingPageIO.java:135)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.markChanged(TrackingPageIO.java:93)
    at org.apache.ignite.internal.pagemem.wal.record.delta.TrackingPageDeltaRecord.applyDelta(TrackingPageDeltaRecord.java:75)
    at org.apache.ignite.internal.processors.cache.persistence.wal.memtracker.PageMemoryTracker.applyWalRecord(PageMemoryTracker.java:447)
    at org.apache.ignite.internal.processors.cache.persistence.wal.memtracker.PageMemoryTracker.access$000(PageMemoryTracker.java:81)
    at org.apache.ignite.internal.processors.cache.persistence.wal.memtracker.PageMemoryTracker$1.log(PageMemoryTracker.java:159)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.GridCacheSnapshotManager.onChangeTrackerPage(GridCacheSnapshotManager.java:2801)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$5.applyx(GridCacheDatabaseSharedManager.java:1084)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$5.applyx(GridCacheDatabaseSharedManager.java:1077)
    at org.apache.ignite.internal.util.lang.GridInClosure3X.apply(GridInClosure3X.java:34)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1572)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:495)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:487)
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:394)
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:369)
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:285)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$11500(BPlusTree.java:92)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.tryReplace(BPlusTree.java:3638)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2565)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2293)
    ... 33 more
{code}
2) During the binary recovery phase, page memory is changed by applying delta
records and page snapshots (GridCacheDatabaseSharedManager#applyPageDelta,
#applyPageSnapshot). Such changes are not replicated to mock page memory, because
no delta records are logged to WAL for them (we don't log physical records on
binary recovery - we just apply already logged ones). This leads to false
positive broken consistency reports. To prevent this, we should apply these
changes to both regular page memory and mock page memory in PageMemoryTracker.
3) PagesList.java:918:
{code:java}
// Here we should never write full page, because it is known to be new.
if (needWalDeltaRecord(nextId, nextPage, FALSE))
    wal.log(new PagesListInitNewPageRecord(
        grpId,
        nextId,
        io.getType(),
        io.getVersion(),
        nextId,
        prevId,
        0L
    ));
{code}
Sometimes we log InitNewPageRecord without logging a page snapshot. If the
page was recycled earlier, stale content in its unused part may trigger false
positive reports. We can fix this by filling the page with zeros during the
PagesList#setupNextPage call.
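The false positive from item 3 can be illustrated with a minimal, self-contained sketch. It does not use Ignite classes: the page size, the header layout and the names `initNewPage`, `zeroAndInit` and `sameContent` are all hypothetical stand-ins, chosen only to show why stale bytes in a recycled page make a byte-for-byte comparison fail and how zero-filling avoids it.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class RecycledPageSketch {
    /** Hypothetical tiny page size, for illustration only. */
    static final int PAGE_SIZE = 64;

    /** Stand-in for an "init new page" change: writes only the header field. */
    static void initNewPage(ByteBuffer page, long pageId) {
        page.putLong(0, pageId);
    }

    /** Proposed behaviour: clear the whole page before initializing it. */
    static void zeroAndInit(ByteBuffer page, long pageId) {
        Arrays.fill(page.array(), (byte) 0);
        initNewPage(page, pageId);
    }

    /** Byte-for-byte comparison, as a consistency check would do. */
    static boolean sameContent(ByteBuffer a, ByteBuffer b) {
        return Arrays.equals(a.array(), b.array());
    }

    public static void main(String[] args) {
        // "Real" page that was recycled: it still holds stale bytes.
        ByteBuffer real = ByteBuffer.allocate(PAGE_SIZE);
        Arrays.fill(real.array(), (byte) 0x5A);

        // Mock page that only ever sees the init record: starts zero-filled.
        ByteBuffer mock = ByteBuffer.allocate(PAGE_SIZE);

        initNewPage(real, 42L);
        initNewPage(mock, 42L);
        System.out.println("without zeroing, pages match: " + sameContent(real, mock));

        zeroAndInit(real, 42L);
        System.out.println("with zeroing, pages match: " + sameContent(real, mock));
    }
}
```

In the first comparison only the header bytes agree, so the check reports a mismatch even though no logical change was lost; after zero-filling, both pages are identical.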
The modified test from the attached patch should pass once all of the issues
above are fixed. Please ensure that both Mockito and ignite-indexing are on the
classpath when you run the test.
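For item 2, the idea of mirroring recovery-time changes into the mock page memory might look roughly like the sketch below. This is not the PageMemoryTracker API: `PageStore`, `Delta`, `applyOnRecovery` and `consistent` are hypothetical names, and the pages are plain byte arrays. It only shows that a change applied to one side alone is reported as an inconsistency, while applying it to both sides is not.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MirroredRecoverySketch {
    /** Hypothetical page store: page id -> page bytes. */
    static final class PageStore {
        final Map<Long, byte[]> pages = new HashMap<>();
        final int pageSize;

        PageStore(int pageSize) {
            this.pageSize = pageSize;
        }

        /** Returns the page, allocating a zero-filled one on first access. */
        byte[] page(long pageId) {
            return pages.computeIfAbsent(pageId, id -> new byte[pageSize]);
        }
    }

    /** Hypothetical stand-in for a delta record or page snapshot. */
    interface Delta {
        void apply(byte[] page);
    }

    /** Applies a recovery-time change to real memory and, if present, to the mock. */
    static void applyOnRecovery(PageStore real, PageStore mock, long pageId, Delta delta) {
        delta.apply(real.page(pageId));

        if (mock != null)
            delta.apply(mock.page(pageId));
    }

    /** Compares every page of the real store with its counterpart in the mock. */
    static boolean consistent(PageStore real, PageStore mock) {
        for (Map.Entry<Long, byte[]> e : real.pages.entrySet()) {
            byte[] other = mock.pages.getOrDefault(e.getKey(), new byte[mock.pageSize]);

            if (!Arrays.equals(e.getValue(), other))
                return false;
        }

        return true;
    }

    public static void main(String[] args) {
        Delta setByte = page -> page[3] = 7;

        // Change applied to real memory only: the mock falls behind.
        PageStore real1 = new PageStore(16);
        PageStore mock1 = new PageStore(16);
        applyOnRecovery(real1, null, 1L, setByte);
        System.out.println("real only, consistent: " + consistent(real1, mock1));

        // Change mirrored into the mock: both sides agree.
        PageStore real2 = new PageStore(16);
        PageStore mock2 = new PageStore(16);
        applyOnRecovery(real2, mock2, 1L, setByte);
        System.out.println("mirrored, consistent: " + consistent(real2, mock2));
    }
}
```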
was:
I've discovered some issues in PageMemoryTracker while debugging IGNITE-10873:
1) Mock page memory doesn't implement PageMemoryImpl#pageBuffer. As a result,
some delta records (which applies changes via buffer) can't be applied. Example:
{code:java}
Caused by: java.lang.NullPointerException
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.getLastSnapshotTag0(TrackingPageIO.java:235)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.getLastSnapshotTag(TrackingPageIO.java:227)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.validateSnapshotTag(TrackingPageIO.java:135)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.TrackingPageIO.markChanged(TrackingPageIO.java:93)
    at org.apache.ignite.internal.pagemem.wal.record.delta.TrackingPageDeltaRecord.applyDelta(TrackingPageDeltaRecord.java:75)
    at org.apache.ignite.internal.processors.cache.persistence.wal.memtracker.PageMemoryTracker.applyWalRecord(PageMemoryTracker.java:447)
    at org.apache.ignite.internal.processors.cache.persistence.wal.memtracker.PageMemoryTracker.access$000(PageMemoryTracker.java:81)
    at org.apache.ignite.internal.processors.cache.persistence.wal.memtracker.PageMemoryTracker$1.log(PageMemoryTracker.java:159)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.GridCacheSnapshotManager.onChangeTrackerPage(GridCacheSnapshotManager.java:2801)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$5.applyx(GridCacheDatabaseSharedManager.java:1084)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$5.applyx(GridCacheDatabaseSharedManager.java:1077)
    at org.apache.ignite.internal.util.lang.GridInClosure3X.apply(GridInClosure3X.java:34)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1572)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:495)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:487)
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:394)
    at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:369)
    at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:285)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$11500(BPlusTree.java:92)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.tryReplace(BPlusTree.java:3638)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2565)
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doPut(BPlusTree.java:2293)
    ... 33 more
{code}
2) During binary recovery phase, page memory is changed with applying delta
records and page snapshots (GridCacheDatabaseSharedManager#applyPageDelta,
#applyPageSnapshot). Such changes are not replicated by logging delta records
to WAL (we don't log physical records on binary recovery - we just apply
already logged ones). This leads to false positive broken consistency reports.
To prevent this, we should apply changes to both regular page memory and mock
page memory in PageMemoryTracker.
3) PagesList.java:918:
{code:java}
// Here we should never write full page, because it is known to be new.
if (needWalDeltaRecord(nextId, nextPage, FALSE))
    wal.log(new PagesListInitNewPageRecord(
        grpId,
        nextId,
        io.getType(),
        io.getVersion(),
        nextId,
        prevId,
        0L
    ));
{code}
Sometimes we may log InitNewPageRecord without logging page snapshot. In case
page was recycled before, its content in unused page part may trigger false
positive reports. We may fix this by filling page content with zeros during
PagesList#setupNextPage call.
Modified test from attached patch should pass if all mentioned issues will be
fixed. Please ensure that both Mockito and ignite-indexing are in classpath
when you run the test.
> Fix several issues in PageMemoryTracker
> ---------------------------------------
>
> Key: IGNITE-11367
> URL: https://issues.apache.org/jira/browse/IGNITE-11367
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Rakov
> Priority: Major
> Fix For: 2.8