[ 
https://issues.apache.org/jira/browse/IGNITE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725746#comment-16725746
 ] 

Alexey Goncharuk commented on IGNITE-9303:
------------------------------------------

[~ilantukh], a few comments:
1) There is a debug statement to be removed in 
{{GridCacheDatabaseSharedManager}}
2) I do not like the new assert in {{performBinaryMemoryRestore}}. First, let's 
move the read of the WAL pointer to a separate method (note that try-finally is 
missing for the iterator create-close). Second, if we are not able to read the 
{{CheckpointRecord}}, we need to collect as much information as possible 
(checkpoint ID, start pointer, the record we actually read, etc.) and raise a 
critical system error. Given that PDS cleanup is the only workaround for such an 
error, this workaround should be suggested to the user in the exception 
message. Feel free to add any other suggestions.
3) Why do we still need the {{restore=true}} flag during recovery? I think in 
the current recovery scheme the flag is not needed, and it will only hide other 
bugs if they are present or added later.
4) In {{IgniteSequentialNodeCrashRecoveryTest}}, let's add a check that the set 
of extra dirty pages is not empty, and also check that the restarted grid is 
operable (cache gets and puts work).
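
A minimal sketch of the shape suggested in point 2, under stated assumptions: {{Rec}}, {{RecIterator}} and the {{replay}} function are hypothetical stand-ins rather than Ignite classes, the start pointer is simplified to a {{long}}, and {{IllegalStateException}} stands in for whatever critical-failure mechanism the node actually uses. It only illustrates closing the iterator on every path and building a detailed message that suggests the PDS cleanup workaround.

```java
import java.util.Iterator;
import java.util.List;
import java.util.UUID;
import java.util.function.Function;

public class CheckpointReadSketch {
    /** Hypothetical stand-in for a WAL record; not an Ignite class. */
    static class Rec {
        final String type;
        final UUID cpId;

        Rec(String type, UUID cpId) {
            this.type = type;
            this.cpId = cpId;
        }

        @Override public String toString() {
            return type + "[cpId=" + cpId + "]";
        }
    }

    /** Hypothetical closeable iterator over WAL records. */
    static class RecIterator implements Iterator<Rec>, AutoCloseable {
        private final Iterator<Rec> delegate;
        boolean closed;

        RecIterator(List<Rec> recs) {
            delegate = recs.iterator();
        }

        @Override public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override public Rec next() {
            return delegate.next();
        }

        @Override public void close() {
            closed = true;
        }
    }

    /**
     * Reads the first record at the start pointer and expects it to be the
     * checkpoint record with the given ID. The iterator is closed on every
     * path by try-with-resources (equivalent to try-finally).
     */
    static Rec readCheckpointRecord(UUID cpId, long startPtr, Function<Long, RecIterator> replay) {
        Rec read = null;

        try (RecIterator it = replay.apply(startPtr)) {
            if (it.hasNext()) {
                read = it.next();

                if ("CHECKPOINT".equals(read.type) && cpId.equals(read.cpId))
                    return read;
            }
        }

        // Collect everything known about the failure and suggest the workaround.
        throw new IllegalStateException("Failed to read checkpoint record from WAL " +
            "[cpId=" + cpId + ", startPtr=" + startPtr + ", actualRecord=" + read + "]. " +
            "Persistent data is likely corrupted; cleaning the persistence (PDS) directory " +
            "and restarting the node is the only known workaround.");
    }
}
```

In the real code the exception type, the pointer type and the way the failure is reported would follow the existing conventions of {{GridCacheDatabaseSharedManager}}.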

> PageSnapshot can contain wrong pageId tag when not dirty page is recycling
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-9303
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9303
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.6
>            Reporter: Aleksey Plekhanov
>            Assignee: Ilya Lantukh
>            Priority: Major
>             Fix For: 2.8
>
>
> When a page is recycled (for example in {{BPlusTree.Remove#freePage()}} -> 
> {{DataStructure#recyclePage()}}), the tag of its {{pageId}} is modified, but the 
> original {{pageId}} is passed to the {{writeUnlock()}} method, and that original 
> {{pageId}} is stored in the PageSnapshot WAL record.
> This bug may lead to errors when applying the WAL during crash recovery.
> Reproducer (the ignite-indexing module must be on the classpath):
> {code:java}
> public class WalFailReproducer extends AbstractWalDeltaConsistencyTest {
>     @Override protected boolean checkPagesOnCheckpoint() {
>         return true;
>     }
>     public final void testPutRemoveCacheDestroy() throws Exception {
>         CacheConfiguration<Integer, Integer> ccfg = new CacheConfiguration<>("cache0");
>         ccfg.setIndexedTypes(Integer.class, Integer.class);
>         IgniteEx ignite = startGrid(0);
>         ignite.cluster().active(true);
>         IgniteCache<Integer, Integer> cache0 = ignite.getOrCreateCache(ccfg);
>         for (int i = 0; i < 5_000; i++)
>             cache0.put(i, i);
>         forceCheckpoint();
>         for (int i = 1_000; i < 4_000; i++)
>             cache0.remove(i);
>         forceCheckpoint();
>         stopAllGrids();
>     }
> }
> {code}



