[jira] [Commented] (IGNITE-19043) ItRaftCommandLeftInLogUntilRestartTest: PageMemoryHashIndexStorage lacks data after cluster restart

Denis Chudov (Jira) Wed, 29 Mar 2023 06:37:25 -0700


    [ 
https://issues.apache.org/jira/browse/IGNITE-19043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706389#comment-17706389
 ]


Denis Chudov commented on IGNITE-19043:
---------------------------------------

[~alapin]  LGTM

> ItRaftCommandLeftInLogUntilRestartTest: PageMemoryHashIndexStorage lacks data 
> after cluster restart
> ---------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-19043
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19043
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexander Lapin
>            Assignee: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> After being enabled, ItRaftCommandLeftInLogUntilRestartTest failed with
> {code:java}
> org.opentest4j.AssertionFailedError: expected: not <null> {code}
> while trying to retrieve previously added data after cluster restart. It 
> seems that this happens because there is no corresponding data in the PK index.
> It is worth mentioning that the test was originally about raft log 
> re-application on node restart. So I commented out all 
> partitionUpdateInhibitor usages to check whether the failure is related to 
> re-application or to the indexes themselves; the problem is reproducible 
> without the re-application logic.
> It might be related to the migration of defaults from RocksDB to page 
> memory. Further investigation is required.
> h3. Implementation notes
> After investigation, it turned out that the failure occurs because raft log 
> re-application is skipped within PartitionListener#handleUpdateCommand and 
> PartitionListener#handleUpdateAllCommand due to the following logic:
> {code:java}
>         TxMeta txMeta = txStateStorage.get(cmd.txId());
>         if (txMeta != null && (txMeta.txState() == COMMITED || txMeta.txState() == ABORTED)) {
>             storage.runConsistently(() -> {
>                 storage.lastApplied(commandIndex, commandTerm);
>                 return null;
>             });
>         }
> {code}
> The full scenario is as follows:
> 1. tx1.put populates the raft log and mvPartitionStorage with the 
> corresponding log record and data.
> 2. tx1.commit also populates the raft log with a raft record and finishes the 
> transaction within txnStateStorage, along with cleanup in mvPartitionStorage.
> 3. The RocksDB-based txnStateStorage flushes its state to disk, while the 
> page-memory-based storage does not.
> 4. After node restart, raft replays the log, both the put and commit 
> commands; however, on the commit partition we skip the put re-application 
> because of the aforementioned check:
> {code:java}
> if (txMeta != null && (txMeta.txState() == COMMITED || txMeta.txState() == ABORTED))
> {code}
> To be precise, the transaction is considered committed because 
> txnStateStorage flushes its state before stop.
>  
> So, to fix the issue it is enough to simply remove the skip logic.
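
The scenario quoted above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the actual Ignite code: the class ReplaySketch, its Command type, the replay method and the skipFinishedTx flag are all hypothetical names made up for illustration. It assumes the partition data was not flushed and is therefore empty after restart, while the tx state was flushed and still reports tx1 as COMMITED. A plain map stands in for both storages, and a finish command is modelled as a commit only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplaySketch {
    // Spelling follows the identifier used in the quoted snippet.
    enum TxState { PENDING, COMMITED, ABORTED }

    /** Hypothetical raft log entry: an update when key is set, a tx finish otherwise. */
    static final class Command {
        final String txId;
        final String key;
        final String value;

        Command(String txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }

        boolean isUpdate() {
            return key != null;
        }
    }

    /** Log produced by tx1.put followed by tx1.commit. */
    static List<Command> sampleLog() {
        return Arrays.asList(new Command("tx1", "k", "v"), new Command("tx1", null, null));
    }

    /**
     * Replays the log into an empty partition storage (assumption: its data was
     * not flushed before the restart). txStates models the tx state storage that
     * did survive the restart.
     */
    static Map<String, String> replay(List<Command> log, Map<String, TxState> txStates,
            boolean skipFinishedTx) {
        Map<String, String> partition = new HashMap<>();
        for (Command cmd : log) {
            if (cmd.isUpdate()) {
                TxState state = txStates.get(cmd.txId);
                boolean finished = state == TxState.COMMITED || state == TxState.ABORTED;
                if (skipFinishedTx && finished) {
                    // Old behaviour: the command is acknowledged but the data is not rewritten.
                    continue;
                }
                partition.put(cmd.key, cmd.value);
            } else {
                // Simplification: every finish command is treated as a commit.
                txStates.put(cmd.txId, TxState.COMMITED);
            }
        }
        return partition;
    }

    public static void main(String[] args) {
        Map<String, TxState> flushed = new HashMap<>();
        flushed.put("tx1", TxState.COMMITED);

        // With the skip logic the put is never re-applied, so the key is missing.
        System.out.println(replay(sampleLog(), new HashMap<>(flushed), true).get("k"));
        // Without the skip logic the put is re-applied and the value is restored.
        System.out.println(replay(sampleLog(), new HashMap<>(flushed), false).get("k"));
    }
}
```

In this model the first replay leaves "k" absent, which corresponds to the "expected: not <null>" failure, and the second replay restores it, which corresponds to the proposed fix of removing the skip.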



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
