[ https://issues.apache.org/jira/browse/IGNITE-23907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Puchkovskiy updated IGNITE-23907:
---------------------------------------
    Description: 
2024-12-06 18:00:02:905 +0000 [ERROR][%node3%JRaft-FSMCaller-Disruptormetastorage_stripe_0-0][MetaStorageWriteHandler] Unknown error while processing command [commandIndex=59060, commandTerm=1, command=EvictIdempotentCommandsCacheCommandImpl [evictionTimestamp=HybridTimestamp [physical=2024-12-06 17:18:51:759 +0000, logical=55036, composite=113607018529412860], initiatorTime=HybridTimestamp [physical=2024-12-06 17:18:51:760 +0000, logical=1, composite=113607018529423361], safeTime=HybridTimestamp [physical=2024-12-06 17:18:51:760 +0000, logical=3, composite=113607018529423363]]]
org.apache.ignite.internal.metastorage.exceptions.MetaStorageException: IGN-CMN-65535 TraceId:47b327f8-274f-4716-8ee8-97e9c51cb4e9 Metastorage revision checksum differs from a checksum for the same revision saved earlier. This probably means that the Metastorage has diverged. [revision=5915, existingChecksum=-7812799892490681793, newChecksum=866784937340784120]
    at org.apache.ignite.internal.metastorage.server.persistence.RocksDbKeyValueStorage.validateNoChecksumConflict(RocksDbKeyValueStorage.java:724)
    at org.apache.ignite.internal.metastorage.server.persistence.RocksDbKeyValueStorage.completeAndWriteBatch(RocksDbKeyValueStorage.java:689)
    at org.apache.ignite.internal.metastorage.server.persistence.RocksDbKeyValueStorage.removeAll(RocksDbKeyValueStorage.java:806)
    at org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.evictIdempotentCommandsCache(MetaStorageWriteHandler.java:409)
    at org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteWithTime(MetaStorageWriteHandler.java:229)
    at org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleNonCachedWriteCommand(MetaStorageWriteHandler.java:160)
    at org.apache.ignite.internal.metastorage.server.raft.MetaStorageWriteHandler.handleWriteCommand(MetaStorageWriteHandler.java:130)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at org.apache.ignite.internal.metastorage.server.raft.MetaStorageListener.onWrite(MetaStorageListener.java:216)
    at org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:766)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:571)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:539)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:458)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:131)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:125)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:326)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:283)
    at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:122)
    at java.base/java.lang.Thread.run(Thread.java:840)

 

The problem appears to be that eviction from the idempotent commands cache
issues removeAll() calls in a non-repeatable way: two executions of the same
EvictIdempotentCommandsCacheCommand against the same underlying storage may
compute different checksums, because the checksums are calculated over
different orderings of the keys to be removed.
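
To illustrate the suspected mechanism, here is a minimal sketch, not Ignite
code. It assumes a checksum that is fed the removed keys one by one in
iteration order; CRC32 is only a stand-in for whatever function the
Metastorage actually uses. The class name ChecksumOrderDemo and the helpers
key(), checksum() and checksumInCanonicalOrder() are hypothetical. Sorting
the keys before hashing is one possible way to make the result repeatable;
it is not necessarily the fix that will be applied for this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

/** Hypothetical illustration of an order-sensitive checksum; not Ignite code. */
final class ChecksumOrderDemo {
    /** Converts a string key to bytes (helper for the demo only). */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Feeds the keys into CRC32 in iteration order, so the order affects the result. */
    static long checksum(List<byte[]> keys) {
        CRC32 crc = new CRC32();
        for (byte[] k : keys) {
            crc.update(k, 0, k.length);
        }
        return crc.getValue();
    }

    /** Sorts a copy of the keys (unsigned lexicographic order) before hashing. */
    static long checksumInCanonicalOrder(List<byte[]> keys) {
        List<byte[]> sorted = new ArrayList<>(keys);
        sorted.sort((a, b) -> Arrays.compareUnsigned(a, b));
        return checksum(sorted);
    }

    public static void main(String[] args) {
        // Same set of keys, collected in a different order on the second run.
        List<byte[]> firstRun = List.of(key("a"), key("b"));
        List<byte[]> replay = List.of(key("b"), key("a"));

        // Order-sensitive: the two runs disagree.
        System.out.println(checksum(firstRun) == checksum(replay)); // prints "false"

        // Canonical order: the two runs agree.
        System.out.println(
                checksumInCanonicalOrder(firstRun) == checksumInCanonicalOrder(replay)); // prints "true"
    }
}
```

In this sketch the first comparison corresponds to the mismatch reported in
the exception above (existingChecksum vs newChecksum for the same revision),
and the second shows that imposing a deterministic key order removes the
dependence on collection order.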

> Node cannot be restarted due to Metastorage checksum mismatch
> -------------------------------------------------------------
>
>                 Key: IGNITE-23907
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23907
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
