[ 
https://issues.apache.org/jira/browse/IGNITE-23223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882724#comment-17882724
 ] 

Roman Puchkovskiy commented on IGNITE-23223:
--------------------------------------------

Thanks!

> An NPE may happen in CMG state machine during init
> --------------------------------------------------
>
>                 Key: IGNITE-23223
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23223
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Blocker
>              Labels: ignite-3
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> It looks like this:
> 2024-09-17 13:26:14:125 +0000 [ERROR][%poc-tester-SERVER-192.168.208.65-id-0%JRaft-FSMCaller-Disruptor_stripe_0-0][FailureManager] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=CRITICAL_ERROR]
> java.lang.AssertionError: clusterId cannot be null when commands are already being executed by the CMG state machine
>     at org.apache.ignite.internal.cluster.management.topology.LogicalTopologyImpl.requiredClusterId(LogicalTopologyImpl.java:133)
>     at org.apache.ignite.internal.cluster.management.topology.LogicalTopologyImpl.putNode(LogicalTopologyImpl.java:114)
>     at org.apache.ignite.internal.cluster.management.raft.CmgRaftGroupListener.completeValidation(CmgRaftGroupListener.java:257)
>     at org.apache.ignite.internal.cluster.management.raft.CmgRaftGroupListener.onWriteBusy(CmgRaftGroupListener.java:173)
>     at org.apache.ignite.internal.cluster.management.raft.CmgRaftGroupListener.onWrite(CmgRaftGroupListener.java:148)
>     at org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:731)
>     at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:571)
>     at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:539)
>     at org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:458)
>     at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:131)
>     at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:125)
>     at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:326)
>     at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:283)
>     at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:167)
>     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:122)
>     at java.base/java.lang.Thread.run(Thread.java:829)
>  
> This is caused by a race: a node might start executing commands modifying 
> logical topology (all of them require a clusterId) before the clusterId gets 
> set.
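
For illustration only, the race described above can be reduced to a small sketch. All names here (ClusterIdRaceSketch, UnguardedTopology, GuardedTopology) are hypothetical and are not taken from the Ignite code base; the guard shown is just one possible way to order the two events, not necessarily the fix applied for IGNITE-23223.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.UUID;

// Hypothetical sketch; none of these classes exist in Apache Ignite.
class ClusterIdRaceSketch {
    /** Mirrors the failure mode: a topology command arriving before the clusterId fails. */
    static class UnguardedTopology {
        private UUID clusterId;
        private final List<String> nodes = new ArrayList<>();

        synchronized void setClusterId(UUID id) {
            clusterId = id;
        }

        synchronized void putNode(String name) {
            if (clusterId == null) {
                throw new IllegalStateException("clusterId cannot be null when commands are already being executed");
            }
            nodes.add(name);
        }
    }

    /** One possible guard: commands that arrive too early are queued and replayed in order. */
    static class GuardedTopology {
        private UUID clusterId;
        private final List<String> nodes = new ArrayList<>();
        private final Queue<String> pending = new ArrayDeque<>();

        synchronized void setClusterId(UUID id) {
            clusterId = id;
            // Replay deferred commands in their original arrival order.
            String name;
            while ((name = pending.poll()) != null) {
                nodes.add(name);
            }
        }

        synchronized void putNode(String name) {
            if (clusterId == null) {
                pending.add(name); // too early: defer instead of failing
            } else {
                nodes.add(name);
            }
        }

        synchronized List<String> nodes() {
            return List.copyOf(nodes);
        }
    }

    public static void main(String[] args) {
        GuardedTopology topology = new GuardedTopology();
        topology.putNode("node-1");               // arrives before the clusterId is known
        topology.setClusterId(UUID.randomUUID()); // deferred command is applied here
        topology.putNode("node-2");
        System.out.println(topology.nodes());
    }
}
```

In a replicated state machine, deferring commands has its own consequences (for example, for snapshots and determinism), so an alternative is to make sure the clusterId is set before any such command can be applied at all.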



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
