[ 
https://issues.apache.org/jira/browse/IGNITE-26680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-26680:
---------------------------------------
    Description: 
The following scenario is possible:
 # A node is started when Metastorage already has some revisions (for 
instance, the node is restarted)
 # It creates a Metastorage Raft node
 # The Raft node quickly receives all Metastorage revisions
 # Metastorage recovery kicks in and asks the leader for the current revision
 # It sees that the current revision has already been reached, so it completes 
the Metastorage recovery future right away. This happens in a Raft-ReadOnly 
thread (not in the state machine thread) and causes the 
recoveryRevisionsListener to be nullified
 # At the same time, a new revision comes from the leader and gets applied in 
the state machine thread, which tries to invoke the recoveryRevisionsListener

As can be seen, there is a race between nullification of the listener and 
its invocation. Currently, the null check and the invocation each read the 
volatile listener field separately, so the command execution can see a non-null 
listener on the first read and then dereference the null written in step 5 on 
the second read, getting a NullPointerException.

This can be easily fixed by reading the field once into a local variable and 
then doing both the null check and the invocation on that variable.
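
The difference between the racy and the fixed access pattern can be sketched 
as follows. This is only an illustration under assumptions: the class name, the 
LongConsumer listener type, the method names and the target-revision logic are 
hypothetical and are not taken from the Ignite code base; only the field name 
recoveryRevisionsListener comes from the description above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.LongConsumer;

/** Hypothetical stand-in for the component that owns the listener. */
final class RecoveryListenerSketch {
    /** Completed once the target revision is reached (assumed semantics). */
    final CompletableFuture<Void> recoveryFuture = new CompletableFuture<>();

    /** Written by the recovery thread, read by the state machine thread. */
    private volatile LongConsumer recoveryRevisionsListener;

    RecoveryListenerSketch(long targetRevision) {
        recoveryRevisionsListener = revision -> {
            if (revision >= targetRevision) {
                // Does nothing if the future is already completed.
                recoveryFuture.complete(null);
            }
        };
    }

    /** Step 5: runs outside the state machine thread. */
    void finishRecovery() {
        recoveryFuture.complete(null);
        recoveryRevisionsListener = null;
    }

    /** Racy variant: the field is read twice. */
    void onRevisionAppliedRacy(long revision) {
        if (recoveryRevisionsListener != null) {
            // finishRecovery() may null the field between the two reads,
            // in which case the next line throws NullPointerException.
            recoveryRevisionsListener.accept(revision);
        }
    }

    /** Fixed variant: one volatile read, then only the local is used. */
    void onRevisionApplied(long revision) {
        LongConsumer listener = recoveryRevisionsListener;
        if (listener != null) {
            listener.accept(revision);
        }
    }
}
```

In the fixed variant the local variable cannot change after it has been read, 
so the null check and the call always see the same value.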

The race still remains, but it is benign: a redundant listener invocation has 
no effect (it just tries to complete the already completed future, which is a 
no-op).
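
Why the redundant completion is harmless can be shown with a small sketch, 
assuming the recovery future is a java.util.concurrent.CompletableFuture (the 
class and method names below are hypothetical). A second complete() call 
returns false and leaves the original result untouched.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical demo of completing an already completed future. */
final class RedundantCompletionSketch {
    /** Returns {first call succeeded, second call succeeded, value unchanged}. */
    static boolean[] completeTwice() {
        CompletableFuture<Long> future = new CompletableFuture<>();

        boolean first = future.complete(1L);  // transitions the future to completed
        boolean second = future.complete(2L); // ignored: already completed

        return new boolean[] {first, second, future.join() == 1L};
    }
}
```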

> Potential NullPointerException on metastorage recovery
> ------------------------------------------------------
>
>                 Key: IGNITE-26680
>                 URL: https://issues.apache.org/jira/browse/IGNITE-26680
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain, ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
