[ 
https://issues.apache.org/jira/browse/IGNITE-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-20678:
-------------------------------------
    Description: 
After discussion and code analysis, I concluded that this problem should be 
solved by calling 
*org.apache.ignite.internal.placementdriver.PlacementDriver#getPrimaryReplica* 
during recovery. 

h3. However, it currently has the following bug:
When the cluster is restarted (for simplicity, a cluster of one node), calling 
*PlacementDriver#getPrimaryReplica* during recovery can report that the local 
node is the primary replica and that its lease has not yet expired 
(now < *org.apache.ignite.internal.placementdriver.ReplicaMeta#getExpirationTime*). 
We then start building the index even though it has already been built; the 
replication log simply has not been applied yet.

h3. How to fix the bug:
Add 
*org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId*, 
which returns the node ID assigned when the node starts 
(*org.apache.ignite.network.ClusterNode#id*) and is needed to check 
whether the local replica is the primary one.
If, during recovery, *PlacementDriver#getPrimaryReplica* shows that we are not 
the primary replica, we will not build the index; otherwise, a genuine 
selection of the primary replica will occur through the replication log.
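
The check described above could look roughly like the sketch below. It is an 
illustration only, not the Ignite implementation: *ReplicaMetaSketch*, 
*IndexRecoverySketch* and *isLocalNodePrimary* are hypothetical names, and the 
expiration time is modelled as a plain long, which is an assumption made to 
keep the example self-contained.

```java
import java.util.Objects;

/** Hypothetical stand-in for ReplicaMeta; only the two accessors discussed in this ticket. */
final class ReplicaMetaSketch {
    private final String leaseholderId;
    private final long expirationTime;

    ReplicaMetaSketch(String leaseholderId, long expirationTime) {
        this.leaseholderId = leaseholderId;
        this.expirationTime = expirationTime;
    }

    /** ID the leaseholder node received when it started (changes on every restart). */
    String getLeaseholderId() {
        return leaseholderId;
    }

    /** Lease expiration time; a plain long here (assumption made for the sketch). */
    long getExpirationTime() {
        return expirationTime;
    }
}

/** Hypothetical helper showing the recovery-time check. */
final class IndexRecoverySketch {
    private IndexRecoverySketch() {
    }

    /**
     * Returns true only if the lease is still active AND it was granted to the
     * current incarnation of the local node (same node ID as after this start).
     */
    static boolean isLocalNodePrimary(ReplicaMetaSketch meta, String localNodeId, long now) {
        if (meta == null) {
            return false; // No primary replica is known yet.
        }

        boolean leaseActive = now < meta.getExpirationTime();
        boolean sameIncarnation = Objects.equals(localNodeId, meta.getLeaseholderId());

        return leaseActive && sameIncarnation;
    }
}
```

With such a check, a lease granted before the restart no longer makes the 
restarted node look like the primary replica, because its node ID differs from 
the stored leaseholder ID.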

h3. Summary:
Fixes related to index building recovery will be made in IGNITE-20544 and 
IGNITE-20637.
In this ticket, 
*org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId* will 
be added and made to work correctly. This includes, for example, its 
serialization/deserialization in 
*org.apache.ignite.internal.placementdriver.leases.Lease*, and ensuring that 
the lease is not prolonged if *ReplicaMeta#getLeaseholderId* 
changes.
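
The prolongation rule could be sketched as follows. This is a hypothetical 
model, not the actual *Lease* class: *LeaseSketch* and *tryProlong* are names 
invented for illustration, and the expiration time is again a plain long by 
assumption.

```java
import java.util.Optional;

/** Hypothetical, simplified model of a lease; not the real Lease class. */
final class LeaseSketch {
    private final String leaseholderId;
    private final long expirationTime;

    LeaseSketch(String leaseholderId, long expirationTime) {
        this.leaseholderId = leaseholderId;
        this.expirationTime = expirationTime;
    }

    String getLeaseholderId() {
        return leaseholderId;
    }

    long getExpirationTime() {
        return expirationTime;
    }

    /**
     * Prolongs the lease only if the candidate still has the same node ID as the
     * current leaseholder and the new expiration time is later than the old one.
     * Returns an empty Optional otherwise, so that a new lease has to be negotiated.
     */
    Optional<LeaseSketch> tryProlong(String candidateNodeId, long newExpirationTime) {
        if (!leaseholderId.equals(candidateNodeId) || newExpirationTime <= expirationTime) {
            return Optional.empty();
        }

        return Optional.of(new LeaseSketch(leaseholderId, newExpirationTime));
    }
}
```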



  was:
After discussion and code analysis, I concluded that this problem should be 
solved by calling 
*org.apache.ignite.internal.placementdriver.PlacementDriver#getPrimaryReplica* 
during recovery. 

h3. However, it currently has the following bug:
When the cluster is restarted (for simplicity, a cluster of one node), calling 
*PlacementDriver#getPrimaryReplica* during recovery can report that the local 
node is the primary replica and that its lease has not yet expired 
(now < *org.apache.ignite.internal.placementdriver.ReplicaMeta#getExpirationTime*). 
We then start building the index even though it has already been built; the 
replication log simply has not been applied yet.

h3. How to fix the bug:
Add 
*org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId*, 
which returns the node ID assigned when the node starts 
(org.apache.ignite.network.ClusterNode#id) and is needed to check 
whether the local replica is the primary one.
If, during recovery, *PlacementDriver#getPrimaryReplica* shows that we are not 
the primary replica, we will not build the index; otherwise, a genuine 
selection of the primary replica will occur through the replication log.




> Adding ReplicaMeta#getLeaseholderId to avoid errors during node recovery
> ------------------------------------------------------------------------
>
>                 Key: IGNITE-20678
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20678
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Kirill Tkalenko
>            Assignee: Kirill Tkalenko
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
