[ https://issues.apache.org/jira/browse/IGNITE-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kirill Tkalenko updated IGNITE-20678:
-------------------------------------
Description:
After discussion and code analysis, I concluded that this problem should be
solved by calling
*org.apache.ignite.internal.placementdriver.PlacementDriver#getPrimaryReplica*
during recovery.
h3. However, this approach currently has the following bug:
When the cluster restarts (for simplicity, a single-node cluster), calling
*PlacementDriver#getPrimaryReplica* during recovery can report that the local
node is the primary replica and that its lease has not yet expired
(*org.apache.ignite.internal.placementdriver.ReplicaMeta#getExpirationTime* >
now). The node then starts building the index even though the index has already
been built; the replication log just has not been applied yet.
h3. How to fix the bug:
Add
*org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId*,
which returns the ID assigned to the node when it starts
(*org.apache.ignite.network.ClusterNode#id*). This ID is needed to check
whether the local replica is the primary one.
If *PlacementDriver#getPrimaryReplica* shows during recovery that we are not
the primary replica, we will not build the index; otherwise, a proper election
of the primary replica will take place through the replication log.
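The recovery check described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the nested class ReplicaMetaSketch, the method shouldBuildIndexOnRecovery, and the use of plain strings and millisecond timestamps are hypothetical and are not the real Ignite types. It also assumes one reading of the rule: the index is built on recovery only when the lease belongs to the current incarnation of the local node and has not expired.

```java
/** Illustrative sketch only; none of these types are the real Ignite API. */
final class IndexRecoverySketch {
    /** Hypothetical, simplified stand-in for ReplicaMeta. */
    static final class ReplicaMetaSketch {
        final String leaseholderId;        // node ID assigned when the node started
        final long expirationTimeMillis;   // lease expiration time

        ReplicaMetaSketch(String leaseholderId, long expirationTimeMillis) {
            this.leaseholderId = leaseholderId;
            this.expirationTimeMillis = expirationTimeMillis;
        }
    }

    /**
     * Returns true only if the lease was granted to the current incarnation of
     * the local node and is still valid. A lease left over from before the
     * restart carries the old node ID, so it does not match.
     */
    static boolean shouldBuildIndexOnRecovery(
            ReplicaMetaSketch primary, String localNodeId, long nowMillis) {
        if (primary == null) {
            return false; // no known primary replica yet
        }
        boolean sameIncarnation = primary.leaseholderId.equals(localNodeId);
        boolean notExpired = primary.expirationTimeMillis > nowMillis;
        return sameIncarnation && notExpired;
    }

    public static void main(String[] args) {
        long now = 1_000L;
        // Lease granted before the restart: same node, but a different node ID.
        ReplicaMetaSketch stale = new ReplicaMetaSketch("id-before-restart", 2_000L);
        // Lease granted to the current incarnation of the node.
        ReplicaMetaSketch fresh = new ReplicaMetaSketch("id-after-restart", 2_000L);

        System.out.println(shouldBuildIndexOnRecovery(stale, "id-after-restart", now));
        System.out.println(shouldBuildIndexOnRecovery(fresh, "id-after-restart", now));
    }
}
```

Comparing only the expiration time is what allows the stale lease through; adding the node ID comparison is the part this ticket enables.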
h3. Summary:
Fixes related to index building recovery will be made in IGNITE-20544 and
IGNITE-20637.
This ticket adds
*org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId* and
makes it work correctly: for example, it is serialized and deserialized in
*org.apache.ignite.internal.placementdriver.leases.Lease*, and a lease is not
prolonged if *ReplicaMeta#getLeaseholderId* changes.
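As a rough illustration of the two requirements above (the leaseholder ID surviving serialization, and prolongation being refused when the ID changes), here is a minimal sketch. LeaseSketch, its byte layout, and the prolong signature are assumptions made for this example; they do not reflect the actual *Lease* class or its wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical, simplified lease; not the real Ignite Lease class. */
final class LeaseSketch {
    final String leaseholderId;       // ID of the node incarnation holding the lease
    final long expirationTimeMillis;  // lease expiration time

    LeaseSketch(String leaseholderId, long expirationTimeMillis) {
        this.leaseholderId = leaseholderId;
        this.expirationTimeMillis = expirationTimeMillis;
    }

    /** Prolongs the lease only if the same node incarnation still holds it. */
    Optional<LeaseSketch> prolong(String currentNodeId, long newExpirationTimeMillis) {
        if (!leaseholderId.equals(currentNodeId)) {
            return Optional.empty(); // ID changed (e.g. node restarted): do not prolong
        }
        return Optional.of(new LeaseSketch(leaseholderId, newExpirationTimeMillis));
    }

    /** Assumed layout: [ID length: int][ID bytes: UTF-8][expiration: long]. */
    byte[] toBytes() {
        byte[] id = leaseholderId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + id.length + Long.BYTES)
                .putInt(id.length)
                .put(id)
                .putLong(expirationTimeMillis)
                .array();
    }

    /** Reads a lease back from the layout written by toBytes(). */
    static LeaseSketch fromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte[] id = new byte[buf.getInt()];
        buf.get(id);
        return new LeaseSketch(new String(id, StandardCharsets.UTF_8), buf.getLong());
    }

    public static void main(String[] args) {
        LeaseSketch lease = new LeaseSketch("node-id-1", 1_000L);
        LeaseSketch restored = LeaseSketch.fromBytes(lease.toBytes());

        System.out.println(restored.leaseholderId.equals(lease.leaseholderId));
        System.out.println(restored.prolong("node-id-2", 2_000L).isPresent());
    }
}
```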
was:
After discussion and code analysis, I concluded that this problem should be
solved by calling
*org.apache.ignite.internal.placementdriver.PlacementDriver#getPrimaryReplica*
during recovery.
h3. However, this approach currently has the following bug:
When the cluster restarts (for simplicity, a single-node cluster), calling
*PlacementDriver#getPrimaryReplica* during recovery can report that the local
node is the primary replica and that its lease has not yet expired
(*org.apache.ignite.internal.placementdriver.ReplicaMeta#getExpirationTime* >
now). The node then starts building the index even though the index has already
been built; the replication log just has not been applied yet.
h3. How to fix the bug:
Add
*org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId*,
which returns the ID assigned to the node when it starts
(org.apache.ignite.network.ClusterNode#id). This ID is needed to check
whether the local replica is the primary one.
If *PlacementDriver#getPrimaryReplica* shows during recovery that we are not
the primary replica, we will not build the index; otherwise, a proper election
of the primary replica will take place through the replication log.
> Adding ReplicaMeta#getLeaseholderId to avoid errors during node recovery
> ------------------------------------------------------------------------
>
> Key: IGNITE-20678
> URL: https://issues.apache.org/jira/browse/IGNITE-20678
> Project: Ignite
> Issue Type: Improvement
> Reporter: Kirill Tkalenko
> Assignee: Kirill Tkalenko
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> After discussion and code analysis, I concluded that this problem should be
> solved by calling
> *org.apache.ignite.internal.placementdriver.PlacementDriver#getPrimaryReplica*
> during recovery.
> h3. However, this approach currently has the following bug:
> When the cluster restarts (for simplicity, a single-node cluster), calling
> *PlacementDriver#getPrimaryReplica* during recovery can report that the local
> node is the primary replica and that its lease has not yet expired
> (*org.apache.ignite.internal.placementdriver.ReplicaMeta#getExpirationTime* >
> now). The node then starts building the index even though the index has
> already been built; the replication log just has not been applied yet.
> h3. How to fix the bug:
> Add
> *org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId*,
> which returns the ID assigned to the node when it starts
> (*org.apache.ignite.network.ClusterNode#id*). This ID is needed to check
> whether the local replica is the primary one.
> If *PlacementDriver#getPrimaryReplica* shows during recovery that we are not
> the primary replica, we will not build the index; otherwise, a proper
> election of the primary replica will take place through the replication log.
> h3. Summary:
> Fixes related to index building recovery will be made in IGNITE-20544 and
> IGNITE-20637.
> This ticket adds
> *org.apache.ignite.internal.placementdriver.ReplicaMeta#getLeaseholderId* and
> makes it work correctly: for example, it is serialized and deserialized in
> *org.apache.ignite.internal.placementdriver.leases.Lease*, and a lease is not
> prolonged if *ReplicaMeta#getLeaseholderId* changes.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)