[
https://issues.apache.org/jira/browse/IGNITE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladislav Pyatkov updated IGNITE-20052:
---------------------------------------
Fix Version/s: 3.0.0-beta2
> Release all locks locally on self primary replica expiration
> ------------------------------------------------------------
>
> Key: IGNITE-20052
> URL: https://issues.apache.org/jira/browse/IGNITE-20052
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3, transaction3_recovery, transactions
> Fix For: 3.0.0-beta2
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> h3. Motivation
> It is not only useless but also harmful to keep locks on an expired primary,
> because the corresponding commitTimestamps have either already been
> calculated or the transaction will be aborted.
> h3. Definition of Done
> * All local partition-specific locks are released when the node's own
> primary replica expires.
> h3. Implementation Notes
> * A local onPrimaryExpired callback needs to be introduced.
> * An open question here is how to detect whether a given primary hosted any
> locks.
> * We've discussed and agreed that the following test should be written. It
> might not be the only one to write, however it's definitely useful.
> ** Start two nodes A and B with partition P1 on node A and partition P2 on
> node B.
> ** Begin transaction Tx1 on node B.
> ** Touch P2 on B.
> ** Touch P1 on A.
> ** Kill node B, which kills the Tx coordinator, the commit partition and P2.
> ** Discard P1 lease prolongation.
> ** Await P1 lease expiration and check that locks were released.
> * In order to discard lease prolongation, we may add special placement
> driver methods that make it possible to discard or transfer a lease. At
> the very least, they'll be useful for testing.
> * We've agreed that we may duplicate
> org.apache.ignite.internal.table.distributed.raft.PartitionListener#txsPendingRowIds
> in order to have it in PartitionReplicaListener. That will allow us to
> handle primaryReplica.onExpired() in the following way (pseudocode):
> {code:java}
> txsPendingRowIds.keySet().forEach(txId -> lockManager.unlock(txId));
> {code}
> Besides that, this map is required in order to clean up writeIntents on the
> primary if the primary isn't part of the replication group.
> * It seems that we don't need the whole map, only the keySet, that is, the
> txIds. The corresponding Set<RowId> value is used to clean up writeIntents,
> and the only case in which we need rowIds on the primary explicitly is when
> the primary itself isn't part of the replication group. In that case the
> rebalance engine will drop the whole local partition of the old primary
> together with all corresponding write intents.
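
The keySet-only idea from the implementation notes can be sketched as follows. This is a minimal, self-contained illustration and not Ignite code: LockReleaser, InMemoryLocks and PrimaryReplicaLockTracker are hypothetical names introduced here, and releaseAll merely stands in for the lockManager.unlock(txId) call from the pseudocode above. The tracker keeps only the txIds that touched the partition and releases their locks from the expiration callback.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the lock manager; not the actual Ignite interface. */
interface LockReleaser {
    void releaseAll(UUID txId);
}

/** Toy lock table (txId -> locked keys), used only to make the sketch runnable. */
class InMemoryLocks implements LockReleaser {
    private final Map<UUID, Set<String>> locksByTx = new ConcurrentHashMap<>();

    void lock(UUID txId, String key) {
        locksByTx.computeIfAbsent(txId, id -> ConcurrentHashMap.newKeySet()).add(key);
    }

    @Override
    public void releaseAll(UUID txId) {
        locksByTx.remove(txId);
    }

    int heldLockCount() {
        return locksByTx.values().stream().mapToInt(Set::size).sum();
    }
}

/** Hypothetical replica-side tracker that stores txIds only, without rowIds. */
class PrimaryReplicaLockTracker {
    private final Set<UUID> pendingTxIds = ConcurrentHashMap.newKeySet();
    private final LockReleaser locks;

    PrimaryReplicaLockTracker(LockReleaser locks) {
        this.locks = locks;
    }

    /** Remember that the transaction took locks on this partition. */
    void onWrite(UUID txId) {
        pendingTxIds.add(txId);
    }

    /** Forget a transaction that committed or aborted normally. */
    void onTxFinished(UUID txId) {
        pendingTxIds.remove(txId);
    }

    /** Local callback: release the locks of every tracked transaction. */
    void onPrimaryExpired() {
        // The concurrent key set tolerates removal during iteration.
        for (UUID txId : pendingTxIds) {
            locks.releaseAll(txId);
            pendingTxIds.remove(txId);
        }
    }
}
```

Calling onWrite for a transaction and then onPrimaryExpired leaves the toy lock table empty for that transaction. How the real PartitionReplicaListener would populate and prune such a set is left open by the ticket.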