[
https://issues.apache.org/jira/browse/IGNITE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-20052:
-------------------------------------
Description:
h3. Motivation
Keeping locks on an expired primary is not only useless but also harmful,
because the corresponding commitTimestamps have either already been calculated
or the transaction will be aborted.
h3. Definition of Done
* All local partition-specific locks are released when the node's own primary
replica expires.
h3. Implementation Notes
* A local onPrimaryExpired callback must be introduced.
* An open question here is how to detect whether a given primary hosted any
locks.
* We've discussed and agreed that the following test should be written. It
might not be the only one needed, but it is definitely useful.
** Start two nodes A and B with partition P1 on node A and partition P2 on
node B.
** Begin transaction Tx1 on node B.
** Touch P2 on B
** Touch P1 on A
** Kill node B, which takes down the Tx coordinator, the commit partition and P2.
** Discard P1 lease prolongation.
** Await P1 lease expiration and check that locks were released.
* In order to discard lease prolongation, we may add special placement driver
methods that make it possible to discard or transfer a lease. At the very
least, they will be useful for testing.
* We've agreed that we may duplicate
org.apache.ignite.internal.table.distributed.raft.PartitionListener#txsPendingRowIds
so that it is also available in PartitionReplicaListener. That will allow us to
handle primaryReplica.onExpired() in the following way (pseudocode):
{code:java}
// Pseudocode: release the locks of every transaction that has pending rows.
txsPendingRowIds.keySet().forEach(txId -> lockManager.unlock(txId));
{code}
Besides that, this map is required in order to clean up writeIntents on the
primary if the primary isn't part of the replication group.
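
The pseudocode above can be expanded into a minimal, self-contained sketch. Everything in it is an assumption made for illustration: SketchLockManager, InMemoryLockManager and ReplicaListenerSketch are hypothetical stand-ins rather than Ignite classes, row IDs are modelled as plain strings, and unlock(txId) simply mirrors the call used in the pseudocode.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a lock manager; exposes only the call used in the pseudocode. */
interface SketchLockManager {
    void unlock(UUID txId);
}

/** In-memory lock manager that exists only to make the sketch executable. */
final class InMemoryLockManager implements SketchLockManager {
    // txId -> keys locked by that transaction on this node
    private final Map<UUID, Set<String>> locksByTx = new ConcurrentHashMap<>();

    void lock(UUID txId, String key) {
        locksByTx.computeIfAbsent(txId, id -> ConcurrentHashMap.newKeySet()).add(key);
    }

    @Override
    public void unlock(UUID txId) {
        // Drops every lock held by the transaction on this node.
        locksByTx.remove(txId);
    }

    boolean holdsLocks(UUID txId) {
        return locksByTx.containsKey(txId);
    }
}

/** Hypothetical replica-side listener that keeps its own copy of txsPendingRowIds. */
final class ReplicaListenerSketch {
    // txId -> row IDs with pending (uncommitted) writes; strings stand in for real row IDs
    private final Map<UUID, Set<String>> txsPendingRowIds = new ConcurrentHashMap<>();
    private final SketchLockManager lockManager;

    ReplicaListenerSketch(SketchLockManager lockManager) {
        this.lockManager = lockManager;
    }

    /** Records a pending row for the transaction. */
    void onWrite(UUID txId, String rowId) {
        txsPendingRowIds.computeIfAbsent(txId, id -> ConcurrentHashMap.newKeySet()).add(rowId);
    }

    /** Local callback, assumed to be invoked when this node's primary lease expires. */
    void onPrimaryExpired() {
        // The map is deliberately left intact: it is still needed for write-intent cleanup.
        txsPendingRowIds.keySet().forEach(lockManager::unlock);
    }

    boolean hasPendingRows(UUID txId) {
        return txsPendingRowIds.containsKey(txId);
    }
}
```

In this sketch, a transaction that took a lock without registering a pending row would not be released by onPrimaryExpired(), which is one way to read the open question about detecting whether a primary hosted any locks.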
was:
h3. Motivation
Keeping locks on an expired primary is not only useless but also harmful,
because the corresponding commitTimestamps have either already been calculated
or the transaction will be aborted.
h3. Definition of Done
* All local partition-specific locks are released when the node's own primary
replica expires.
h3. Implementation Notes
* A local onPrimaryExpired callback must be introduced.
* An open question here is how to detect whether a given primary hosted any
locks.
> Release all locks locally on self primary replica expiration
> ------------------------------------------------------------
>
> Key: IGNITE-20052
> URL: https://issues.apache.org/jira/browse/IGNITE-20052
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Priority: Major
> Labels: ignite-3, transaction3_recovery, transactions