[
https://issues.apache.org/jira/browse/IGNITE-20002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-20002:
-------------------------------------
Description:
h3. Motivation
It's required to release all acquired locks on transaction finish in a durable
way. Such durability consists of two parts:
* Durable unlock within the same primary.
* Durable unlock on primary change.
This ticket is about the second part only. There's a counterpart ticket for the
first part: https://issues.apache.org/jira/browse/IGNITE-20004
h3. Definition of Done
* All unreleased locks for the transactions that were finished are released in
case of primary re-election, including old primary failure and cluster restart.
h3. Implementation Notes
* We may start by adding an onPrimaryElected callback.
* Within this callback, it's required to scan the local TxStateStorage
(`org.apache.ignite.internal.tx.storage.state.TxStateStorage#scan`) and call
`org.apache.ignite.internal.tx.TxManager#cleanup` for all transactions that
have false in TxMeta.locksReleased. TxManager#cleanup is an idempotent
operation, thus it's safe to run it multiple times, even from different nodes,
e.g. the old primary and the new primary.
* It's required to add a locksReleased field to TxMeta with default value false.
* It's required to set locksReleased to true when all cleanup requests
(txCleanupReplicaRequest) return successfully. That extra
"updateTxnState(locksReleased == true)" should be asynchronous.
* Tests will be non-trivial here, because it'll be required to kill the old
primary after txnStateChanged but before sending the cleanup request.
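A minimal sketch of the flow described in the implementation notes may help
here. It is not the actual Ignite code: the TxMeta, TxStateStorage and
TxManager types below are simplified, hypothetical stand-ins that only borrow
the names from the ticket, and the in-memory map stands in for the durable
storage. It assumes the storage holds metadata only for finished transactions
and that cleanup is idempotent, as stated above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: every type here is a hypothetical stand-in, not Ignite's real API. */
final class DurableUnlockSketch {
    /** Stand-in for TxMeta with the new flag; it defaults to false. */
    static final class TxMeta {
        volatile boolean locksReleased;
    }

    /** Stand-in for the local TxStateStorage (assumed to hold finished transactions only). */
    static final class TxStateStorage {
        private final Map<UUID, TxMeta> metas = new ConcurrentHashMap<>();

        void put(UUID txId, TxMeta meta) {
            metas.put(txId, meta);
        }

        TxMeta get(UUID txId) {
            return metas.get(txId);
        }

        Iterable<Map.Entry<UUID, TxMeta>> scan() {
            return metas.entrySet();
        }
    }

    /** Stand-in for TxManager#cleanup; assumed idempotent, so repeated calls are safe. */
    interface TxManager {
        CompletableFuture<Void> cleanup(UUID txId);
    }

    /** Hypothetical callback: re-runs cleanup for every transaction whose locks are not marked released. */
    static CompletableFuture<Void> onPrimaryElected(TxStateStorage storage, TxManager txManager) {
        List<CompletableFuture<Void>> pending = new ArrayList<>();

        for (Map.Entry<UUID, TxMeta> entry : storage.scan()) {
            TxMeta meta = entry.getValue();

            if (meta.locksReleased) {
                continue; // Locks were already released durably, nothing to do.
            }

            // Mark the flag only after cleanup succeeded; the caller is not blocked on it.
            pending.add(txManager.cleanup(entry.getKey()).thenRun(() -> meta.locksReleased = true));
        }

        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
    }
}
```

If the node dies between cleanup and the flag update, the flag stays false and
the next elected primary simply repeats the cleanup, which is why idempotency
matters in this design.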
was:
h3. Motivation
It's required to release all acquired locks on transaction finish in a durable
way. Such durability consists of two parts:
* Durable unlock within same primary.
* Durable unlock on primary change.
This ticket is about second part only.
h3. Definition of Done
* All unreleased locks for the transactions that were finished are released in
case of primary re-election, including old primary failure and cluster restart.
h3. Implementation Notes
* We may start with adding onPrimaryElected callback.
* Within this callback, it's required to scan
`org.apache.ignite.internal.tx.storage.state.TxStateStorage#scan` local
TxStateStorage and call `org.apache.ignite.internal.tx.TxManager#cleanup` for
all transactions that have false in TxMeta.locksReleased. TxManager#cleanup is
an idempotent operation, thus it's safe to run it multiple times, even from
different nodes, e.g. old primary and new primary.
* It's required to add locksReleased field to TxMeta with default value false.
* It's required to set locksReleased to true when all cleanup
txCleanupReplicaRequest returns successfully. That extra
"updateTxnState(locksReleased == true)" should be asynchronous.
* Tests will be non-trivial here, because it'll be required to kill old
primary after txnStateChanged but before sending cleanup request.
> Implement durable unlock on primary partition re-election
> ---------------------------------------------------------
>
> Key: IGNITE-20002
> URL: https://issues.apache.org/jira/browse/IGNITE-20002
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Priority: Major
> Labels: ignite-3, transaction, transaction3_recovery
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)