[
https://issues.apache.org/jira/browse/IGNITE-20995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17796781#comment-17796781
]
Denis Chudov commented on IGNITE-20995:
---------------------------------------
>From my point of view, we should have following scenarios:
# Lock conflict between two RW transactions, coordinator is lost for lock
holder, recovery starts, lock holder is aborted, lock waiter is not affected.
# Lock conflict between two RW transactions, coordinator is alive, recovery is
not started, lock waiter is not affected.
# Resolution of write intent belonging to tx without coordinator, recovery
happens successfully, write intent is switched.
# Resolution of write intent belonging to abandoned tx, recovery happens
successfully, write intent is switched.
# Resolution of write intent belonging to abandoned tx, commit partition has
restarted and lost its local volatile tx state map, recovery happens
successfully, write intent is switched.
# Resolution of write intent belonging to pending transaction, coordinator is
alive, recovery is not started.
# RO transaction tx0 resolves write intent belonging to the transaction tx1
and marks it as abandoned and starts the recovery; after that RW transaction
tx2 meets the lock belonging to tx1, sees that it's abandoned recently and
doesn't start the recovery. Recovery is triggered just once.
# Coordinator is lost, but it has sent the commit message to a commit
partition, in the same time the recovery initiating request is received from
some data node. Commit is successful, tx recovery was not able to change
transaction state, there are no assertions or other errors, write intents on
data node are switched.
# Coordinator is lost, but it has sent the commit message to a commit
partition, in the same time the recovery initiating request is received from
some data node. Recovery successfully aborts the transaction, the is correct
exception on the coordinator, it was not able to change transaction state to
commit, there are no assertions or other errors, write intents on data node are
switched.
# Parallel tx recoveries happen on two replicas of commit partition, both
processes were started at a moment when the corresponding replica was the
primary one. This also shouldnt break anything.
> Add more integration tests for tx recovery on unstable topology
> ---------------------------------------------------------------
>
> Key: IGNITE-20995
> URL: https://issues.apache.org/jira/browse/IGNITE-20995
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Assignee: Kirill Sizov
> Priority: Major
> Labels: ignite-3
>
> h3. Motivation
> Surprisingly it might be useful to check tx recovery implementation with some
> tests.
> h3. Defintion of Done
> <Specific scenarios will be added a bit later>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)