Alexander Lapin created IGNITE-17578:
----------------------------------------
Summary: Transactions: async cleanup processing on tx commit
Key: IGNITE-17578
URL: https://issues.apache.org/jira/browse/IGNITE-17578
Project: Ignite
Issue Type: Improvement
Reporter: Alexander Lapin
h3. Motivation
According to the tx commit process design, control must be returned to the
outer logic right after the COMMITED/ABORTED txn state has been replicated. The
follow-up cleanup process, which sends replica cleanup requests to all enlisted
replication groups, should be asynchronous.
Currently that is not the case:
{code:java}
/**
 * Process transaction finish request:
 * <ol>
 *     <li>Evaluate commit timestamp.</li>
 *     <li>Run specific raft {@code FinishTxCommand} command, that will apply txn state to corresponding txStateStorage.</li>
 *     <li>Send cleanup requests to all enlisted primary replicas.</li>
 * </ol>
 * This operation is NOT idempotent, because of commit timestamp evaluation.
 *
 * @param request Transaction finish request.
 * @return future result of the operation.
 */
private CompletableFuture<Object> processTxFinishAction(TxFinishRequest request) {
    HybridTimestamp commitTimestamp = hybridClock.now();
    List<String> aggregatedGroupIds = request.groups().values().stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());
    UUID txId = request.txId();
    boolean commit = request.commit();
    CompletableFuture<Object> chaneStateFuture = raftClient.run(
            new FinishTxCommand(
                    txId,
                    commit,
                    commitTimestamp,
                    aggregatedGroupIds
            )
    );
    // TODO: sanpwc create ticket for async cleanup.
    chaneStateFuture.thenRun(
            () -> request.groups().forEach(
                    (recipientNode, replicationGroupIds) -> txManager.cleanup(
                            recipientNode,
                            replicationGroupIds,
                            txId,
                            commit,
                            commitTimestamp
                    )
            )
    );
    return chaneStateFuture;
}
{code}
In addition, the cleanup process, which is guaranteed to be idempotent, is
expected to be retried until it succeeds.
h3. Implementation Criteria
* Cleanup requests should be sent asynchronously.
* Cleanup failures, including timeouts, should trigger another cleanup attempt,
repeated until success. There is currently no failure handler, so retrying is
the only option.
h3. Implementation Details
It seems that a cleanup executor, properly shared between replicas, will suit us.
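A minimal sketch of what such an executor-backed retry could look like, not the actual implementation. The class {{CleanupRetrySketch}}, its method {{cleanupUntilSuccess}} and the fixed retry delay are all hypothetical names and choices introduced for illustration. The sketch assumes that a cleanup attempt is exposed as a {{CompletableFuture}} and that a timeout surfaces as exceptional completion of that future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Ignite): repeats an idempotent async action until it succeeds. */
final class CleanupRetrySketch {
    /** Executor assumed to be shared between replicas on the node. */
    private final ScheduledExecutorService cleanupExecutor;

    /** Fixed delay between attempts; a real implementation might prefer backoff. */
    private final long retryDelayMillis;

    CleanupRetrySketch(ScheduledExecutorService cleanupExecutor, long retryDelayMillis) {
        this.cleanupExecutor = cleanupExecutor;
        this.retryDelayMillis = retryDelayMillis;
    }

    /**
     * Starts the cleanup on the shared executor and returns immediately.
     * The returned future completes only after an attempt has succeeded.
     */
    CompletableFuture<Void> cleanupUntilSuccess(Supplier<CompletableFuture<Void>> cleanupAction) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        cleanupExecutor.execute(() -> attempt(cleanupAction, result));
        return result;
    }

    private void attempt(Supplier<CompletableFuture<Void>> cleanupAction, CompletableFuture<Void> result) {
        CompletableFuture<Void> attemptFuture;
        try {
            attemptFuture = cleanupAction.get();
        } catch (RuntimeException e) {
            // Treat a synchronous failure the same way as an asynchronous one.
            attemptFuture = CompletableFuture.failedFuture(e);
        }

        attemptFuture.whenComplete((ignored, err) -> {
            if (err == null) {
                result.complete(null);
            } else {
                // Cleanup is idempotent, so repeating it after a failure or timeout is safe.
                cleanupExecutor.schedule(
                        () -> attempt(cleanupAction, result),
                        retryDelayMillis,
                        TimeUnit.MILLISECONDS
                );
            }
        });
    }
}
```

In {{processTxFinishAction}}, the {{thenRun}} callback could pass each per-node {{txManager.cleanup(...)}} call to {{cleanupUntilSuccess}}, provided that call returns a future (the snippet above does not show its return type). The sketch deliberately omits executor shutdown handling, attempt limits and backoff.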