[
https://issues.apache.org/jira/browse/IGNITE-17578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denis Chudov updated IGNITE-17578:
----------------------------------
Description:
h3. Motivation
According to the tx commit process design, control must be returned to the
outer logic right after the COMMITED/ABORTED txn state has been replicated.
The follow-up cleanup process, which sends replica cleanup requests to all
enlisted replication groups, should be asynchronous.
Currently this is not the case:
{code:java}
/**
 * Process transaction finish request:
 * <ol>
 *     <li>Evaluate commit timestamp.</li>
 *     <li>Run specific raft {@code FinishTxCommand} command, that will apply
 *     txn state to corresponding txStateStorage.</li>
 *     <li>Send cleanup requests to all enlisted primary replicas.</li>
 * </ol>
 * This operation is NOT idempotent, because of commit timestamp evaluation.
 *
 * @param request Transaction finish request.
 * @return future result of the operation.
 */
private CompletableFuture<Object> processTxFinishAction(TxFinishRequest request) {
    HybridTimestamp commitTimestamp = hybridClock.now();

    // Flatten the per-node group lists into a single list of group ids.
    List<String> aggregatedGroupIds = request.groups().values().stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());

    UUID txId = request.txId();
    boolean commit = request.commit();

    CompletableFuture<Object> changeStateFuture = raftClient.run(
            new FinishTxCommand(
                    txId,
                    commit,
                    commitTimestamp,
                    aggregatedGroupIds
            )
    );

    // TODO: https://issues.apache.org/jira/browse/IGNITE-17578
    changeStateFuture.thenRun(
            () -> request.groups().forEach(
                    (recipientNode, replicationGroupIds) -> txManager.cleanup(
                            recipientNode,
                            replicationGroupIds,
                            txId,
                            commit,
                            commitTimestamp
                    )
            )
    );

    return changeStateFuture;
}
{code}
In addition, the cleanup process (which is guaranteed to be idempotent) is
expected to be retried until it succeeds.
h3. Definition of Done
* Cleanup requests are sent asynchronously.
* Cleanup failures, including timeouts, trigger another cleanup attempt,
repeated until success. There is currently no failure handler, so retrying is
the only option.
h3. Implementation Notes
It seems that a cleanup executor, properly shared between replicas, will suit
us. The executor is needed to schedule the next cleanup attempt after a
failure, so that the attempt is made not immediately after the failure but
after the replicas have been successfully rehashed, when their state allows
the cleanup to succeed with high probability.
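The retry behaviour described above could be sketched roughly as follows. This is only an illustration, not the Ignite implementation: the class {{CleanupRetrySketch}}, the method {{cleanupUntilSuccess}} and the fixed {{retryDelayMillis}} are hypothetical names, the {{Supplier}} stands in for a call such as {{txManager.cleanup(...)}}, and a fixed delay is assumed in place of waiting for the replicas to be rehashed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of "retry idempotent cleanup until success" on a shared executor. */
class CleanupRetrySketch {
    /**
     * Starts an idempotent cleanup attempt and reschedules it after a delay on every failure.
     * The returned future completes only when some attempt succeeds.
     */
    static CompletableFuture<Void> cleanupUntilSuccess(
            Supplier<CompletableFuture<Void>> attempt,
            ScheduledExecutorService cleanupExecutor,
            long retryDelayMillis
    ) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        runAttempt(attempt, cleanupExecutor, retryDelayMillis, result);
        return result;
    }

    private static void runAttempt(
            Supplier<CompletableFuture<Void>> attempt,
            ScheduledExecutorService cleanupExecutor,
            long retryDelayMillis,
            CompletableFuture<Void> result
    ) {
        CompletableFuture<Void> attemptFuture;
        try {
            attemptFuture = attempt.get();
        } catch (RuntimeException e) {
            // Treat a synchronous failure the same way as an asynchronous one.
            attemptFuture = CompletableFuture.failedFuture(e);
        }

        attemptFuture.whenComplete((ignored, err) -> {
            if (err == null) {
                result.complete(null);
            } else {
                // Assumption: a fixed delay stands in for "replicas are ready again".
                cleanupExecutor.schedule(
                        () -> runAttempt(attempt, cleanupExecutor, retryDelayMillis, result),
                        retryDelayMillis,
                        TimeUnit.MILLISECONDS
                );
            }
        });
    }

    public static void main(String[] args) {
        ScheduledExecutorService cleanupExecutor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger attempts = new AtomicInteger();

        // Stand-in for a cleanup request: fails twice, then succeeds.
        Supplier<CompletableFuture<Void>> attempt = () -> attempts.incrementAndGet() < 3
                ? CompletableFuture.<Void>failedFuture(new RuntimeException("replica not ready"))
                : CompletableFuture.<Void>completedFuture(null);

        CompletableFuture<Void> cleanup = cleanupUntilSuccess(attempt, cleanupExecutor, 50);

        // A real caller would return the state replication future without waiting here.
        cleanup.join();
        System.out.println("attempts: " + attempts.get());
        cleanupExecutor.shutdown();
    }
}
```

In this sketch the caller never blocks on cleanup: the finish handler can return the state replication future immediately and let the cleanup future complete on its own. Timeouts could be folded into the same failure path, for example by applying {{CompletableFuture.orTimeout}} to each attempt, so that a timed-out attempt is rescheduled like any other failure.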
was:
h3. Motivation
According to the tx commit process design, control must be returned to the
outer logic right after the COMMITED/ABORTED txn state has been replicated.
The follow-up cleanup process, which sends replica cleanup requests to all
enlisted replication groups, should be asynchronous.
Currently this is not the case:
{code:java}
/**
 * Process transaction finish request:
 * <ol>
 *     <li>Evaluate commit timestamp.</li>
 *     <li>Run specific raft {@code FinishTxCommand} command, that will apply
 *     txn state to corresponding txStateStorage.</li>
 *     <li>Send cleanup requests to all enlisted primary replicas.</li>
 * </ol>
 * This operation is NOT idempotent, because of commit timestamp evaluation.
 *
 * @param request Transaction finish request.
 * @return future result of the operation.
 */
private CompletableFuture<Object> processTxFinishAction(TxFinishRequest request) {
    HybridTimestamp commitTimestamp = hybridClock.now();

    // Flatten the per-node group lists into a single list of group ids.
    List<String> aggregatedGroupIds = request.groups().values().stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());

    UUID txId = request.txId();
    boolean commit = request.commit();

    CompletableFuture<Object> changeStateFuture = raftClient.run(
            new FinishTxCommand(
                    txId,
                    commit,
                    commitTimestamp,
                    aggregatedGroupIds
            )
    );

    // TODO: https://issues.apache.org/jira/browse/IGNITE-17578
    changeStateFuture.thenRun(
            () -> request.groups().forEach(
                    (recipientNode, replicationGroupIds) -> txManager.cleanup(
                            recipientNode,
                            replicationGroupIds,
                            txId,
                            commit,
                            commitTimestamp
                    )
            )
    );

    return changeStateFuture;
}
{code}
In addition, the cleanup process (which is guaranteed to be idempotent) is
expected to be retried until it succeeds.
h3. Definition of Done
* Cleanup requests are sent asynchronously.
* Cleanup failures, including timeouts, trigger another cleanup attempt,
repeated until success. There is currently no failure handler, so retrying is
the only option.
h3. Implementation Notes
It seems that a cleanup executor, properly shared between replicas, will suit
us.
> Transactions: async cleanup processing on tx commit
> ---------------------------------------------------
>
> Key: IGNITE-17578
> URL: https://issues.apache.org/jira/browse/IGNITE-17578
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Priority: Major
> Labels: ignite-3, transaction3_rw
>