anton-vinogradov commented on a change in pull request #8822:
URL: https://github.com/apache/ignite/pull/8822#discussion_r604733557
##########
File path:
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/transactions/IgniteTxManager.java
##########
@@ -939,11 +935,13 @@ public GridNearTxLocal newTx(
});
for (IgniteInternalTx tx : activeTransactions()) {
- if (tx.dht() && !tx.local() &&
tx.originatingNodeId().equals(node.id())) {
+ if (tx.state() == PREPARED /* recovery may be needed */
+ // (primary (not on originating) or backup) || (primary on
originating).
+ && (tx.dht() || (tx.near() && tx.local() &&
((GridNearTxLocal)tx).colocatedLocallyMapped()))
Review comment:
Unfortunately, we have to use this overcomplicated statement :(
1) We should wait for all tx's (having failed node as primary for one of
tx's keys) recovery finish before finishing the switch.
Explanation of the tx flags:
- regular dth txs (txs at primary or backup, started from other node or
from backup).
- near & local & colocatedLocallyMapped, txs started from some tx key's
primary.
We have optimization to avoid dht tx creation in that case.
So, the pattern is `tx.dht() || (tx.near() && tx.local() &&
((GridNearTxLocal)tx).colocatedLocallyMapped()` :(
Any hints are welcome!
2) Why not just wait for any type of tx?
We should avoid waiting from `tx.near() && tx.local()` but not
`((GridNearTxLocal)tx).colocatedLocallyMapped()` to allow alive cells finish
their switches asap without waiting for `local & near & !collocated` txs
related to the failed node.
We may use the following statement instead:
`!(tx.near() && tx.local() &&
!((GridNearTxLocal)tx).colocatedLocallyMapped()`
but this looks less readable to me, than the current statement.
Both cases, checked by tests included in this PR
- GridExchangeFreeCellularSwitchIsolationTest #
testMutliKeyTxRecoveryHappenBeforeTheSwitchOnCellularSwitch, checks we able to
read the values for the keys participating in tx only after the recovery happen.
- GridExchangeFreeCellularSwitchIsolationTest #
testOnlyAffectedNodesWaitForRecovery [Started from = ALIVE_CELL], checks alive
cell finished the switch asap even in case tx was started from alive cell.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]