[
https://issues.apache.org/jira/browse/IGNITE-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Evgeny Stanilovsky updated IGNITE-24530:
----------------------------------------
Description:
CPCC should not affect the execution of RW transactions. The proposed approach:
# Initiate a new affinity replicator (new zone with new affinity).
# Change the catalog version (time T1); it seems this will be done at the same
time it is locked [1].
# Wait for all txs with beginTs < T1 to finish; start from
IndexNodeFinishedRwTransactionsChecker or reuse it.
# Change the zone state (time T2).
# Transaction writes must be directed to stores according to their timestamps:
## txs with beginTs < T1 are directed into the 'old' store\partition
## txs with beginTs >= T1 are directed into both
# Let's call the task that copies already stored rows the store replicator. It
starts at T3 > T2. It needs to copy all rows that satisfy the predicate
commitTs < T2. Thus transactions with startTs < T1, and consequently with
commitTs < T2, are replicated (it seems rows with unresolved write intents need
to be filtered out too). Rows with T1 <= startTs < T2 will be copied twice:
through the store replicator and through the affinity replicator (some write
amplification here). Rows with startTs >= T2 will be replicated only through
the affinity replicator.
# If the tx coordinator fails, a tx can become in-flight, i.e. its commit may
already be enlisted into the execution queue on the primary replica of the tx
commit partition. Such a tx can be committed *after* T2, and it won't be copied
through either the affinity or the store replicator. We should not allow
committing an RW transaction that started before T1 but tries to commit after
T2. It seems the same logic, but for index purposes, is described in [2]; see
also [3].
[1] https://issues.apache.org/jira/browse/IGNITE-24442
[2] https://issues.apache.org/jira/browse/IGNITE-22990
[3] schemacompat.SchemaCompatibilityValidator
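The timestamp rules above can be sketched in Java as follows. This is only an
illustration under stated assumptions, not existing Ignite code: the class
CpccSwitchSketch, the Store enum and all method names are hypothetical,
timestamps are modelled as plain long values, and treating commitTs == T2 as
"after T2" is an assumption chosen to complement the store replicator
predicate commitTs < T2.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the routing, copy and commit rules; not Ignite API. */
class CpccSwitchSketch {
    /** Hypothetical names for the two stores involved in the switch. */
    enum Store { OLD, NEW }

    private final long t1; // time of the catalog version change
    private final long t2; // time of the zone state change

    CpccSwitchSketch(long t1, long t2) {
        // The steps change the catalog version first and the zone state later.
        if (t1 >= t2) {
            throw new IllegalArgumentException("T1 must precede T2");
        }
        this.t1 = t1;
        this.t2 = t2;
    }

    /** Writes of txs with beginTs < T1 go to the old store only, later ones to both. */
    Set<Store> writeTargets(long beginTs) {
        return beginTs < t1
                ? EnumSet.of(Store.OLD)
                : EnumSet.of(Store.OLD, Store.NEW);
    }

    /** Store replicator predicate: committed before T2 and not an unresolved intent. */
    boolean copiedByStoreReplicator(long commitTs, boolean intentResolved) {
        return intentResolved && commitTs < t2;
    }

    /**
     * Rejects a tx that started before T1 but tries to commit at or after T2:
     * its writes went to the old store only and the store replicator skips them.
     */
    boolean commitAllowed(long beginTs, long commitTs) {
        return !(beginTs < t1 && commitTs >= t2);
    }

    public static void main(String[] args) {
        CpccSwitchSketch sketch = new CpccSwitchSketch(100L, 200L);
        System.out.println(sketch.writeTargets(50L));
        System.out.println(sketch.writeTargets(150L));
        System.out.println(sketch.commitAllowed(50L, 250L));
    }
}
```

With T1 = 100 and T2 = 200, a tx with beginTs = 50 that attempts to commit at
250 is the case from the last list item: it was written only to the old store
and falls outside commitTs < T2, so commitAllowed rejects it.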
was:
CPCC should not affect the execution of RW transactions. The proposed approach:
# Initiate a new affinity replicator (new zone with new affinity).
# Change the catalog version (time T1); it seems this will be done at the same
time it is locked [1].
# Wait for all txs with beginTs < T1 to finish; start from
IndexNodeFinishedRwTransactionsChecker or reuse it.
# Change the zone state (time T2).
# Transaction writes must be directed to stores according to their timestamps:
## txs with beginTs < T1 are directed into the 'old' store\partition
## txs with beginTs >= T1 are directed into both
# Let's call the task that copies already stored rows the store replicator. It
starts at T3 > T2. It needs to copy all rows that satisfy the predicate
commitTs < T2. Thus transactions with startTs < T1, and consequently with
commitTs < T2, are replicated (it seems rows with unresolved write intents need
to be filtered out too). Rows with T1 <= startTs < T2 will be copied twice:
through the store replicator and through the affinity replicator (some write
amplification here). Rows with startTs >= T2 will be replicated only through
the affinity replicator.
[1] https://issues.apache.org/jira/browse/IGNITE-24442
> CPCC. Provide correct implementation for new affinity replication switch
> ------------------------------------------------------------------------
>
> Key: IGNITE-24530
> URL: https://issues.apache.org/jira/browse/IGNITE-24530
> Project: Ignite
> Issue Type: Task
> Components: sql
> Affects Versions: 3.0
> Reporter: Evgeny Stanilovsky
> Priority: Major
> Labels: ignite-3
>