[
https://issues.apache.org/jira/browse/IGNITE-17872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denis Chudov updated IGNITE-17872:
----------------------------------
Issue Type: Improvement (was: Bug)
> Fetch commit index on non-primary replicas instead of waiting for safe time
> in case of RO tx on idle cluster
> ------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-17872
> URL: https://issues.apache.org/jira/browse/IGNITE-17872
> Project: Ignite
> Issue Type: Improvement
> Reporter: Denis Chudov
> Priority: Major
> Labels: ignite-3, transaction3_ro
>
> Safe time for non-primary replicas (see IGNITE-17263) was conceived as an
> optimization to avoid unnecessary network hops. Safe time is propagated from
> the primary replica via Raft appendEntries messages. When RW transactions
> put constant load on the cluster, these messages refresh safe time on the
> replicas with decent frequency. On an idle cluster, or a cluster with
> read-only load, safe time is propagated only periodically, via heartbeats.
> This means that if an RO transaction with a read timestamp in the present or
> future tries to read a value from a non-primary replica, it first waits for
> safe time, which advances only as often as heartbeat messages arrive. Hence,
> the duration of the read operation may be close to the heartbeat period.
> This looks weird and will cause performance issues.
> Example:
> Heartbeat period is 500 ms.
> The current safe time on the replica is 1.
> We are processing a read-only request with timestamp=2.
> There have been no RW transactions for some time, and the next expected
> update of safe time, according to the heartbeat period, is at 1 + 500 = 501.
> This means that we have to wait about 501 - 2 = 499 ms (assuming clock skew
> and network latency in the cluster are 0) before processing the RO request.
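
The arithmetic in the example above can be reproduced with a small sketch. It is not Ignite code: the class and method names are hypothetical, and it assumes a simplified model in which each heartbeat sets safe time to its send time, with zero clock skew and zero network latency, as in the example.

```java
public final class SafeTimeWaitEstimate {
    /**
     * Estimates how long an RO read waits for safe time on an idle cluster.
     * Simplified model (assumption): heartbeats are sent every
     * heartbeatPeriodMs, starting from the moment of the last safe time
     * update, and each one advances safe time to its send time.
     */
    public static long expectedWaitMs(long safeTime, long heartbeatPeriodMs, long readTimestamp, long now) {
        if (readTimestamp <= safeTime) {
            return 0; // Safe time already covers the read timestamp.
        }

        // Number of heartbeats needed until safe time reaches the read timestamp (ceiling division).
        long heartbeats = (readTimestamp - safeTime + heartbeatPeriodMs - 1) / heartbeatPeriodMs;

        // Moment at which that heartbeat arrives.
        long safeTimeUpdate = safeTime + heartbeats * heartbeatPeriodMs;

        return Math.max(0, safeTimeUpdate - now);
    }

    public static void main(String[] args) {
        // Values from the example: safe time 1, heartbeat period 500 ms,
        // read timestamp 2, request arriving at time 2.
        System.out.println(expectedWaitMs(1, 500, 2, 2));
    }
}
```

With the values from the example, the estimate is 499 ms, which is almost the whole heartbeat period.
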
> So, even though safe time is an optimization, we shouldn't use it when
> there are no RW transactions affecting the given replica and the timestamp
> of the current RO transaction is greater than safe time. Instead of waiting
> for the safe time update, we should fall back to reading the commit index
> from the leader to minimize the processing time of the current RO request.
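
The proposed decision can be sketched roughly as follows. This is only an illustration of the rule stated in the description, not the actual Ignite implementation: the class, enum, and parameter names are hypothetical, and how a replica detects that no RW transactions affect it is left open by this issue, so it is passed in as a plain flag.

```java
public final class ReadPathChoice {
    /** Hypothetical set of ways a non-primary replica can serve an RO read. */
    public enum Strategy {
        READ_IMMEDIATELY,
        WAIT_FOR_SAFE_TIME,
        FETCH_COMMIT_INDEX_FROM_LEADER
    }

    /**
     * Chooses how to serve an RO read on a non-primary replica.
     *
     * @param readTimestamp Read timestamp of the RO transaction.
     * @param safeTime Current safe time on this replica.
     * @param rwLoadPresent Whether RW transactions currently affect this replica
     *     (assumption: some external mechanism supplies this flag).
     */
    public static Strategy choose(long readTimestamp, long safeTime, boolean rwLoadPresent) {
        if (readTimestamp <= safeTime) {
            // Local data is already known to be complete up to the read timestamp.
            return Strategy.READ_IMMEDIATELY;
        }

        if (rwLoadPresent) {
            // appendEntries messages refresh safe time frequently, so the wait is short.
            return Strategy.WAIT_FOR_SAFE_TIME;
        }

        // Idle or read-only cluster: waiting would take up to a heartbeat period,
        // so ask the leader for its commit index instead.
        return Strategy.FETCH_COMMIT_INDEX_FROM_LEADER;
    }

    public static void main(String[] args) {
        // The example from the description: safe time 1, read timestamp 2, no RW load.
        System.out.println(choose(2, 1, false));
    }
}
```
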
--
This message was sent by Atlassian Jira
(v8.20.10#820010)