Denis Chudov created IGNITE-17872:
-------------------------------------
Summary: Fetch commit index on non-primary replicas instead of
waiting for safe time in case of RO tx on idle cluster
Key: IGNITE-17872
URL: https://issues.apache.org/jira/browse/IGNITE-17872
Project: Ignite
Issue Type: Bug
Environment: Safe time for non-primary replicas (see IGNITE-17263)
was conceived as an optimization to avoid unnecessary network hops. Safe time is
propagated from the primary replica via raft appendEntries messages. When the
cluster is under constant load from RW transactions, these messages refresh
safe time on replicas frequently enough. On an idle cluster, or a cluster with
read-only load only, safe time is propagated just periodically, via heartbeats.
This means that if an RO transaction with a read timestamp in the present or
future tries to read a value from a non-primary replica, it first waits for
safe time, whose updates are bound to the frequency of heartbeat messages.
Hence the duration of the read operation may be close to the heartbeat period.
This delay is not caused by any actual write activity and will cause
performance issues.
Example:
Heartbeat period is 500 ms.
Current safe time on the replica is 1.
We are processing a read-only request with timestamp=2.
The next expected update of safe time, according to the heartbeat period, is at
1 + 500 = 501.
This means that we have to wait about 499 ms (assuming the clock skew and
ping in the cluster are 0) before proceeding with RO request processing.
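
The arithmetic of this example can be sketched in Java. This is only an illustration under the same assumptions as above (zero clock skew, zero ping, safe time advanced only by heartbeats) plus one more: the request arrives at the moment equal to its read timestamp. The class and method names are hypothetical and are not part of the Ignite code base.

```java
public final class SafeTimeWaitExample {
    /**
     * Estimates how long (in ms) a read at readTimestamp has to wait on a
     * replica whose safe time is advanced only by periodic heartbeats.
     * Assumption: the request arrives at time == readTimestamp.
     */
    public static long expectedWaitMs(long currentSafeTime, long readTimestamp, long heartbeatPeriodMs) {
        if (readTimestamp <= currentSafeTime) {
            return 0; // Safe time already covers the read timestamp, no wait.
        }

        // First heartbeat-driven safe time value that is >= readTimestamp.
        long nextSafeTime = currentSafeTime + heartbeatPeriodMs;
        while (nextSafeTime < readTimestamp) {
            nextSafeTime += heartbeatPeriodMs;
        }

        return nextSafeTime - readTimestamp;
    }

    public static void main(String[] args) {
        // Values from the example: safe time 1, read timestamp 2, period 500 ms.
        System.out.println(expectedWaitMs(1, 2, 500)); // prints 499
    }
}
```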
So, even though safe time is an optimization, we shouldn't use it when there
are no RW transactions affecting the given replica and the timestamp of the
current RO transaction is greater than safe time. Instead of waiting for the
safe time update, we should fall back to reading the index from the leader to
minimize the processing time of the current RO request.
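
A minimal sketch of the proposed decision, in Java. It only restates the rule described above; the enum, the method, and the rwLoadPresent flag are hypothetical names introduced for illustration, and how a replica would actually detect the absence of RW load is not specified in this issue.

```java
public final class ReadOnlyReadPathSketch {
    /** Hypothetical ways a non-primary replica could serve an RO read. */
    public enum Strategy {
        READ_LOCALLY,            // safe time already covers the read timestamp
        WAIT_FOR_SAFE_TIME,      // RW load keeps safe time fresh, the wait is short
        FETCH_INDEX_FROM_LEADER  // idle or read-only cluster: ask the leader for its index
    }

    /**
     * Chooses a strategy for an RO read on a non-primary replica.
     * rwLoadPresent is an assumed input: true if RW transactions are
     * currently affecting this replica.
     */
    public static Strategy choose(long safeTime, long readTimestamp, boolean rwLoadPresent) {
        if (readTimestamp <= safeTime) {
            return Strategy.READ_LOCALLY;
        }

        // The read timestamp is ahead of safe time.
        return rwLoadPresent
            ? Strategy.WAIT_FOR_SAFE_TIME
            : Strategy.FETCH_INDEX_FROM_LEADER;
    }

    public static void main(String[] args) {
        // Example from this issue: safe time 1, read timestamp 2, idle cluster.
        System.out.println(choose(1, 2, false)); // prints FETCH_INDEX_FROM_LEADER
    }
}
```

The fallback costs one network hop to the leader, which is the hop safe time was meant to avoid, but on an idle cluster it is expected to be much shorter than waiting for the next heartbeat.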
Reporter: Denis Chudov