[ 
https://issues.apache.org/jira/browse/RATIS-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duong updated RATIS-2129:
-------------------------
    Description: 
Today, the GrpcLogAppender thread makes many calls that need RaftLog's 
readLock. In an active environment, RaftLog is always busy appending 
transactions from clients, so the writeLock is frequently held. This slows 
replication performance. 

See the [^dn_echo_leader_profile.html], or the picture below, where the purple 
is the time taken to acquire the readLock from RaftLog.
 # !image-2024-07-22-15-25-46-155.png|width=854,height=425!

h2. A summary of lock contention in Ratis
!ratis_ratfLog_lock_contention.png|width=392,height=380!

Today, RaftLog consistency is protected by a global ReadWriteLock ("global" 
meaning each RaftLog instance, i.e. each RaftGroup, has a single ReadWriteLock 
that is acquired at the scope of the whole instance).

In a RaftGroup, the following actors contend for this global ReadWriteLock on 
the leader node:
 * The writer, which is the GRPC Client Service, accepts transaction 
submissions from Raft clients and appends them (as log entries) to RaftLog. 
Each append operation must acquire RaftLog's writeLock to put the transaction 
into RaftLog's in-memory queue. Although each append is quick, Ratis is 
designed to maximize transaction appends, so the writeLock is almost always 
busy.
 * StateMachineUpdater: once a transaction is acknowledged by enough 
followers, this single-threaded actor reads it from RaftLog and calls the 
StateMachine to apply it. It acquires RaftLog's readLock for each log entry 
read. 
 * GrpcLogAppender: for each follower, a GrpcLogAppender thread constantly 
reads log entries from RaftLog and replicates them to that follower, acquiring 
RaftLog's readLock every time it reads a log entry.

The writer, StateMachineUpdater, and GrpcLogAppender are all designed to 
maximize their throughput. For instance, StateMachineUpdater invokes the 
StateMachine's applyTransaction as asynchronous calls, and GrpcLogAppender 
replicates log entries to its follower in the same way. 

The global ReadWriteLock creates heavy contention between the RaftLog writer 
and readers, and that slows Ratis throughput down.
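The contention pattern above can be reproduced with a minimal sketch (plain java.util.concurrent code, not Ratis classes; all names here are illustrative): a single ReentrantReadWriteLock shared by an appending writer and a reading appender means every read can stall behind a write.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch, not Ratis code: one global ReadWriteLock shared by the
// writer (client-service appends) and the readers (GrpcLogAppender,
// StateMachineUpdater). Every read must wait whenever the writer holds the
// writeLock, which is the contention described above.
public class GlobalLockContentionSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long lastIndex = -1; // guarded by lock; stands in for RaftLog state

  void append() {              // writer path: needs the writeLock per entry
    lock.writeLock().lock();
    try {
      lastIndex++;             // enqueue one log entry
    } finally {
      lock.writeLock().unlock();
    }
  }

  long readLastIndex() {       // reader path: needs the readLock per entry
    lock.readLock().lock();    // blocks while any append is in flight
    try {
      return lastIndex;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    GlobalLockContentionSketch log = new GlobalLockContentionSketch();
    Thread writer = new Thread(() -> { for (int i = 0; i < 100_000; i++) log.append(); });
    Thread reader = new Thread(() -> { for (int i = 0; i < 100_000; i++) log.readLastIndex(); });
    writer.start(); reader.start();
    writer.join(); reader.join();
    System.out.println(log.readLastIndex()); // 99999: all appends completed
  }
}
```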

 

  was:
Today, the GrpcLogAppender thread makes many calls that need RaftLog's 
readLock. In an active environment, RaftLog is always busy appending 
transactions from clients, so the writeLock is frequently held. This slows 
replication performance. 

See the [^dn_echo_leader_profile.html], or the picture below, where the purple 
is the time taken to acquire the readLock from RaftLog.
 # !image-2024-07-22-15-25-46-155.png|width=854,height=425!

So far, I'm not sure whether this is a regression from a recent change in 
3.1.0/3.0.0 or whether it has always been the case. 

A few early considerations:
 # The rate of calls to RaftLog per GrpcLogAppender seems too high. Instead of 
calling RaftLog multiple times, maybe the log appender could make a single 
call to obtain all the required information?
 # Can RaftLog expose that data without requiring a read lock? 
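On point 2, one hedged direction (an assumption for illustration, not an existing Ratis API): the indices that appender threads poll most often could be published through an atomic field, so reads bypass the ReadWriteLock entirely while only the writer, under its existing writeLock, updates them.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch (class and method names are illustrative, not the
// Ratis API): the last appended index is published via an AtomicLong, so
// GrpcLogAppender can read it without acquiring RaftLog's readLock.
public class LockFreeIndexSketch {
  private final AtomicLong lastEntryIndex = new AtomicLong(-1);

  void onAppend(long newIndex) {   // called by the writer after each append,
    lastEntryIndex.set(newIndex);  // a single volatile write; readers never block
  }

  long getLastEntryIndex() {       // called by appender threads, lock-free
    return lastEntryIndex.get();
  }
}
```

The trade-off is that such reads are only safe for monotonically published values; reads that need a consistent multi-field snapshot of RaftLog would still need some form of synchronization.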


> Low replication performance because of lock contention on RaftLog
> -----------------------------------------------------------------
>
>                 Key: RATIS-2129
>                 URL: https://issues.apache.org/jira/browse/RATIS-2129
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.0
>            Reporter: Duong
>            Assignee: Tsz-wo Sze
>            Priority: Blocker
>              Labels: Performance, performance
>         Attachments: Screenshot 2024-07-22 at 4.40.07 PM-1.png, Screenshot 
> 2024-07-22 at 4.40.07 PM.png, dn_echo_leader_profile.html, 
> image-2024-07-22-15-25-46-155.png, ratis_ratfLog_lock_contention.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
