[jira] [Updated] (ZOOKEEPER-3657) Implementing snapshot schedule to avoid high latency issue due to disk contention

Fangmin Lv (Jira) Thu, 19 Dec 2019 17:03:14 -0800


     [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Fangmin Lv updated ZOOKEEPER-3657:
----------------------------------
    Description: 
If a ZK server runs on a machine with a single disk drive, the snapshot thread 
and the txn fsync thread contend for disk IO (even on SSD). When a majority of 
servers take a snapshot at the same time, txn fsync time increases, and with it 
the end-to-end update and read latency.

To provide a better SLA guarantee and improve write throughput with large 
snapshots (> 3GB), a snapshot scheduler is implemented internally to avoid a 
majority of servers taking a snapshot at the same time, which provides a better 
latency guarantee.

This feature introduces a new quorum packet type, SNAPPING. The leader sends 
this packet to the followers periodically, like PING but less frequently. 
Followers send back their current status, such as the maximum txns since the 
last snapshot and the fsync latency, and the leader decides who should take a 
snapshot.
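
The leader-side decision described above could be sketched roughly as follows. 
This is only an illustration under stated assumptions, not the code from the 
patch: the class SnapshotScheduleSketch, the FollowerStatus fields, the minTxns 
parameter and the ranking rule (most txns first, then lowest fsync latency) are 
all hypothetical, and the leader's own snapshots are ignored for simplicity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SnapshotScheduleSketch {

    /** Hypothetical status a follower reports in reply to SNAPPING. */
    public static final class FollowerStatus {
        final long sid;
        final long txnsSinceLastSnapshot;
        final long fsyncLatencyMs;
        final boolean snapshotInProgress;

        public FollowerStatus(long sid, long txnsSinceLastSnapshot,
                              long fsyncLatencyMs, boolean snapshotInProgress) {
            this.sid = sid;
            this.txnsSinceLastSnapshot = txnsSinceLastSnapshot;
            this.fsyncLatencyMs = fsyncLatencyMs;
            this.snapshotInProgress = snapshotInProgress;
        }
    }

    /**
     * Returns the sids allowed to start a snapshot now, so that the number of
     * servers snapshotting at once stays strictly below a majority.
     */
    public static List<Long> pickSnapshotters(List<FollowerStatus> statuses,
                                              int ensembleSize, long minTxns) {
        int majority = ensembleSize / 2 + 1;
        int maxConcurrent = majority - 1;

        // Snapshots already running consume part of the budget.
        int running = 0;
        for (FollowerStatus s : statuses) {
            if (s.snapshotInProgress) {
                running++;
            }
        }
        int budget = Math.max(0, maxConcurrent - running);

        // Only idle followers with enough accumulated txns are candidates.
        List<FollowerStatus> candidates = new ArrayList<>();
        for (FollowerStatus s : statuses) {
            if (!s.snapshotInProgress && s.txnsSinceLastSnapshot >= minTxns) {
                candidates.add(s);
            }
        }

        // Assumed ranking: most txns first, ties broken by lower fsync latency.
        candidates.sort(Comparator
                .comparingLong((FollowerStatus s) -> s.txnsSinceLastSnapshot)
                .reversed()
                .thenComparingLong(s -> s.fsyncLatencyMs));

        List<Long> picked = new ArrayList<>();
        for (FollowerStatus s : candidates) {
            if (picked.size() >= budget) {
                break;
            }
            picked.add(s.sid);
        }
        return picked;
    }
}
```

In a 5-server ensemble this sketch lets at most 2 servers snapshot 
concurrently, so a quorum of 3 can still fsync txns without competing with 
snapshot IO.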

A follower enables safe snapshot mode while the leader is sending SNAPPING. In 
this mode it takes a snapshot on its own only if the number of txns is much 
larger than the threshold defined for SyncRequestProcessor. This guards against 
cases where the follower has accumulated too many txns before it is scheduled 
to take a snapshot, or where 2 of 5 servers are down for a long time and the 
leader does not issue a snap for a long time.
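
The follower-side rule above might look something like the sketch below. It is 
a simplified illustration, not the actual SyncRequestProcessor logic: the class 
name SafeSnapshotModeSketch, the safeModeMultiplier used to model "much larger 
than the threshold", and the snappingTimeoutMs used to model "the leader is 
sending SNAPPING" are all assumptions made for the example.

```java
public class SafeSnapshotModeSketch {
    private final long snapCount;          // normal snapshot threshold, in txns
    private final long safeModeMultiplier; // assumed factor for "much larger"
    private final long snappingTimeoutMs;  // assumed freshness window for SNAPPING
    private long lastSnappingMs = -1;      // -1 means no SNAPPING seen yet

    public SafeSnapshotModeSketch(long snapCount, long safeModeMultiplier,
                                  long snappingTimeoutMs) {
        this.snapCount = snapCount;
        this.safeModeMultiplier = safeModeMultiplier;
        this.snappingTimeoutMs = snappingTimeoutMs;
    }

    /** Records that a SNAPPING packet arrived from the leader. */
    public void onSnapping(long nowMs) {
        lastSnappingMs = nowMs;
    }

    /** Safe mode is on while SNAPPING packets keep arriving. */
    public boolean inSafeMode(long nowMs) {
        return lastSnappingMs >= 0 && nowMs - lastSnappingMs <= snappingTimeoutMs;
    }

    /** Decides whether this follower should take a snapshot now. */
    public boolean shouldSnapshot(long txnsSinceLastSnapshot,
                                  boolean scheduledByLeader, long nowMs) {
        if (scheduledByLeader) {
            return true; // the leader picked this follower
        }
        // In safe mode, self-triggered snapshots need a much higher txn count.
        long threshold = inSafeMode(nowMs)
                ? snapCount * safeModeMultiplier
                : snapCount;
        return txnsSinceLastSnapshot >= threshold;
    }
}
```

With snapCount = 100000 and a multiplier of 3, a follower in safe mode would 
wait for the leader until it has accumulated 300000 txns, then snapshot on its 
own as a fallback.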



>  Implementing snapshot schedule to avoid high latency issue due to disk 
> contention
> ----------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3657
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3657
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: server
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>             Fix For: 3.6.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
