[
https://issues.apache.org/jira/browse/ZOOKEEPER-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fangmin Lv updated ZOOKEEPER-3618:
----------------------------------
Summary: Send batch quorum Ack and Commit packets to improve the efficiency
and throughput of ZK (was: Send batch quorum Ack and Commit packets to improve
the efficiency and throughput of Zeus)
> Send batch quorum Ack and Commit packets to improve the efficiency and
> throughput of ZK
> ---------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3618
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3618
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.6.0
> Reporter: Fangmin Lv
> Assignee: Fangmin Lv
> Priority: Major
> Fix For: 3.6.0
>
>
> ZK guarantees that txns are flushed to disk in order, and it already batches
> flushes to improve disk I/O efficiency and throughput. However, ACKs are
> still sent back one by one, which is inefficient. Instead, we can send only
> the ACK for the last flushed txn to the leader, in batch mode.
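
The batching idea above can be sketched roughly as follows. This is a minimal illustration, not ZooKeeper's actual code: the class BatchAckSketch, its methods, and the sentAcks list (which stands in for packets written to the leader) are all hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of ZooKeeper's API.
final class BatchAckSketch {
    // ZXIDs of txns appended to the log but not yet flushed, in append order.
    private final List<Long> pending = new ArrayList<>();
    // ACKs that would go to the leader; recorded in a list instead of a socket.
    final List<Long> sentAcks = new ArrayList<>();

    // Queue a txn for the next group flush; ZXIDs must be strictly increasing.
    void append(long zxid) {
        if (!pending.isEmpty() && zxid <= pending.get(pending.size() - 1)) {
            throw new IllegalArgumentException("txns must arrive in zxid order");
        }
        pending.add(zxid);
    }

    // Simulate one group flush: the whole batch becomes durable, and a single
    // ACK for the last flushed ZXID stands in for every txn in the batch.
    void flush() {
        if (pending.isEmpty()) {
            return; // nothing flushed, so nothing to acknowledge
        }
        long lastFlushed = pending.get(pending.size() - 1);
        pending.clear();
        sentAcks.add(lastFlushed);
    }
}
```

With three txns appended before a flush, this sketch records one ACK rather than three, which is where the reduction in quorum packets would come from.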
> On the leader, receiving the ACK for txn N implies, by the flush-ordering
> guarantee, that all txns before N have been flushed to disk as well, so
> they're all ACKed. The leader can then maintain a (SID -> last ACKed ZXID)
> map to calculate the latest COMMIT ZXID, and send that to all learners.
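
One way to derive the latest COMMIT ZXID from the (SID -> last ACKed ZXID) map is sketched below. It assumes a fixed voter set with simple-majority quorums, treats a voter that has not ACKed yet as having ACKed ZXID 0, and ignores dynamic reconfig entirely. QuorumAckTracker and onAck are hypothetical names, not the leader's real interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; assumes simple-majority quorums and no reconfig.
final class QuorumAckTracker {
    private final Set<Long> voters;
    private final Map<Long, Long> lastAcked = new HashMap<>(); // SID -> ZXID
    private long committed = 0L;

    QuorumAckTracker(Set<Long> voters) {
        this.voters = new HashSet<>(voters);
    }

    // Record a (possibly batched) ACK from a server. Returns the new COMMIT
    // ZXID if the commit point advanced, or -1 if it did not.
    long onAck(long sid, long zxid) {
        if (!voters.contains(sid)) {
            return -1L; // ACKs from non-voters do not count toward the quorum
        }
        lastAcked.merge(sid, zxid, Long::max); // never move an ACK backwards

        // Sort the voters' last ACKed ZXIDs from highest to lowest.
        List<Long> acked = new ArrayList<>();
        for (long voter : voters) {
            acked.add(lastAcked.getOrDefault(voter, 0L));
        }
        acked.sort(Collections.reverseOrder());

        // The majority-th highest value has been ACKed by a majority of voters.
        int majority = voters.size() / 2 + 1;
        long candidate = acked.get(majority - 1);
        if (candidate > committed) {
            committed = candidate;
            return candidate;
        }
        return -1L;
    }
}
```

For example, with voters 1, 2 and 3, ACKs of ZXID 5 from server 1 and ZXID 3 from server 2 put the commit point at 3, since that is the highest ZXID a majority has flushed.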
> Based on the ordering guarantee, when a learner receives the COMMIT for txn
> N, it means all the txns before N have been committed as well.
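
On the learner side, a batched COMMIT could be handled along these lines. This is a sketch under the assumption that proposals are queued in ZXID order; LearnerCommitSketch and its members are hypothetical names, and the applied list stands in for applying txns to the in-memory database.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of ZooKeeper's API.
final class LearnerCommitSketch {
    // Proposals received from the leader but not yet committed, in ZXID order.
    private final Deque<Long> pending = new ArrayDeque<>();
    // ZXIDs applied so far, in the order they were applied.
    final List<Long> applied = new ArrayList<>();

    void onProposal(long zxid) {
        pending.addLast(zxid);
    }

    // A COMMIT for ZXID n covers every pending proposal with ZXID <= n,
    // so drain the queue from the front up to and including n.
    void onCommit(long zxid) {
        while (!pending.isEmpty() && pending.peekFirst() <= zxid) {
            applied.add(pending.pollFirst());
        }
    }
}
```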
> The main benefit of this feature is reduced memory pressure, GC, and quorum
> communication effort on all servers, plus reduced lock contention on the
> leader when processing ACK, COMMIT, etc.
> Overall, this will improve the efficiency of ZK and is expected to support
> higher throughput for write traffic.
> The main challenge of this work is making sure it is backward compatible and
> safe for gradual rollout, while also making sure it won't affect the
> correctness/durability of txns during dynamic reconfig.
>
>
>