Fangmin Lv created ZOOKEEPER-3619:
-------------------------------------
Summary: Implement server side semaphore API to improve the
efficiency and throughput of coordination
Key: ZOOKEEPER-3619
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3619
Project: ZooKeeper
Issue Type: New Feature
Components: server
Affects Versions: 3.6.0
Reporter: Fangmin Lv
Assignee: Fangmin Lv
Fix For: 3.6.0
The ZK API is designed to be simple, flexible and general, so it can serve
different scenarios such as coordination, member health tracking, meta store, etc.
But this general design has a cost: it leads to heavy and inefficient client
code for recipes like distributed lock, semaphore, etc.
Currently, the typical client-side semaphore implementation without waiting
time works as follows:
# client A creates sequential and ephemeral node N-1
# client B creates sequential and ephemeral node N-2
# clients A and B each query all children and check whether they hold the lock
node with the smallest sequential id
# since client A has the smaller sequential id, it's the semaphore owner (assume
the semaphore value is 1)
# client B will delete its node, close the session, and probably try again
later from step 2
Every contender issues 4 writes (create session, create lock, delete
lock, close session) and 1 read (get children), which is pretty heavy and does
not scale well.
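
The cost of the steps above can be illustrated with a small in-memory
simulation. This is only a sketch: FakeEnsemble, try_acquire and the node
naming are hypothetical stand-ins that count operations, not the ZooKeeper
client API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEnsemble:
    """Hypothetical in-memory stand-in for an ensemble; it only counts ops."""
    next_seq: int = 0
    children: dict = field(default_factory=dict)  # node name -> session
    writes: int = 0
    reads: int = 0

    def create_session(self, client):
        self.writes += 1
        return client

    def create_sequential_ephemeral(self, session):
        self.writes += 1
        name = f"lock-{self.next_seq:010d}"
        self.next_seq += 1
        self.children[name] = session
        return name

    def get_children(self):
        self.reads += 1
        return sorted(self.children)

    def delete(self, name):
        self.writes += 1
        del self.children[name]

    def close_session(self, session):
        self.writes += 1


def try_acquire(zk, client, permits=1):
    """Client-side recipe without waiting: give up at once on failure."""
    session = zk.create_session(client)
    node = zk.create_sequential_ephemeral(session)
    # The 'permits' smallest sequence numbers own the semaphore.
    if node in zk.get_children()[:permits]:
        return True
    # Loser: clean up its own node and session.
    zk.delete(node)
    zk.close_session(session)
    return False


zk = FakeEnsemble()
won_a = try_acquire(zk, "A")
won_b = try_acquire(zk, "B")
print(won_a, won_b, zk.writes, zk.reads)
```

In this sketch the failed attempt by client B alone costs 4 writes and 1 read,
none of which changes who owns the semaphore.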
We actually hit this issue internally for one heavy semaphore use case, and we
had to create dozens of ensembles to support its traffic.
To make the semaphore recipe more efficient, we can move the semaphore
implementation to the server side, where the leader has all the context about
who will win the semaphore/lock at txn preparation time. The leader can then
short-circuit and fail the contender directly, without proposing and
committing those create/delete lock transactions.
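
A minimal sketch of the short-circuit idea, assuming the leader tracks the
current holders per semaphore. LeaderSemaphore and its methods are
hypothetical names for illustration only; they are not the proposed API or
existing ZooKeeper code.

```python
class LeaderSemaphore:
    """Hypothetical leader-side state for a single semaphore."""

    def __init__(self, permits):
        self.permits = permits
        self.holders = set()  # session ids holding a permit (assumed state)
        self.proposals = 0    # txns that would be proposed to the quorum

    def acquire(self, session_id):
        if session_id in self.holders:
            return True
        if len(self.holders) >= self.permits:
            # Short circuit: fail the contender locally, propose nothing.
            return False
        self.proposals += 1
        self.holders.add(session_id)
        return True

    def release(self, session_id):
        if session_id in self.holders:
            self.proposals += 1
            self.holders.remove(session_id)


sem = LeaderSemaphore(permits=1)
first = sem.acquire(session_id=1)
second = sem.acquire(session_id=2)
print(first, second, sem.proposals)
```

Here the losing contender produces no proposal at all, whereas in the
client-side recipe it produces several committed writes.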
To implement this, we need to add a new semaphore API, which is supposed to
replace client side lock, leader election (semaphore value 1), and general
semaphore use cases.
We started to design and implement it recently. It will be based on another big
improvement, which we've almost finished and will soon upstream in
ZOOKEEPER-3594, to skip proposing requests with error transactions.
Meanwhile, we'd like to hear some early feedback from the community about this
feature.