[
https://issues.apache.org/jira/browse/ZOOKEEPER-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982441#comment-16982441
]
Jordan Zimmerman commented on ZOOKEEPER-3619:
---------------------------------------------
[~lvfangmin] thanks for the info - let me know if I can help
> Implement server side semaphore API to improve the efficiency and throughput
> of coordination
> ---------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3619
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3619
> Project: ZooKeeper
> Issue Type: New Feature
> Components: server
> Affects Versions: 3.6.0
> Reporter: Fangmin Lv
> Assignee: Fangmin Lv
> Priority: Major
> Fix For: 3.6.0
>
>
> The ZK API is designed to be simple, flexible, and general, so it can serve
> different scenarios such as coordination, member health tracking, and
> metadata storage. This generality has a cost, though: it leads to heavy and
> inefficient client code for recipes such as distributed locks and semaphores.
> Currently, the general client-side semaphore implementation without waiting
> time works as follows:
> # client A creates a sequential and ephemeral node N-1
> # client B creates a sequential and ephemeral node N-2
> # clients A and B each query all children and check whether they hold the
> lock node with the smallest sequential id
> # since client A has the smaller sequential id, it is the semaphore owner
> (assuming the semaphore value is 1)
> # client B deletes its node, closes the session, and probably tries again
> later from step 2
> Every contender issues 4 writes (create session, create lock, delete lock,
> close session) and 1 read (get children), which is pretty heavy and does
> not scale well.
> We actually hit this issue internally with one heavy semaphore use case, and
> we had to create dozens of ensembles to support its traffic.
> To make the semaphore recipe more efficient, we can move the semaphore
> implementation to the server side, where the leader has all the context about
> who will win the semaphore/lock at txn preparation time. The leader can then
> short-circuit and fail the contender directly, without proposing and
> committing those create/delete lock transactions.
> To implement this, we need to add a new semaphore API, which is supposed to
> replace the client-side lock, leader election (semaphore value 1), and
> general semaphore use cases.
> We started to design and implement it recently. It will be based on another
> big improvement that we have almost finished and will soon upstream in
> ZOOKEEPER-3594, which skips proposing requests with error transactions.
> Meanwhile, we'd like to hear some early feedback from the community about
> this feature.
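
The write and read counts in the quoted description can be checked with a small in-memory model. This is only a sketch, not a ZooKeeper client and not the API proposed in ZOOKEEPER-3619: ToyEnsemble, client_side_try_acquire, server_side_try_acquire and run are hypothetical names. It also assumes that session create/close remain ordinary writes in the server-side variant, and that a winning server-side acquire costs exactly one transaction.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnsemble:
    """Toy in-memory model of one semaphore; not a real ZooKeeper client."""
    permits: int = 1
    nodes: dict = field(default_factory=dict)  # sequence id -> client name
    next_seq: int = 0
    writes: int = 0  # transactions that would be proposed and committed
    reads: int = 0   # "get children" style reads

    def client_side_try_acquire(self, client: str) -> bool:
        """Client-side recipe without waiting, following the quoted steps."""
        self.writes += 1                 # create session
        seq = self.next_seq              # create sequential ephemeral node
        self.next_seq += 1
        self.nodes[seq] = client
        self.writes += 1
        self.reads += 1                  # get children
        rank = sorted(self.nodes).index(seq)
        if rank < self.permits:
            return True                  # among the `permits` smallest ids
        del self.nodes[seq]              # loser: delete lock node
        self.writes += 1
        self.writes += 1                 # loser: close session
        return False

    def server_side_try_acquire(self, client: str) -> bool:
        """Hypothetical server-side check done before proposing anything."""
        self.writes += 1                 # create session (assumed unchanged)
        if len(self.nodes) >= self.permits:
            self.writes += 1             # close session (assumed unchanged)
            return False                 # short circuit: no lock txn proposed
        self.nodes[self.next_seq] = client
        self.next_seq += 1
        self.writes += 1                 # the single acquire transaction
        return True


def run(method_name: str, contenders: int = 5) -> tuple:
    """Return (winners, writes, reads) for `contenders` clients, 1 permit."""
    ensemble = ToyEnsemble(permits=1)
    acquire = getattr(ensemble, method_name)
    winners = [c for c in range(contenders) if acquire(f"client-{c}")]
    return len(winners), ensemble.writes, ensemble.reads


if __name__ == "__main__":
    print("client side:", run("client_side_try_acquire"))
    print("server side:", run("server_side_try_acquire"))
```

With five contenders and a semaphore value of 1, the client-side path in this model costs 18 writes and 5 reads (4 writes and 1 read per losing contender), while the short-circuit path costs 10 writes and no reads. Release of the permit by the winner is not modelled.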