[
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536939#comment-13536939
]
Jacky007 commented on ZOOKEEPER-1147:
-------------------------------------
[quote]
1. zoo_init will take a flag indicating delayed persistent session creation.
2. Server will look at this flag and create a session that is local to the
server and not send a request to the leader.
3. Server will expose a new operation - upgradeToPersistent - that will upgrade
a local session to a persistent session. This is the first time that the leader
will become aware of this session (assuming the client is connected to a
follower)
4. If there is a zoo_create with an ephemeral node, the client will send an
upgradeToPersistent request to the server before sending the create ephemeral
node request. This request would be async, so I don't expect it to delay the
creation of the ephemeral node much.
[quote]
There is a problem I can see.
Suppose A is a local session and now wants to create an ephemeral node. A sends
an async upgradeToPersistent followed by create /eph_1 to server 1, then A is
disconnected from server 1 and tries to renew its session on server 2. If the
leader receives the messages in this order: renew, upgradeToPersistent,
create /eph_1, then A will first get a session timeout, and afterwards /eph_1
is still created, which is unexpected.
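To make the ordering concrete, here is a rough sketch of what I think the leader sees. It is only a toy model: the Leader class, the operation strings and replay() are made up for illustration and are not ZooKeeper APIs, and I am assuming the leader rejects a renew for any session it has never heard of.

```python
# Toy model of the leader's view. All names here are hypothetical,
# not part of ZooKeeper.
class Leader:
    def __init__(self):
        self.global_sessions = set()   # sessions the leader knows about
        self.ephemerals = {}           # path -> owning session id
        self.log = []                  # (operation, result) in arrival order

    def handle(self, op, session_id, path=None):
        if op == "renew":
            # Assumption: a session unknown to the leader cannot be renewed.
            known = session_id in self.global_sessions
            result = "renewed" if known else "session expired"
        elif op == "upgradeToPersistent":
            self.global_sessions.add(session_id)
            result = "upgraded"
        elif op == "create_ephemeral":
            if session_id in self.global_sessions:
                self.ephemerals[path] = session_id
                result = "created"
            else:
                result = "rejected"
        else:
            raise ValueError(f"unknown operation: {op}")
        self.log.append((op, result))
        return result


def replay(messages):
    """Feed (op, session_id, path) tuples to a fresh leader, in order."""
    leader = Leader()
    for op, session_id, path in messages:
        leader.handle(op, session_id, path)
    return leader


# The renew sent via server 2 overtakes the two requests sent via server 1.
reordered = replay([
    ("renew", "A", None),
    ("upgradeToPersistent", "A", None),
    ("create_ephemeral", "A", "/eph_1"),
])

for op, result in reordered.log:
    print(op, "->", result)
print("ephemerals:", reordered.ephemerals)
```

In this model the client is told its session expired, yet /eph_1 ends up owned by session A on the leader. With the intended order (upgradeToPersistent, create, renew) all three steps succeed.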
> Add support for local sessions
> ------------------------------
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.3.3
> Reporter: Vishal Kathuria
> Assignee: Thawan Kooburat
> Labels: api-change, scaling
> Fix For: 3.5.0
>
> Original Estimate: 840h
> Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale.
> We are planning on having about 1 million clients connect to a ZooKeeper
> ensemble through a set of 50-100 observers. The majority of these clients are
> read-only, i.e. they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is
> handled like any other update. In the above use case, the session create/drop
> workload can easily overwhelm an ensemble. The following is a proposal for a
> "local session", to support a larger number of connections.
> 1. The idea is to introduce a new type of session - a "local" session. A
> "local" session doesn't have the full functionality of a normal session.
> 2. Local sessions cannot create ephemeral nodes.
> 3. Once a local session is lost, you cannot re-establish it using the
> session-id/password. The session and its watches are gone for good.
> 4. When a local session connects, the session info is only maintained
> on the zookeeper server (in this case, an observer) that it is connected to.
> The leader is not aware of the creation of such a session and there is no
> state written to disk.
> 5. The pings and expiration are handled by the server that the session
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options being considered:
> 1. Let the client specify at connect time which kind of session they
> want.
> 2. All sessions connect as local sessions and automatically get promoted to
> global sessions when they do an operation that requires a global session
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I
> don't think that would work in our case, where we want to keep sessions which
> never create ephemeral nodes as always local. Option 2 would make it more
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a
> client flag, IsLocalSession (much like the current readOnly flag) that would
> be used to determine whether to create a local session or a global session.
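For reference, the local-session rules in the description (option 1, with the IsLocalSession flag) can be sketched roughly as below. This is a toy model under my own assumptions, not the proposed implementation: ObserverModel, its methods and the exception names are hypothetical, and the leader is represented only by a list of requests that would have been forwarded to it.

```python
import itertools


class SessionExpired(Exception):
    """Raised when a local session is unknown or already expired."""


class EphemeralNotAllowed(Exception):
    """Raised when a local session tries to create an ephemeral node."""


# Hypothetical model of one observer; not a ZooKeeper class.
class ObserverModel:
    def __init__(self):
        self.local_sessions = {}    # session id -> time of last ping
        self.leader_requests = []   # requests that would go to the leader
        self._ids = itertools.count(1)

    def connect(self, is_local_session, now):
        sid = next(self._ids)
        if is_local_session:
            # Kept in memory on this server only; the leader is not told.
            self.local_sessions[sid] = now
        else:
            self.leader_requests.append(("createSession", sid))
        return sid

    def ping_local(self, sid, now):
        # Pings for local sessions are handled here, not by the leader.
        if sid not in self.local_sessions:
            raise SessionExpired(sid)
        self.local_sessions[sid] = now

    def expire_local(self, now, timeout):
        # Expiration is also handled locally; an expired session is gone for good.
        expired = [s for s, t in self.local_sessions.items() if now - t > timeout]
        for s in expired:
            del self.local_sessions[s]
        return expired

    def create(self, sid, path, ephemeral=False):
        if ephemeral and sid in self.local_sessions:
            raise EphemeralNotAllowed(path)
        self.leader_requests.append(("create", sid, path, ephemeral))


obs = ObserverModel()
local_sid = obs.connect(is_local_session=True, now=0)
global_sid = obs.connect(is_local_session=False, now=0)
obs.create(global_sid, "/eph_1", ephemeral=True)
print(obs.leader_requests)
```

Only the global session generates leader traffic in this model, which is the point of the proposal: a million read-only local sessions would cost the core ensemble nothing for session create/drop.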