[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554488#comment-13554488
 ] 

Thawan Kooburat commented on ZOOKEEPER-1147:
--------------------------------------------

Session creation flow:
The flow for local session creation is similar to the global session flow. As 
part of processing a ConnectRequest, a new sessionId is generated and added to 
the local session tracker right away. The local create session request is then 
sent down the request pipeline. Because it is marked as a local session 
request, it is not sent to the leader and is only processed locally. 
As with a global session, when the request reaches the final request 
processor, the session is added to the session tracker and a ConnectResponse 
is sent back to the client to finish session initialization. 
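
As a rough illustration of the creation flow above, here is a minimal sketch. It is not the code from the patch: LocalSessionSketch, Request, and all method and field names are hypothetical, and the request pipeline is reduced to a single method call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of local session creation (not ZooKeeper's real classes).
class LocalSessionSketch {
    /** A create session request; the localSession flag keeps it off the leader. */
    static final class Request {
        final long sessionId;
        final int timeoutMs;
        final boolean localSession;

        Request(long sessionId, int timeoutMs, boolean localSession) {
            this.sessionId = sessionId;
            this.timeoutMs = timeoutMs;
            this.localSession = localSession;
        }
    }

    private final AtomicLong nextSessionId = new AtomicLong(1);
    final Map<Long, Integer> localSessionTracker = new ConcurrentHashMap<>(); // id -> timeout
    final List<Request> forwardedToLeader = new ArrayList<>();
    final List<Long> connectResponses = new ArrayList<>();

    /** Step 1: on ConnectRequest, generate the id and track the session right away. */
    Request processConnectRequest(int timeoutMs) {
        long sessionId = nextSessionId.getAndIncrement();
        localSessionTracker.put(sessionId, timeoutMs);
        return new Request(sessionId, timeoutMs, true);
    }

    /** Step 2: the pipeline. Requests marked local never go to the leader. */
    void submit(Request createSession) {
        if (!createSession.localSession) {
            forwardedToLeader.add(createSession); // global path, not modeled further
            return;
        }
        finalRequestProcessor(createSession);
    }

    /** Step 3: add the session (a no-op if already tracked) and answer the client. */
    private void finalRequestProcessor(Request createSession) {
        localSessionTracker.putIfAbsent(createSession.sessionId, createSession.timeoutMs);
        connectResponses.add(createSession.sessionId);
    }
}
```

In this sketch the second add in the final processor is a no-op, which matches the behaviour described further down for the current implementation.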

Session Upgrading:
The Follower/Leader/Observer RequestProcessor intercepts a client request 
before it enters the pipeline. If it is a request to create an ephemeral node, 
the processor generates a create session request and pushes it into the 
pipeline ahead of the create node request, without waiting for the create 
session request to complete. The session is considered a global session right 
away on that machine, so a subsequent create ephemeral node request will not 
trigger the upgrade sequence. 
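
The interception step above might look roughly like the sketch below. All names (SessionUpgradeSketch, Request, intercept) are hypothetical and the pipeline is modeled as a plain queue. Removing the session from the local set on upgrade is an assumption, based on the later remark that the session is already removed from the local session tracker.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of the upgrade interception; names are illustrative only.
class SessionUpgradeSketch {
    static final class Request {
        final long sessionId;
        final String type;       // e.g. "create", "createSession", "getData"
        final boolean ephemeral; // only meaningful for "create"

        Request(long sessionId, String type, boolean ephemeral) {
            this.sessionId = sessionId;
            this.type = type;
            this.ephemeral = ephemeral;
        }
    }

    final Set<Long> localSessions = new HashSet<>();
    final Set<Long> globalSessions = new HashSet<>();
    final Queue<Request> pipeline = new ArrayDeque<>();

    /** Called for every client request before it enters the pipeline. */
    void intercept(Request request) {
        boolean needsUpgrade = request.type.equals("create")
                && request.ephemeral
                && localSessions.contains(request.sessionId);
        if (needsUpgrade) {
            // Treat the session as global on this server immediately (assumption:
            // it also leaves the local set), so later requests skip this branch.
            localSessions.remove(request.sessionId);
            globalSessions.add(request.sessionId);
            // Queue the create session request ahead of the create node request,
            // without waiting for it to complete.
            pipeline.add(new Request(request.sessionId, "createSession", false));
        }
        pipeline.add(request);
    }
}
```

Two consecutive ephemeral creates on the same local session put three requests in the queue here: one createSession followed by the two creates.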

Session Validation:
When a client tries to reconnect, if the server finds the session in its local 
session tracker, the client reconnects right away. Otherwise, the 
follower/observer has to send a revalidate packet to the leader to 
validate the session against the global sessions. 
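
The reconnect decision can be sketched as below. This is only an illustration: SessionValidationSketch and Result are hypothetical names, and the revalidate round trip to the leader is replaced by a simple predicate.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongPredicate;

// Hypothetical model of reconnect handling on a follower/observer.
class SessionValidationSketch {
    enum Result { RECONNECTED_LOCALLY, RECONNECTED_AFTER_REVALIDATE, SESSION_EXPIRED }

    final Set<Long> localSessionTracker = new HashSet<>();
    // Stands in for sending a revalidate packet to the leader and reading the answer.
    private final LongPredicate leaderKnowsSession;

    SessionValidationSketch(LongPredicate leaderKnowsSession) {
        this.leaderKnowsSession = leaderKnowsSession;
    }

    Result reconnect(long sessionId) {
        if (localSessionTracker.contains(sessionId)) {
            return Result.RECONNECTED_LOCALLY; // no leader involvement
        }
        return leaderKnowsSession.test(sessionId)
                ? Result.RECONNECTED_AFTER_REVALIDATE
                : Result.SESSION_EXPIRED;
    }
}
```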

Answers to specific questions:
- When is the session created?
In the current implementation, the server tries to create a local session both 
when processing the ConnectRequest and when the createSession request reaches 
FinalRequestProcessor. The latter probably results in a no-op, but I am not 
sure about the reason behind this design. 

- What happens if the create for session is sent at server A and the client 
disconnects to some other server B which ends up sending it again and then 
disconnects and connects back to server A.
When the client reconnects to B, its sessionId does not exist in B's local 
session tracker, so B sends a validation packet. If the CreateSession issued 
by A is committed before the validation packet arrives, the client is able to 
connect. Otherwise, the client gets session expired because the quorum does 
not yet know about this session. 
If the client then tries to connect back to A, the session has already been 
removed from A's local session tracker, so A needs to send a validation packet 
to the leader. The outcome should be the same as for B, depending on the 
timing of the request. 
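
To make the timing dependency concrete, here is a small sketch of the leader side. UpgradeRaceSketch and its methods are hypothetical, and the quorum commit is reduced to moving an id from one set to another.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the race between a pending CreateSession and a validation packet.
class UpgradeRaceSketch {
    private final Set<Long> pending = new HashSet<>();   // proposed, not yet committed
    private final Set<Long> committed = new HashSet<>(); // known to the quorum

    /** Server A forwards the CreateSession generated by the upgrade. */
    void proposeCreateSession(long sessionId) {
        pending.add(sessionId);
    }

    /** The quorum commits the CreateSession. */
    void commitCreateSession(long sessionId) {
        if (pending.remove(sessionId)) {
            committed.add(sessionId);
        }
    }

    /** Server B (or A) asks the leader to validate; only committed sessions pass. */
    boolean revalidate(long sessionId) {
        return committed.contains(sessionId);
    }

    public static void main(String[] args) {
        UpgradeRaceSketch leader = new UpgradeRaceSketch();
        leader.proposeCreateSession(42L);
        System.out.println(leader.revalidate(42L)); // false: client would see session expired
        leader.commitCreateSession(42L);
        System.out.println(leader.revalidate(42L)); // true: client can connect
    }
}
```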

Other notes:
From our experience so far, applications that use local sessions see a higher 
rate of session expiration. This is because automatic session failover is no 
longer available, so the application needs to handle this gracefully and 
recreate the ZooKeeper handle. 

If the application does not want to do that, it can simply create an ephemeral 
node in order to upgrade to a global session and get automatic session 
failover as before. 
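
For the first option (handling expiration and recreating the handle), the application logic could be structured roughly as below. This sketch deliberately avoids the real client library so that it stays self-contained: Handle, SessionExpiredException, and ExpiryRecoverySketch are hypothetical stand-ins. In a real application the handle would be an org.apache.zookeeper.ZooKeeper instance, and expiration is reported through the Watcher.Event.KeeperState.Expired state.

```java
import java.util.function.Supplier;

// Hypothetical sketch: recreate the client handle when the session expires.
class ExpiryRecoverySketch {
    /** Stand-in for a client handle; not the real ZooKeeper class. */
    interface Handle {
        String read(String path);
        void close();
    }

    /** Stand-in for the session expired signal. */
    static final class SessionExpiredException extends RuntimeException {
    }

    private final Supplier<Handle> handleFactory;
    private Handle handle;
    int recreations = 0;

    ExpiryRecoverySketch(Supplier<Handle> handleFactory) {
        this.handleFactory = handleFactory;
        this.handle = handleFactory.get();
    }

    /** Reads a path, replacing the handle each time the session turns out to be expired. */
    String readWithRecovery(String path, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SessionExpiredException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return handle.read(path);
            } catch (SessionExpiredException e) {
                last = e;
                handle.close();               // the old session is gone for good
                handle = handleFactory.get(); // new handle means a new session
                recreations++;
            }
        }
        throw last;
    }
}
```

Watches registered on the old session are lost with it, so a real application would also need to set them again after creating the new handle.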
 

                
> Add support for local sessions
> ------------------------------
>
>                 Key: ZOOKEEPER-1147
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Vishal Kathuria
>            Assignee: Thawan Kooburat
>              Labels: api-change, scaling
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1147.patch
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about a 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. Majority of these clients are 
> read only - ie they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.       The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have a full functionality of a normal session.
> 2.       Local sessions cannot create ephemeral nodes.
> 3.       Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.       When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.       The pings and expiration is handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at the connect time which kind of session do they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
