[ https://issues.apache.org/jira/browse/LIVY-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeffrey(Xilang) Yan closed LIVY-698.
------------------------------------
    Resolution: Duplicate

> Cluster support for Livy
> ------------------------
>
>                 Key: LIVY-698
>                 URL: https://issues.apache.org/jira/browse/LIVY-698
>             Project: Livy
>          Issue Type: New Feature
>          Components: Server
>    Affects Versions: 0.6.0
>            Reporter: Jeffrey(Xilang) Yan
>            Priority: Major
>
> This is a proposal for Livy cluster support. It is designed to be a
> lightweight clustering solution that stays compatible with non-cluster Livy
> and supports different HA levels.
> Server ID: an integer configured by livy.server.id, with default value 0,
> which means standalone mode. The largest server ID is 213, for the reason
> described below.
> Session ID: a session ID is generated as a 3-digit server ID followed by a
> 7-digit auto-increment integer. Since the largest 32-bit signed integer is
> 2,147,483,647, the largest server ID is 213 and each server can have
> 9,999,999 sessions. The limitation is that each cluster can have at most 213
> instances, which should be more than enough. Standalone mode works the same
> way, since its server ID is 0.
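> A minimal sketch of this encoding in Scala (the object and method names are
> illustrative, not taken from the Livy codebase):
>
>     object SessionId {
>       val MaxServerId = 213             // 214 * 10^7 + 9,999,999 overflows Int
>       val SessionsPerServer = 10000000  // 7-digit auto-increment suffix
>
>       def encode(serverId: Int, localId: Int): Int = {
>         require(serverId >= 0 && serverId <= MaxServerId)
>         require(localId >= 0 && localId < SessionsPerServer)
>         serverId * SessionsPerServer + localId
>       }
>
>       // The owning server falls out of integer division; standalone mode
>       // (server ID 0) keeps plain 7-digit IDs, so it stays compatible.
>       def serverOf(sessionId: Int): Int = sessionId / SessionsPerServer
>     }
>
>     // SessionId.encode(101, 1)          == 1010000001
>     // SessionId.serverOf(1010000001)    == 101
>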
> Zookeeper: a ZooKeeper quorum is required, configured by
> livy.server.zookeeper.quorum.
> Server Registration: each server registers itself under the ZooKeeper path
> /livy/server/
> Leader: one Livy server is elected leader of the cluster via the ZooKeeper
> path /livy/leader/
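> A hedged sketch of registration and election with Apache Curator (which
> Livy's existing ZooKeeper state store already uses); the paths follow this
> proposal, the concrete values are made up:
>
>     import org.apache.curator.framework.CuratorFrameworkFactory
>     import org.apache.curator.framework.recipes.leader.LeaderLatch
>     import org.apache.curator.retry.ExponentialBackoffRetry
>     import org.apache.zookeeper.CreateMode
>
>     val quorum   = "zk1:2181,zk2:2181,zk3:2181" // livy.server.zookeeper.quorum
>     val serverId = 101                          // livy.server.id
>
>     val client = CuratorFrameworkFactory.newClient(quorum,
>       new ExponentialBackoffRetry(1000, 3))
>     client.start()
>
>     // Ephemeral node: the registration disappears automatically if the
>     // server dies, which is what lets the leader detect failures later.
>     client.create().creatingParentsIfNeeded()
>       .withMode(CreateMode.EPHEMERAL)
>       .forPath(s"/livy/server/$serverId", "serverB:8999".getBytes("UTF-8"))
>
>     // Exactly one server holds the latch and acts as cluster leader.
>     val leaderLatch = new LeaderLatch(client, "/livy/leader")
>     leaderLatch.start()
>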
> Coordination between servers: servers don't talk to each other directly. A
> server just detects which Livy server a session lives on and sends an HTTP
> 307 redirect to the correct server. For example, if server A receives a
> request for http://serverA:8999/sessions/1010000001 and knows the session is
> on server B, it sends a 307 redirect to
> http://serverB:8999/sessions/1010000001.
>  
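> The redirect decision itself can stay tiny. A sketch, assuming a servers map
> that stands in for whatever was read from /livy/server/ (names are
> hypothetical):
>
>     def redirectTarget(sessionId: Int, selfId: Int,
>                        servers: Map[Int, String]): Option[String] = {
>       val owner = sessionId / 10000000   // leading 3 digits = owning server
>       if (owner == selfId) None          // session is local, serve it here
>       else servers.get(owner).map(a => s"http://$a/sessions/$sessionId")
>     }
>
>     // redirectTarget(1010000001, selfId = 100, Map(101 -> "serverB:8999"))
>     //   returns Some("http://serverB:8999/sessions/1010000001"),
>     //   which goes out as the Location of an HTTP 307 response.
>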
> Session HA: considering the server-failure case, users should be able to
> decide whether to keep the sessions of a failed server. This leads to two
> different modes:
> Non session HA mode:
> In this mode, sessions are lost when a server fails (though this can still
> work with the session store: sessions are recovered when the server comes
> back).
> request redirect: a server determines a session's correct server simply by
> taking the first 3 digits of the session ID, as in the sketch above.
> Session HA mode:
> In this mode, session information is persisted to the ZK store (reusing the
> current session-store code) and recovered on another server when a server
> fails.
> session registration: all sessions are registered under the zk path
> /livy/session/type/
> request redirect: each server determines a session's correct server by
> reading zk and then sends a 307 redirect; a server may cache the
> session-server pair for a while.
> server failure: the cluster leader should detect server failures and
> reassign the sessions of the failed server to other servers; those servers
> then recover the sessions by reading the session information in ZK (see the
> sketch after this list).
> server notification: servers need to send messages to each other, for
> example the leader sends a command asking a server to recover a session. All
> such messages are sent through zk, under the path /livy/notification/
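> A sketch of the failure-detection half, reusing the client and leaderLatch
> from the registration sketch above; reassignSessions is a hypothetical stub
> for the leader's reassignment step. Because the /livy/server/ entries are
> ephemeral, a CHILD_REMOVED event means a server has died:
>
>     import org.apache.curator.framework.recipes.cache.{
>       PathChildrenCache, PathChildrenCacheEvent}
>
>     def reassignSessions(deadServer: String): Unit = {
>       // hypothetical: write recovery commands under /livy/notification/
>       // so surviving servers pick up the dead server's sessions from ZK
>     }
>
>     val watcher = new PathChildrenCache(client, "/livy/server", true)
>     watcher.getListenable.addListener { (_, event) =>
>       if (leaderLatch.hasLeadership &&
>           event.getType == PathChildrenCacheEvent.Type.CHILD_REMOVED) {
>         reassignSessions(event.getData.getPath.stripPrefix("/livy/server/"))
>       }
>     }
>     watcher.start()
>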
>  
> Thrift Server: a Thrift session has a Thrift session ID, and a Thrift
> session can be restored with the same session ID; however, the current
> hive-jdbc doesn't support restoring sessions. So for the Thrift server, just
> register the server to ZK to implement the same server discovery that Hive
> has. Later, when hive-jdbc supports restoring sessions, the Thrift server
> can use the existing Livy cluster solution to adapt to hive-jdbc.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
