[ https://issues.apache.org/jira/browse/LIVY-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014846#comment-17014846 ]
Saisai Shao commented on LIVY-718:
----------------------------------

Active-active HA addresses not only scalability but also high availability. Personally, I don't think active-standby HA is very useful for Livy. Systems usually settle for active-standby because the master node has a large amount of state to maintain, which makes it hard to implement active-active HA with consistency. If that is not the case, then active-active HA is better for both availability and scalability.

> Support multi-active high availability in Livy
> ----------------------------------------------
>
>                 Key: LIVY-718
>                 URL: https://issues.apache.org/jira/browse/LIVY-718
>             Project: Livy
>          Issue Type: Epic
>          Components: RSC, Server
>            Reporter: Yiheng Wang
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In this JIRA we want to discuss how to implement multi-active high
> availability in Livy.
> Currently, Livy only supports single-node recovery. This is not sufficient in
> some production environments. In our scenario, the Livy server serves many
> notebook and JDBC services. We want to make the Livy service more
> fault-tolerant and scalable.
> There are already some proposals in the community for high availability, but
> they are either incomplete or cover only active-standby high availability. So
> we propose a multi-active high availability design to achieve the following
> goals:
> # One or more servers will serve the client requests at the same time.
> # Sessions are allocated among different servers.
> # When one node crashes, the affected sessions will be moved to other active
> services.
> Here's our design document, please review and comment:
> https://docs.google.com/document/d/1bD3qYZpw14_NuCcSGUOfqQ0pqvSbCQsOLFuZp26Ohjc/edit?usp=sharing
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)