Yiheng Wang created LIVY-718:
--------------------------------

             Summary: Support multi-active high availability in Livy
                 Key: LIVY-718
                 URL: https://issues.apache.org/jira/browse/LIVY-718
             Project: Livy
          Issue Type: Epic
          Components: RSC, Server
            Reporter: Yiheng Wang


In this JIRA we want to discuss how to implement multi-active high availability 
in Livy.

Currently, Livy only supports single-node recovery. This is not sufficient in 
some production environments. In our scenario, the Livy server serves many 
notebook and JDBC services. We want to make the Livy service more 
fault-tolerant and scalable.

There are already some proposals in the community for high availability, but 
they are either incomplete or cover only active-standby high availability. So 
we propose a multi-active high availability design to achieve the following 
goals:
# One or more servers serve client requests at the same time.
# Sessions are allocated among the different servers.
# When one server crashes, the affected sessions are moved to the other active 
servers.
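
As a rough illustration of goals 2 and 3, the sketch below spreads session IDs
across servers with rendezvous hashing, so that when one server disappears only
its sessions are reassigned. This is not taken from the design document, which
may use a different scheme; the server names and the owner/allocate helpers are
hypothetical and are not part of Livy.

```python
import hashlib


def owner(session_id, servers):
    """Return the server that owns a session (rendezvous hashing).

    Hypothetical helper: each (server, session) pair gets a hash score and
    the highest-scoring active server wins.
    """
    if not servers:
        raise ValueError("no active servers")

    def score(server):
        digest = hashlib.sha256(f"{server}:{session_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(servers, key=score)


def allocate(session_ids, servers):
    """Map every session ID to its owning server."""
    return {sid: owner(sid, servers) for sid in session_ids}


# Assumed server names, for illustration only.
servers = ["livy-1", "livy-2", "livy-3"]
before = allocate(range(100), servers)

# Simulate a crash of livy-2: only its sessions should change owner.
survivors = [s for s in servers if s != "livy-2"]
after = allocate(range(100), survivors)
moved = [sid for sid in before if before[sid] != after[sid]]
```

In this sketch, every ID in moved was previously owned by livy-2; sessions on
the surviving servers keep their owner, because removing a server never changes
which of the remaining servers scores highest for a given session.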

Here's our design document. Please review and comment:
https://docs.google.com/document/d/1bD3qYZpw14_NuCcSGUOfqQ0pqvSbCQsOLFuZp26Ohjc/edit?usp=sharing
 




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
