Lee moon soo created ZEPPELIN-3994:
--------------------------------------

             Summary: Notebook serving
                 Key: ZEPPELIN-3994
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3994
             Project: Zeppelin
          Issue Type: New Feature
            Reporter: Lee moon soo


h2. Motivation

Notebooks are useful for interactive analysis, but bringing a model from a Note 
into production is a separate challenge. We often see two approaches:

a. Call paragraphs through the ZeppelinServer REST API.
b. Reimplement the model outside the notebook and deploy it without using Zeppelin.

Approach a) causes headaches because ZeppelinServer is not fault tolerant and 
can be restarted at any moment, including intentionally (for example, to change 
configuration). Also, a notebook that is run through the REST API can be changed 
at any time without notice.

Approach b) has a clear downside: re-implementation.

So it would be great if a Note could be deployed independently of ZeppelinServer, 
with a REST API endpoint and high availability support. That would make the 
following use case easy and reliable for production use:
 # Create a model (or any function) in Zeppelin
 # Click the deploy button in a Note
 # Access the model (or any function) using the REST API.

h2. Requirements
 * A Note can be deployed through the GUI in a single click (a 'Deploy' button 
on every Note).
 * Once a Note is deployed, it runs independently of ZeppelinServer. A deployed 
Note should keep running even if ZeppelinServer is restarting or stopped.
 * ZeppelinServer provides a GUI to manage deployments.
 * A deployed Note should be highly available.
 * Runs on Kubernetes

h2. Design
h3. Deploy button

Each Note has a 'Deploy' button.
h3. On deploy button click

On 'Deploy' button click, the following happens:
 * ZeppelinServer creates a new Pod.
 ** The Pod takes a snapshot of the Notebook directory and mounts it.
 ** The Pod runs another ZeppelinServer.
 ** The current login session information is transferred to the new ZeppelinServer.
 * The ZeppelinServer in the new Pod runs the paragraphs in the Note.
 ** Depending on the Note, it may run multiple interpreters.
 ** Interpreters run as Kubernetes Deployment resources instead of Pod 
resources.
 * Once all paragraphs have run successfully, the new ZeppelinServer Pod and all 
Interpreter Deployment resources are deleted, except for the Deployment resource 
of the Note's default interpreter.

For example, when a Note has Markdown and Python paragraphs, with Python as the 
Note's default interpreter, ZeppelinServer and the Python and Markdown 
interpreters are created after the 'Deploy' button click. ZeppelinServer and the 
Markdown interpreter are terminated after a successful Note run, and the Python 
interpreter remains for serving.
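
The lifecycle in this example can be sketched as a small planning function. This is only an illustration of the proposed behavior, not Zeppelin or Kubernetes code: the names `DeployPlan`, `plan_deployment`, and the resource labels such as `pod/zeppelin-server` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DeployPlan:
    """Resources created for one deployed Note (labels are illustrative only)."""
    created: list = field(default_factory=list)
    kept: list = field(default_factory=list)
    deleted: list = field(default_factory=list)


def plan_deployment(paragraph_interpreters, default_interpreter):
    """Return what is created on 'Deploy' and what remains after a successful run."""
    # One Deployment per distinct interpreter used by the Note, in first-use order.
    interpreters = list(dict.fromkeys(paragraph_interpreters))
    if default_interpreter not in interpreters:
        interpreters.append(default_interpreter)

    plan = DeployPlan()
    plan.created.append("pod/zeppelin-server")
    plan.created += [f"deployment/{name}" for name in interpreters]

    # After all paragraphs succeed, only the default interpreter keeps serving.
    for resource in plan.created:
        if resource == f"deployment/{default_interpreter}":
            plan.kept.append(resource)
        else:
            plan.deleted.append(resource)
    return plan


plan = plan_deployment(["md", "python", "python"], default_interpreter="python")
print("kept:", plan.kept)
print("deleted:", plan.deleted)
```

For the Markdown plus Python Note above, the plan keeps only the Python Deployment and deletes the temporary ZeppelinServer Pod and the Markdown Deployment.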

h3. Register model (or any function) to be served

Each interpreter has a 
[ResourcePool|https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/ResourcePool.java].
 The ResourcePool is programmatically accessible to the end user. For example, a 
user can register an arbitrary object in the ResourcePool using the API.
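{code:java}
%spark
// Build or load the model (elided here), then register it in the
// interpreter's ResourcePool under the name "model_1".
val myModel = ....
z.put("model_1", myModel)
{code}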
h3. Access model (or any function) from Rest api Endpoint

Every object registered in the ResourcePool is wrapped by a 
[Resource|https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/Resource.java#L151].
 Resource provides method invocation through 
[Resource.invokeMethod()|https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/Resource.java#L151].

RemoteInterpreterServer provides a REST API endpoint for calling a method of a 
Resource in its ResourcePool, for example:
{code:java}
<interpreteraddress>/<resource name>/<method name>{code}
Method parameters are passed as POST parameters or through a similar mechanism.
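
To make the endpoint idea concrete, here is a minimal sketch of how a request path could be mapped to a method call on a registered object. It is a toy model under stated assumptions, not the RemoteInterpreterServer implementation: the `ResourcePool` class below is a plain dictionary wrapper, `handle_request` and `LinearModel` are hypothetical names, and the JSON body shape `{"args": [...]}` is one possible choice since the parameter format is still open.

```python
import json


class ResourcePool:
    """Toy stand-in for Zeppelin's ResourcePool: a name -> object mapping."""

    def __init__(self):
        self._resources = {}

    def put(self, name, obj):
        self._resources[name] = obj

    def get(self, name):
        return self._resources[name]


def handle_request(pool, path, body):
    """Dispatch '/<resource name>/<method name>' with a JSON body {"args": [...]}."""
    parts = path.strip("/").split("/")
    if len(parts) != 2:
        return 404, json.dumps({"error": "expected /<resource name>/<method name>"})
    resource_name, method_name = parts
    try:
        resource = pool.get(resource_name)
    except KeyError:
        return 404, json.dumps({"error": f"unknown resource: {resource_name}"})
    method = getattr(resource, method_name, None)
    # Refuse private attributes and anything that is not callable.
    if method_name.startswith("_") or not callable(method):
        return 404, json.dumps({"error": f"unknown method: {method_name}"})
    args = json.loads(body or "{}").get("args", [])
    return 200, json.dumps({"result": method(*args)})


class LinearModel:
    """Hypothetical model used only for this example."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, x):
        return self.slope * x + self.intercept


pool = ResourcePool()
pool.put("model_1", LinearModel(slope=2, intercept=1))
status, payload = handle_request(pool, "/model_1/predict", '{"args": [3]}')
print(status, payload)
```

Looking the method up by name mirrors what Resource.invokeMethod() does on the Java side. A real endpoint would also need authentication and argument type handling, which this sketch leaves out.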

 



