GitHub user egorklimov opened a pull request:

    https://github.com/apache/zeppelin/pull/3110

    [ZEPPELIN-3671] Add info about running interpreters PIDs to API and JMX

     
    ### What is this PR for?
    This is a continuation of 
[3102](https://github.com/apache/zeppelin/pull/3102)
    
    It would be useful to get the PID of each running interpreter, and the 
group of paragraphs running under that interpreter, through the API and JMX.
    
    This makes it easy to check, for example, the CPU and memory load of each 
interpreter process.
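
As a rough illustration of the kind of check a PID enables, the sketch below asks the standard `ps` utility for the CPU share and resident memory of one process. It is not part of this PR; the helper name `process_load` is hypothetical, and it assumes a POSIX system where `ps` supports `-o` and `-p`.

```python
import os
import subprocess


def process_load(pid):
    """Return (cpu_percent, rss_kib) for pid, or None if it cannot be inspected.

    Hypothetical helper, not Zeppelin code. Assumes a POSIX `ps`.
    """
    try:
        result = subprocess.run(
            # Empty headers ("%cpu=", "rss=") suppress the header line.
            ["ps", "-o", "%cpu=", "-o", "rss=", "-p", str(pid)],
            capture_output=True,
            text=True,
            check=False,
            # Force the C locale so the decimal separator is always a dot.
            env={**os.environ, "LC_ALL": "C"},
        )
    except FileNotFoundError:
        # `ps` is not installed on this machine.
        return None
    fields = result.stdout.split()
    if result.returncode != 0 or len(fields) != 2:
        # No such process, or output in an unexpected shape.
        return None
    return float(fields[0]), int(fields[1])


if __name__ == "__main__":
    # Inspect the current Python process as a stand-in for an interpreter PID.
    print(process_load(os.getpid()))
```

In practice the PID would come from the API or JMX data this PR exposes rather than from `os.getpid()`.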
    
    This PR adds:
    * API method to get info about running interpreters 
(`/api/interpreter/running`);
    * API method to get info about running paragraphs grouped by interpreters 
(`/api/notebook/jobmanager/running`);
    * A few JMX methods that expose the same data as the API;
    * A template for simple analysis of running paragraphs using the API.
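
A minimal client sketch for the two endpoints listed above. The endpoint paths come from this description; the base URL `http://localhost:8080`, the assumption that the response body is JSON, and the helper names `endpoint_url` and `fetch_json` are illustrative only. The response schema is not described here, so the sketch returns the parsed body without interpreting it.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Endpoint paths as listed in this PR description.
RUNNING_INTERPRETERS = "api/interpreter/running"
RUNNING_PARAGRAPHS = "api/notebook/jobmanager/running"


def endpoint_url(base_url, path):
    """Join the server base URL and an endpoint path (hypothetical helper)."""
    return urljoin(base_url.rstrip("/") + "/", path)


def fetch_json(base_url, path, timeout=5.0):
    """GET an endpoint and parse the body as JSON (assumed response format)."""
    with urlopen(endpoint_url(base_url, path), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed address of a locally running Zeppelin server.
    base = "http://localhost:8080"
    print(fetch_json(base, RUNNING_INTERPRETERS))
    print(fetch_json(base, RUNNING_PARAGRAPHS))
```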
    
    Part of discussion from previous PR:
    > author: @zjffdu 
    > Thanks @egorklimov , this is an interesting feature.
    > This assumes that every interpreter process generates a pid file, but 
that assumption does not hold. There are at least 2 exceptions for now. One is 
the yarn-cluster mode of Spark, where the interpreter runs on a remote node of 
the YARN cluster. The other is running the interpreter in a container, which is 
on our roadmap. Do you have any ideas on how to handle these 2 scenarios?
    
    > author: @egorklimov
    > If I'm not mistaken, according to `bin/interpreter.sh:234-238`:
    > ```
    > if [[ -z "${pid}" ]]; then
    >   exit 1;
    > else
    >   echo ${pid} > ${ZEPPELIN_PID}
    > fi
    > ``` 
    > The pid file is generated every time (except when the interpreter process 
fails to start). In the YARN scenario, however, it will be hard to analyze CPU 
and memory load because resources are consumed on each node; maybe someone 
will add another template for that case.
    > 
    > In the container case, I suppose we could generate other info in the run 
folder.
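
The pid-file mechanism discussed above can be sketched from the consumer side: read the PID that the launcher script wrote, then check whether that process still exists. This is only an illustration under stated assumptions. The file name is made up, the helper names `read_pid` and `is_running` are hypothetical, and the liveness check relies on POSIX signal 0 semantics, so it is not suitable for Windows.

```python
import os
from pathlib import Path


def read_pid(pid_file):
    """Read the integer PID stored in a pid file (hypothetical helper)."""
    return int(Path(pid_file).read_text().strip())


def is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        # Signal 0 performs the existence and permission checks
        # without delivering a signal.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


if __name__ == "__main__":
    # Made-up file name; write our own PID so the demo is self-contained.
    demo = Path("example-interpreter.pid")
    demo.write_text(f"{os.getpid()}\n")
    print(read_pid(demo), is_running(read_pid(demo)))
    demo.unlink()
```

As noted in the discussion, this only covers interpreters running on the same host as the pid file; it says nothing about the yarn-cluster or container cases.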
    
    ### What type of PR is it?
    Improvement
    
    ### What is the Jira issue?
    Issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN-3671
    
    ### How should this be tested?
    * CI failed on the 
[third test](https://travis-ci.org/TinkoffCreditSystems/zeppelin/jobs/411783078) 
because `org.apache.zeppelin.rest.ZeppelinRestApiTest.testJobs()` failed with a 
job time limit error;
    * Tests added.
    
    ### Screenshots
    Example of tables built on response data:
    * Interpreters:
    
![intp-table-example](https://user-images.githubusercontent.com/6136993/43404050-3e123f3c-941f-11e8-9223-056cd0d96adf.png)
    * Paragraphs:
    
![running-paragraphs-info](https://user-images.githubusercontent.com/6136993/43404049-3df6ee58-941f-11e8-9e7e-d60a1859a7de.png)
    
    Tree of processes associated with running:
    
![tree](https://user-images.githubusercontent.com/6136993/43412312-86ec9ce6-9435-11e8-8a1c-8f9d8bef4f94.png)
    
    The same data using JMX (viewed in jconsole):
    
    * Running interpreters:
    
![jconsole2](https://user-images.githubusercontent.com/6136993/43408918-2a31b306-942b-11e8-9a51-f2cf78f52a8c.png)
    * Running paragraphs:
    
![jconsole-example](https://user-images.githubusercontent.com/6136993/43408919-2a550478-942b-11e8-854d-7281615f1b17.png)
    
    
    ### Questions:
    * Do the license files need an update? Yes, Common Public License Version 
1.0 added
    * Are there breaking changes for older versions? No
    * Does this need documentation? Yes, info about the API should be updated


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/TinkoffCreditSystems/zeppelin DW-17571

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/3110.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3110
    
----
commit ede27cf5ce9583b3240ad87ac999d8151d450e66
Author: egorklimov <klim.electronicmail@...>
Date:   2018-07-24T13:28:43Z

    MBean register fixed

commit b09982228b58ebbf9a3ce2ce16f4a6bd2da9622c
Author: egorklimov <klim.electronicmail@...>
Date:   2018-07-24T16:22:20Z

    Running statistics functions added
    
    Bugs fixed

commit adde2e3e62ff9da9fd72c450535caf464417f1c4
Author: Egor Klimov <klimovgeor@...>
Date:   2018-07-30T12:00:10Z

    Add templates

commit 7eb26f67de2ee7ad9a27e0e5d1a6adff5d11a7a4
Author: egorklimov <klim.electronicmail@...>
Date:   2018-07-30T15:54:15Z

    Tests added

----

