GitHub user egorklimov opened a pull request: https://github.com/apache/zeppelin/pull/3110
[ZEPPELIN-3671] Add info about running interpreters PIDs to API and JMX

### What is this PR for?
This is a continuation of [3102](https://github.com/apache/zeppelin/pull/3102).

It would be nice if we could get the PID of a running interpreter, and the group of paragraphs running under that interpreter, using the API and JMX. With this feature it will be easy to check CPU and memory load, etc.

This PR adds:
* an API method to get info about running interpreters (`/api/interpreter/running`);
* an API method to get info about running paragraphs grouped by interpreter (`/api/notebook/jobmanager/running`);
* a few JMX methods which do the same as the API;
* a template for simple analysis of running paragraphs using the API.

Part of the discussion from the previous PR:

> author: @zjffdu
> Thanks @egorklimov , this is an interesting feature.
> This assume all the interpreter process will generate pid file, but this assumption is not true. There's 2 exceptions at least for now. One is yarn-cluster mode of spark where the interpreter runs in remote node of yarn cluster. Another is running interpreter in container which is on our roadmap. Do you have any ideas of how to handle these 2 scenarios ?

> author: @egorklimov
> If I'm not mistaken according to `bin/interpreter.sh:234-238`:
> ```
> if [[ -z "${pid}" ]]; then
>   exit 1;
> else
>   echo ${pid} > ${ZEPPELIN_PID}
> fi
> ```
> Pid file generates every time (except case when interpreter process didn't start successfully). But in yarn scenario it will be hard to analyze CPU and memory load because resources will be consumed on each node, maybe someone will add one more template for that case.
>
> In case of container, I suppose we could generate other info in run folder.

### What type of PR is it?
Improvement

### What is the Jira issue?
Issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN-3671

### How should this be tested?
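For a quick manual check, the two endpoints named in this PR could be polled with a short script. The following is only a sketch and not part of the PR. It assumes a Zeppelin server at `http://localhost:8080` with anonymous access, and it assumes the new endpoints use the same `status` / `message` / `body` JSON envelope as other Zeppelin REST calls. The shape of `body` is not specified here, so the script just prints it. The helper names (`endpoint_url`, `unwrap_body`, `fetch_body`) are made up for illustration.

```python
import json
import urllib.request

# Paths taken from the PR description.
RUNNING_INTERPRETERS = "/api/interpreter/running"
RUNNING_PARAGRAPHS = "/api/notebook/jobmanager/running"


def endpoint_url(base_url, path):
    """Join the server base URL and an API path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def unwrap_body(payload):
    """Return the "body" field of a Zeppelin-style JSON envelope.

    Assumption: the new endpoints wrap their result as
    {"status": "OK", "message": "", "body": ...}.
    """
    if payload.get("status") != "OK":
        raise RuntimeError("unexpected status: %r" % payload.get("status"))
    return payload.get("body")


def fetch_body(base_url, path, timeout=10):
    """GET one endpoint and return its unwrapped body."""
    url = endpoint_url(base_url, path)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return unwrap_body(json.load(resp))


if __name__ == "__main__":
    base = "http://localhost:8080"  # assumed local Zeppelin address
    for path in (RUNNING_INTERPRETERS, RUNNING_PARAGRAPHS):
        print(path)
        print(json.dumps(fetch_body(base, path), indent=2))
```

Running it while at least one paragraph is executing should show whatever the server reports for each endpoint. If authentication is enabled, a login step would be needed first, which this sketch leaves out.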
* CI failed on [third test](https://travis-ci.org/TinkoffCreditSystems/zeppelin/jobs/411783078), because of `org.apache.zeppelin.rest.ZeppelinRestApiTest.testJobs()` with job time limit error;
* Tests added.

### Screenshots
Example of tables built on response data:
* Interpreters:
* Paragraphs:

Tree of processes associated with running:

The same data using JMX (viewed in jconsole):
* Running interpreters:
* Running paragraphs

### Questions:
* Does the licenses files need update? Yes, Common Public License Version 1.0 added
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes, info about API should be updated

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/TinkoffCreditSystems/zeppelin DW-17571

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/3110.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3110

----
commit ede27cf5ce9583b3240ad87ac999d8151d450e66
Author: egorklimov <klim.electronicmail@...>
Date:   2018-07-24T13:28:43Z

    MBean register fixed

commit b09982228b58ebbf9a3ce2ce16f4a6bd2da9622c
Author: egorklimov <klim.electronicmail@...>
Date:   2018-07-24T16:22:20Z

    Running statistics functions added
    Bugs fixed

commit adde2e3e62ff9da9fd72c450535caf464417f1c4
Author: Egor Klimov <klimovgeor@...>
Date:   2018-07-30T12:00:10Z

    Add templates

commit 7eb26f67de2ee7ad9a27e0e5d1a6adff5d11a7a4
Author: egorklimov <klim.electronicmail@...>
Date:   2018-07-30T15:54:15Z

    Tests added

----

---