Hi All
Over the past few weeks we have implemented running-job performance
monitoring (JPM) for MR and Spark.
In the MR and Spark JPMs, we fetch the list of running jobs from the YARN
ResourceManager, get the job details from the REST APIs that YARN supports,
and save the parsed results to HBase. Later, we can use this job information
to analyze performance and generate alerts.
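
As a rough illustration of the first step (not the code in the pull request), the running job list can be read from the ResourceManager's `/ws/v1/cluster/apps` endpoint. The sketch below assumes the RM web address is passed in by the caller; the helper names and the fields kept from each app are hypothetical choices for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def running_apps_url(rm_base, app_types=("MAPREDUCE", "SPARK")):
    """Build the ResourceManager REST URL that lists running MR/Spark apps."""
    query = urlencode({
        "states": "RUNNING",
        "applicationTypes": ",".join(app_types),
    })
    return f"{rm_base.rstrip('/')}/ws/v1/cluster/apps?{query}"


def parse_running_apps(payload):
    """Extract a few fields per app from the RM response body.

    The RM returns {"apps": null} when nothing matches, so guard for that.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        {
            "id": app["id"],
            "type": app.get("applicationType"),
            "trackingUrl": app.get("trackingUrl"),
        }
        for app in apps
    ]


def fetch_running_apps(rm_base):
    """Fetch and parse the running app list (rm_base is an assumed RM address)."""
    with urlopen(running_apps_url(rm_base), timeout=10) as resp:
        return parse_running_apps(json.load(resp))
```

The `trackingUrl` of each app is where the per-job detail requests would then be sent; storing the parsed results in HBase is left out here.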
To implement this functionality in a streaming way, we built a Storm
topology for each JPM. Each topology has one spout that fetches the running
job list from YARN and recovers from ZooKeeper when restarted, and a set of
bolts that handle each job.
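
The spout's fetch-and-recover behaviour can be sketched roughly as follows. This is only an illustration under assumptions, not the topology code from the pull request: a plain dict stands in for ZooKeeper, the class name and the `running_jobs` key are hypothetical, and the assumption that the recovered state is the set of known running job IDs is ours.

```python
class RunningJobTracker:
    """Tracks which jobs are running and checkpoints that set to a store.

    `store` is any dict-like object; here it stands in for ZooKeeper
    (an assumption made for this sketch).
    """

    def __init__(self, store):
        self.store = store
        # On (re)start, recover the job IDs that were known before.
        self.known = set(store.get("running_jobs", []))

    def poll(self, current_ids):
        """Compare the latest running list against the known set.

        Returns (new_jobs, finished_jobs), each as a sorted list, and
        checkpoints the current set so a restart can pick up from here.
        """
        current = set(current_ids)
        new_jobs = sorted(current - self.known)
        finished_jobs = sorted(self.known - current)
        self.known = current
        self.store["running_jobs"] = sorted(current)
        return new_jobs, finished_jobs


# Usage: a restarted tracker sharing the same store resumes from the checkpoint.
store = {}
tracker = RunningJobTracker(store)
first = tracker.poll(["job_1", "job_2"])

restarted = RunningJobTracker(store)
second = restarted.poll(["job_2", "job_3"])
```

In a topology, the new job IDs would be emitted to the bolts, and the finished ones give the bolts a chance to do a final fetch of the job's details.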

For more details, please view
https://github.com/apache/incubator-eagle/pull/309

Thanks
Jinhu
