Hi all,

We have added support for MR and Spark history job monitoring (JPM) to
Apache Eagle. It is used to analyze the performance of history jobs and
generate alerts. For now, it only covers data ingestion.

For MR JPM, it reads the finished job log files from HDFS, parses the log
and configuration files, and saves the results to the backend storage,
which is currently HBase.
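
As a rough illustration of the configuration-parsing step, the sketch below
reads a Hadoop job configuration in the standard <configuration>/<property>
XML layout into a dictionary. It is not Eagle's parser: the function name
parse_job_conf and the sample values are made up for this example.

```python
import xml.etree.ElementTree as ET


def parse_job_conf(xml_text):
    """Return the name/value pairs of a Hadoop configuration XML as a dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is stored as an empty string.
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


# Hypothetical sample; a real job configuration holds many more properties.
sample = """<configuration>
  <property><name>mapreduce.job.name</name><value>word count</value></property>
  <property><name>mapreduce.job.reduces</name><value>4</value></property>
</configuration>"""

conf = parse_job_conf(sample)
```

A real ingestion job would read the XML from HDFS and write the resulting
pairs to the backend storage instead of keeping them in memory.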

For Spark JPM, it fetches the finished job IDs from the ResourceManager,
asks the Spark history server for the log file locations of those jobs,
parses the log files, and saves the results to the backend storage, which
is also HBase.
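
The two lookups above can be sketched with the public REST endpoints of the
YARN ResourceManager (/ws/v1/cluster/apps) and the Spark history server
(/api/v1/applications/<app-id>). This is only a sketch under assumptions:
the host names are placeholders, the helper names are invented, and how
Eagle turns the history server response into a log file location is not
shown here.

```python
from urllib.parse import urlencode


def finished_spark_apps_url(rm_base, finished_after_ms):
    """Build the ResourceManager query for finished Spark applications."""
    query = urlencode({
        "applicationTypes": "SPARK",
        "states": "FINISHED",
        "finishedTimeBegin": finished_after_ms,
    })
    return f"{rm_base}/ws/v1/cluster/apps?{query}"


def history_app_url(shs_base, app_id):
    """Build the Spark history server URL describing one application."""
    return f"{shs_base}/api/v1/applications/{app_id}"


def app_ids(rm_response):
    """Extract application IDs from a decoded ResourceManager JSON response."""
    # The ResourceManager returns {"apps": null} when nothing matches.
    apps = (rm_response.get("apps") or {}).get("app") or []
    return [app["id"] for app in apps]


# Placeholder hosts; 8088 and 18080 are the usual default ports.
rm_url = finished_spark_apps_url("http://rm.example.com:8088", 0)
ids = app_ids({"apps": {"app": [{"id": "application_1_0001"}]}})
shs_url = history_app_url("http://shs.example.com:18080", ids[0])
```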

To meet these requirements in a streaming way and achieve higher
availability, both MR and Spark JPM run as Storm topologies. The spout
reads the MR history log files or fetches the finished Spark job IDs from
the ResourceManager, and the bolts handle the remaining logic.
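
The division of work between spout and bolts can be pictured with the toy
pipeline below. It uses plain Python generators rather than the Storm API,
and every name in it (job_id_spout, parse_bolt, store_bolt, fake_parse) is
hypothetical; it only shows the data flow, not Eagle's implementation.

```python
def job_id_spout(finished_ids):
    """Spout stand-in: emit one tuple per finished job."""
    for job_id in finished_ids:
        yield {"job_id": job_id}


def parse_bolt(tuples, parse):
    """Bolt stand-in: attach parsed metrics to each incoming tuple."""
    for t in tuples:
        yield {**t, "metrics": parse(t["job_id"])}


def store_bolt(tuples, storage):
    """Bolt stand-in: persist each result, keyed by job ID."""
    for t in tuples:
        storage[t["job_id"]] = t["metrics"]


def fake_parse(job_id):
    """Placeholder for log parsing; returns a dummy metric."""
    return {"length": len(job_id)}


storage = {}  # stands in for the backend storage
store_bolt(parse_bolt(job_id_spout(["job_1", "job_22"]), fake_parse), storage)
```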

We will add features for history job performance analysis and alerting later.

For more details, please see
https://issues.apache.org/jira/browse/EAGLE-276

Thanks,
Jinhu Wu
