Our scope is broader than job analysis. We will cover some of the features supported by dr-elephant, but we are not limited to them.

2016-06-07 20:25 GMT+08:00 Liangfei.Su <[email protected]>:

> Great contribution and a lot of code.
>
> btw: is this feature comparable to
> https://github.com/linkedin/dr-elephant?
>
> On Tue, Jun 7, 2016 at 4:58 PM, 吴金虎 <[email protected]> wrote:
>
> > Hi, all
> >
> > We have added support for MR & Spark history job monitoring (JPM) to
> > Apache Eagle. It is used to analyze the performance of history jobs
> > and generate alerts. For now, it only contains data ingestion.
> >
> > For MR JPM, it reads the finished job log files from HDFS, parses the
> > log and configuration files, and saves the results to the backend
> > storage. We use HBase now.
> >
> > For Spark JPM, it fetches the finished job ids from the Resource
> > Manager, asks the Spark history server for the log file locations of
> > those job ids, parses the log files, and saves the results to the
> > backend storage, which is also HBase.
> >
> > To meet these requirements in a streaming way and achieve higher
> > availability, both MR and Spark JPM use a Storm topology. The spout
> > reads MR history file logs or fetches Spark finished job ids from the
> > Resource Manager, and the bolts handle the remaining logic.
> >
> > We will add features for history job performance and alerts later.
> >
> > For more details, please see
> > https://issues.apache.org/jira/browse/EAGLE-276
> >
> > Thanks,
> > Jinhu Wu
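
For readers new to the thread, here is a rough sketch of the spout-and-bolt flow described above for Spark JPM. It is plain Python, not Storm and not the Eagle code: the job ids, log lines, path layout and function names are all invented for illustration, and an in-memory dict stands in for HBase.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-ins. The real flow talks to the Resource Manager,
# the Spark history server and HBase; none of that is modeled here.
FINISHED_JOBS: List[str] = ["app_0001", "app_0002"]
LOG_LINES: Dict[str, List[str]] = {
    "app_0001": ["stage=1 durationMs=1200", "stage=2 durationMs=800"],
    "app_0002": ["stage=1 durationMs=500"],
}


def spout() -> Iterator[str]:
    """Spout role: emit finished job ids one at a time."""
    yield from FINISHED_JOBS


def locate_log(job_id: str) -> str:
    """Bolt step 1: resolve the log location for a job id (invented path layout)."""
    return f"/history/{job_id}.log"


def parse_log(job_id: str) -> Dict[str, object]:
    """Bolt step 2: parse the job's log lines into a summary record."""
    lines = LOG_LINES[job_id]
    total_ms = 0
    for line in lines:
        # Each line is a space-separated list of key=value pairs.
        fields = dict(item.split("=") for item in line.split())
        total_ms += int(fields["durationMs"])
    return {
        "job_id": job_id,
        "log_path": locate_log(job_id),
        "stages": len(lines),
        "total_duration_ms": total_ms,
    }


def run_pipeline() -> Dict[str, Dict[str, object]]:
    """Bolt step 3: save each parsed record to the backend (a dict here)."""
    store: Dict[str, Dict[str, object]] = {}
    for job_id in spout():
        store[job_id] = parse_log(job_id)
    return store


store = run_pipeline()
```

In a real topology each step would be a separate bolt so that Storm can parallelize and replay them independently. They are chained in one loop here only to keep the sketch short.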
