Great contribution, and a lot of code! BTW, is this feature comparable to https://github.com/linkedin/dr-elephant?

On Tue, Jun 7, 2016 at 4:58 PM, 吴金虎 <[email protected]> wrote:

> Hi, all
>
> We have added support for MR & Spark history job monitoring (JPM) to Apache
> Eagle, which is used to analyze the performance of historical jobs and
> generate alerts. For now, it only contains data ingestion.
>
> For MR JPM, it reads the finished job log files from HDFS, parses the log
> and configuration files, and saves the results to the backend storage. We
> use HBase now.
>
> For Spark JPM, it fetches the finished job ids from the Resource Manager,
> asks the Spark history server for the log file locations of those job ids,
> parses the log files, and saves the results to the backend storage, which
> is also HBase.
>
> To meet these requirements in a streaming way and achieve higher
> availability, both MR and Spark JPM use a Storm topology. The spout reads
> MR history log files or fetches finished Spark job ids from the Resource
> Manager, and the bolts handle the remaining logic.
>
> We will add features for history job performance analysis and alerts later.
>
> For more details, please see
> https://issues.apache.org/jira/browse/EAGLE-276
>
> Thanks,
> Jinhu Wu
