Hi All,
I have HistoryServer up and running, and it is great.
Is it possible to also have the HistoryServer parse the event logs of failed
jobs by default?
I get "No Completed Applications Found" if job fails.
=========Event Log Location: hdfs:///user/test01/spark/logs/No Completed 
Applications Found=========
The reason is that the HistoryServer is great for tracking performance and
resource usage of each completed job, but I find it even more useful when a
job fails: I can identify which stage failed, etc., instead of sifting
through the logs from the Resource Manager. The same event log is only
viewable while the Application Master is still active; once the job fails,
the Application Master is killed and I lose GUI access, and even though I
still have the event log in JSON format, I can't open it with the
HistoryServer.
This would be especially helpful for long-running jobs that last 2-18 hours
and generate gigabytes of logs.
So I have 2 questions:

1. Any reason why we only render completed jobs? Why can't we bring in all
jobs and let the user choose from the GUI, like a time machine that restores
the status from the Application Master? The completeness filter is in
./core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala:
val logInfos = logDirs
  .sortBy { dir => getModificationTime(dir) }
  .map { dir => (dir, EventLoggingListener.parseLoggingInfo(dir.getPath, fileSystem)) }
  .filter { case (dir, info) => info.applicationComplete }
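For illustration, a minimal sketch of what I mean (a hypothetical local
change on my side, not something Spark supports today): dropping the
completeness filter would make incomplete and failed applications show up
as well, though I don't know whether the rest of the HistoryServer copes
with half-written event logs.

val logInfos = logDirs
  .sortBy { dir => getModificationTime(dir) }
  .map { dir => (dir, EventLoggingListener.parseLoggingInfo(dir.getPath, fileSystem)) }
  // .filter { case (dir, info) => info.applicationComplete }  // removed on purpose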

2. If I manually touch a file "APPLICATION_COMPLETE" in the failed job's
event log folder, will this cause any problems?