[ https://issues.apache.org/jira/browse/MAPREDUCE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673195#comment-13673195 ]

Jason Lowe commented on MAPREDUCE-5267:
---------------------------------------
The proposed listStatus change would help, but it would not be sufficient to
prevent problems from cropping up. For example, someone accidentally creating a
directory under /mapred/done that the history server can read, but whose files
it cannot delete, would still cause problems. The bottom line is that
HistoryFileManager.clean needs to protect itself from IOExceptions that can
occur when interacting with the filesystem, and it should try to make cleanup
progress despite those exceptions.

The listStatus inconsistency is best handled by a separate JIRA.

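
To illustrate the per-entry exception handling argued for above, here is a
minimal sketch. It is not the HistoryFileManager code or a proposed patch. It
uses java.nio.file as a stand-in for the Hadoop FileSystem API, omits the
age check that decides which jobs are old enough to remove, and every name
in it (RobustCleanSketch, Deleter, clean, list) is hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class RobustCleanSketch {

    /** Hypothetical stand-in for the filesystem delete call, so failures can be simulated. */
    interface Deleter {
        void delete(Path path) throws IOException;
    }

    /** Lists a directory eagerly so a listing failure surfaces as a plain IOException. */
    private static List<Path> list(Path dir) throws IOException {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                entries.add(p);
            }
        } catch (DirectoryIteratorException e) {
            throw e.getCause(); // iteration failures wrap an IOException
        }
        return entries;
    }

    /**
     * Attempts to delete every file in every directory directly under doneRoot.
     * An IOException on one directory or file is recorded and skipped, so the
     * remaining entries are still processed. Returns the entries that failed.
     */
    static List<Path> clean(Path doneRoot, Deleter deleter) throws IOException {
        List<Path> failed = new ArrayList<>();
        for (Path dir : list(doneRoot)) {
            List<Path> files;
            try {
                files = list(dir);
            } catch (IOException e) {
                failed.add(dir); // e.g. unreadable directory: skip it, keep going
                continue;
            }
            for (Path file : files) {
                try {
                    deleter.delete(file);
                } catch (IOException e) {
                    failed.add(file); // e.g. readable but undeletable file
                }
            }
        }
        return failed;
    }
}
```

A caller could pass Files::delete as the Deleter and log the returned list, so
that one bad entry is reported without aborting the rest of the cleanup pass.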
> History server should be more robust when cleaning old jobs
> -----------------------------------------------------------
>
> Key: MAPREDUCE-5267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5267
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobhistoryserver
> Affects Versions: 0.23.7, 2.0.4-alpha
> Reporter: Jason Lowe
>
> Ran across a situation where an admin user had accidentally created a
> directory in one of the date directories under /mapred/history/done/ that was
> not readable by the historyserver user. That effectively prevented the
> history server from cleaning any jobs from that date forward, as it hit an
> IOException trying to scan the directory and that aborted the entire clean
> process.
> The history server should localize IOException handling to the directory/file
> being processed and move on to the next entry in the list rather than
> aborting the entire cleaning process.