[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672184#comment-13672184
 ] 

Maysam Yabandeh commented on MAPREDUCE-5267:
--------------------------------------------

One possible fix is to change FileContext.Util#listStatus(Path) to skip the 
files/directories that it cannot access. 
{code:java}
public FileStatus[] next(final AbstractFileSystem fs, final Path p) 
  throws IOException, UnresolvedLinkException {
  return fs.listStatus(p);
}
{code}
This would be consistent with the behavior of the local file system in 
RawLocalFileSystem#listStatus
{code:java}
File[] names = localf.listFiles();
{code}
which returns only accessible items.
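
To make the skip-and-continue idea concrete, here is a minimal sketch in plain 
Java rather than against the Hadoop API. The names TolerantScan, Lister, 
ScanResult and scanAll are hypothetical and exist only for this illustration; 
they are not part of FileContext or HistoryFileManager. The sketch assumes 
each directory is listed by a call that may throw IOException, and it handles 
the failure per directory so that one unreadable entry does not abort the scan.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class TolerantScan {
    /** Hypothetical stand-in for a per-directory listing call that may fail. */
    public interface Lister {
        List<String> list(String dir) throws IOException;
    }

    /** Entries that were listed, plus the directories that had to be skipped. */
    public static final class ScanResult {
        public final List<String> entries = new ArrayList<>();
        public final List<String> skipped = new ArrayList<>();
    }

    /** Lists every directory, catching IOException per directory. */
    public static ScanResult scanAll(List<String> dirs, Lister lister) {
        ScanResult result = new ScanResult();
        for (String dir : dirs) {
            try {
                result.entries.addAll(lister.list(dir));
            } catch (IOException e) {
                // Handle the failure for this directory only, then move on.
                result.skipped.add(dir);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Simulate one date directory that the caller is not allowed to read.
        Lister lister = dir -> {
            if (dir.equals("2013/05/02")) {
                throw new IOException("Permission denied: " + dir);
            }
            return Collections.singletonList(dir + "/job");
        };
        ScanResult r = scanAll(
            Arrays.asList("2013/05/01", "2013/05/02", "2013/05/03"), lister);
        System.out.println("entries=" + r.entries + " skipped=" + r.skipped);
    }
}
```

Recording the skipped directories, instead of dropping them silently, would 
let the caller log what was passed over; whether that belongs in listStatus 
itself or in the caller is a design choice this sketch does not settle.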

Also, I was wondering if there is already a standard way of testing 
HistoryFileManager on top of HDFS. Currently, the tests in 
TestJobHistoryParsing.java run on top of the local file system and hence do 
not reveal the kind of bug reported in this JIRA. I made a first attempt at 
using MiniDFSCluster and setting its URI in the remoteFS variable in conf, but 
it does not seem to be picked up by HistoryFileManager.
                
> History server should be more robust when cleaning old jobs
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5267
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5267
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>    Affects Versions: 0.23.7, 2.0.4-alpha
>            Reporter: Jason Lowe
>
> Ran across a situation where an admin user had accidentally created a 
> directory in one of the date directories under /mapred/history/done/ that was 
> not readable by the historyserver user.  That effectively prevented the 
> history server from cleaning any jobs from that date forward, as it hit an 
> IOException trying to scan the directory and that aborted the entire clean 
> process.
> The history server should localize IOException handling to the directory/file 
> being processed and move on to the next entry in the list rather than 
> aborting the entire cleaning process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
