[
https://issues.apache.org/jira/browse/MAPREDUCE-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
jay vyas updated MAPREDUCE-5902:
--------------------------------
Description:
1) JobHistoryServer sometimes skips over certain history files and does not
serve them as completed.
2) In addition to skipping these files, the JobHistoryServer doesn't effectively
log which files are being skipped, or why.
So, in addition to determining why certain types of files are skipped (file name
length doesn't appear to be the reason; rather, it appears that %
characters throw the JobHistoryServer filter off), we should log completed
.jhist files that are available in the mr-history/tmp directory but are
skipped for some reason.
*Regarding the actual bug: Skipping completed jhist files*
We will need an author of the JobHistoryServer, I think, to chime in on what
types of paths for jobs are actually valid. It appears that at least some
characters, if present in a job name, will make the JobHistoryServer fail to
recognize a completed jhist file.
*Regarding logging*
It would be extremely useful, then, to have a couple of guarded logs at this
level of the code, so that we can see, in the log folders, why files are being
filtered out, i.e. whether it is due to filtering or visibility.
{noformat}
private static List<FileStatus> scanDirectory(Path path, FileContext fc,
    PathFilter pathFilter) throws IOException {
  path = fc.makeQualified(path);
  List<FileStatus> jhStatusList = new ArrayList<FileStatus>();
  RemoteIterator<FileStatus> fileStatusIter = fc.listStatus(path);
  while (fileStatusIter.hasNext()) {
    FileStatus fileStatus = fileStatusIter.next();
    Path filePath = fileStatus.getPath();
    if (fileStatus.isFile() && pathFilter.accept(filePath)) {
      jhStatusList.add(fileStatus);
    }
  }
  return jhStatusList;
}
{noformat}
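As a rough idea of what the guarded logs could look like, here is a minimal,
self-contained sketch. It is not a patch against HistoryFileManager: it uses
java.util.logging and a hypothetical Entry class standing in for FileStatus,
and a plain Predicate standing in for PathFilter, so that it compiles without
Hadoop on the classpath. The class name, method name, and log messages are all
made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.logging.Level;
import java.util.logging.Logger;

class ScanLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(ScanLoggingSketch.class.getName());

    /** Hypothetical stand-in for FileStatus: a file name plus an is-file flag. */
    static final class Entry {
        final String name;
        final boolean isFile;

        Entry(String name, boolean isFile) {
            this.name = name;
            this.isFile = isFile;
        }
    }

    /**
     * Same shape as scanDirectory above, but each skipped entry is logged
     * with the reason it was skipped. The isLoggable check is the guard, so
     * no message strings are built unless FINE logging is enabled.
     */
    static List<Entry> scan(List<Entry> entries, Predicate<String> filter) {
        List<Entry> accepted = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.isFile) {
                if (LOG.isLoggable(Level.FINE)) {
                    LOG.fine("Skipping " + e.name + ": not a regular file");
                }
                continue;
            }
            if (!filter.test(e.name)) {
                if (LOG.isLoggable(Level.FINE)) {
                    LOG.fine("Skipping " + e.name + ": rejected by path filter");
                }
                continue;
            }
            accepted.add(e);
        }
        return accepted;
    }
}
```

Splitting the combined condition into two checks is what makes it possible to
tell a filter rejection apart from a non-file entry in the logs.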
*Reproducing*
I was able to reproduce this bug by writing a custom MapReduce job with a job
name that contained % characters. I have also seen this with a version of
the Mahout ParallelALSFactorizationJob, which includes "-" characters in its
name; these wind up getting replaced by "%2D" at some later stage in the job
pipeline.
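The "-" to "%2D" replacement hints at one way a literal % in a job name could
confuse later parsing. The sketch below only illustrates that ambiguity. It is
not the Hadoop implementation and not a confirmed diagnosis: the class and
method names are hypothetical, and the idea that only the "-" character gets
escaped is an assumption made for the example.

```java
class EscapeCollisionSketch {
    // Assumption for illustration: "-" is escaped as "%2D", nothing else is.
    static final String DELIMITER = "-";
    static final String DELIMITER_ESCAPE = "%2D";

    /** Escapes only the delimiter; a literal '%' is left untouched. */
    static String escapeDelimiter(String jobName) {
        return jobName.replace(DELIMITER, DELIMITER_ESCAPE);
    }

    /** Reverses escapeDelimiter by turning every "%2D" back into "-". */
    static String unescapeDelimiter(String encoded) {
        return encoded.replace(DELIMITER_ESCAPE, DELIMITER);
    }
}
```

Under this assumption the job names "a-b" and "a%2Db" both encode to "a%2Db",
so decoding cannot recover which one was originally submitted. If something
similar happens in the real pipeline, a name containing % may no longer match
what the filter expects.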
> JobHistoryServer (HistoryFileManager) needs more debug logs, fails to pick up
> jobs with % characters in the name.
> -----------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5902
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5902
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: jay vyas
> Original Estimate: 1h
> Remaining Estimate: 1h