[ https://issues.apache.org/jira/browse/HADOOP-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887471#comment-16887471 ]
Steve Loughran commented on HADOOP-16207:
-----------------------------------------
Working on this; I finally got a log. And (currently) /tmp/hadoop-yarn/staging/
doesn't exist.
Assumption: all the miniYarnClusters share the same /tmp staging dir, so when
one is shut down while another is still running, the second one fails because
all its staging files go away. In that case, yes, it is a race condition. At
least this time.
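To make the suspected race concrete, here is a minimal sketch using only java.nio. It is not Hadoop code: ToyCluster, submitJob and splitSurvives are hypothetical names standing in for a mini cluster that stages job files and deletes its staging dir on shutdown. In Hadoop the staging root normally comes from yarn.app.mapreduce.am.staging-dir; whether the eventual fix sets that per cluster is an assumption here, not something confirmed by the log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Toy model of the suspected race; NOT the Hadoop MiniYARNCluster API. */
public class StagingDirSketch {

  /** Hypothetical cluster: stages job files, deletes its staging dir on shutdown. */
  static final class ToyCluster {
    private final Path stagingDir;

    ToyCluster(Path stagingDir) {
      this.stagingDir = stagingDir;
    }

    /** Write a job.split file under the staging dir, as job submission would. */
    Path submitJob(String jobId) {
      try {
        Path jobDir = Files.createDirectories(stagingDir.resolve(jobId));
        return Files.write(jobDir.resolve("job.split"), new byte[] {1, 2, 3});
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }

    /** Recursively delete the staging dir, children before parents. */
    void shutdown() {
      if (!Files.exists(stagingDir)) {
        return;
      }
      try (Stream<Path> paths = Files.walk(stagingDir)) {
        paths.sorted(Comparator.reverseOrder()).forEach(p -> {
          try {
            Files.deleteIfExists(p);
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        });
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  /** Returns true if cluster B's split file is still there after cluster A shuts down. */
  public static boolean splitSurvives(boolean shareStagingDir) {
    try {
      Path root = Files.createTempDirectory("staging-sketch");
      Path dirA = root.resolve(shareStagingDir ? "shared" : "cluster-a");
      Path dirB = root.resolve(shareStagingDir ? "shared" : "cluster-b");
      ToyCluster a = new ToyCluster(dirA);
      ToyCluster b = new ToyCluster(dirB);
      a.submitJob("job_0001");
      Path split = b.submitJob("job_0003");
      a.shutdown(); // A finishes while B's job is still running
      boolean survived = Files.exists(split);
      b.shutdown();
      Files.deleteIfExists(root);
      return survived;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("shared staging dir: split survives = " + splitSurvives(true));
    System.out.println("per-cluster staging dir: split survives = " + splitSurvives(false));
  }
}
```

With a shared dir, A's shutdown removes B's job.split, which matches the FileNotFoundException in the log; with one dir per cluster the file is left alone.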
{code}
(TaskAttemptListenerImpl.java:fatalError(288)) - Task: attempt_1563401248365_0003_m_000000_0 - exited : java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/stevel/.staging/job_1563401248365_0003/job.split does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:666)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:987)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:656)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:917)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:362)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
{code}
> Fix ITestDirectoryCommitMRJob.testMRJob
> ---------------------------------------
>
> Key: HADOOP-16207
> URL: https://issues.apache.org/jira/browse/HADOOP-16207
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, test
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
>
> Reported failure of {{ITestDirectoryCommitMRJob}} in validation runs of
> HADOOP-16186; assertIsDirectory with s3guard enabled and a parallel test run:
> Path "is recorded as deleted by S3Guard"
> {code}
> waitForConsistency();
> assertIsDirectory(outputPath); /* here */
> {code}
> The file is there but there's a tombstone. Possibilities:
> * some race condition with another test
> * tombstones aren't timing out
> * committers aren't creating that base dir in a way which cleans up S3Guard's
> tombstones.
> Remember: we do have to delete that dest dir before the committer runs unless
> overwrite==true, so at the start of the run there will be a tombstone. It
> should be overwritten by a success.