[
https://issues.apache.org/jira/browse/HADOOP-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281462#comment-14281462
]
Steve Loughran commented on HADOOP-11487:
-----------------------------------------
# Which version of Hadoop?
# Which S3 region? Only US East (us-east-1) lacks create consistency.
Blobstores are the bane of our lives. They aren't real filesystems, and code
built on top of them needs to recognise this and act on it; that's not easy,
though, since callers all come with the standard expectations of files and
their metadata.
It's not enough to retry on getFileStatus(), as there are other
inconsistencies: directory renames and deletes, blob updates, etc.
There's a new FS client, s3a, in Hadoop 2.6, which is where all future fs/s3
work is going. Try it to see if it is any better, though I doubt it.
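A minimal sketch of reading the same object through s3a via the FileSystem API; the bucket name and credentials are placeholders, and it assumes the hadoop-aws and AWS SDK jars are on the classpath:
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3ASmokeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // On some setups s3a may need to be registered explicitly.
    conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
    conf.set("fs.s3a.access.key", "ACCESS_KEY");  // placeholder
    conf.set("fs.s3a.secret.key", "SECRET_KEY");  // placeholder

    FileSystem fs = FileSystem.get(URI.create("s3a://s3-bucket/"), conf);
    FileStatus status = fs.getFileStatus(new Path("s3a://s3-bucket/file.gz"));
    System.out.println(status);
  }
}
{code}
The same scheme switch, s3a:// in place of s3n://, applies to the distcp destination URL.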
If we were to fix it, the route would be to go with something derived from
Netflix's S3mper. Retrying on a 404 is not sufficient.
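To make that last point concrete, a rough sketch of the two approaches; ConsistentKeyStore here is a hypothetical stand-in for the consistent secondary store (e.g. the DynamoDB table) that an S3mper-style layer maintains:
{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConsistencySketch {

  /** Blind retry: masks create inconsistency, and stalls on genuinely missing files. */
  static FileStatus retryOn404(FileSystem fs, Path p, int attempts, long sleepMs)
      throws IOException, InterruptedException {
    for (int i = 1; ; i++) {
      try {
        return fs.getFileStatus(p);
      } catch (FileNotFoundException e) {
        if (i >= attempts) {
          throw e;          // still not visible: give up
        }
        Thread.sleep(sleepMs);
      }
    }
  }

  /** Hypothetical consistent record of the keys a writer claims to have created. */
  interface ConsistentKeyStore {
    boolean expects(Path p);
  }

  /** S3mper-style check: only wait for keys the consistent store says should exist. */
  static FileStatus checkedStatus(FileSystem fs, ConsistentKeyStore store, Path p,
      int attempts, long sleepMs) throws IOException, InterruptedException {
    if (!store.expects(p)) {
      // No record of this key, so a 404 really means "not found"; don't retry blindly.
      return fs.getFileStatus(p);
    }
    // The key was written but may not be visible yet: here a bounded wait is justified.
    return retryOn404(fs, p, attempts, sleepMs);
  }
}
{code}
The blind retry can only mask create inconsistency; the store-backed check can tell "not yet visible" apart from "not there", which is why something S3mper-like is the more plausible route.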
> NativeS3FileSystem.getStatus must retry on FileNotFoundException
> ----------------------------------------------------------------
>
> Key: HADOOP-11487
> URL: https://issues.apache.org/jira/browse/HADOOP-11487
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs, fs/s3
> Reporter: Paulo Motta
>
> I'm trying to copy a large number of files from HDFS to S3 via distcp, and
> I'm getting the following exception:
> {code:java}
> 2015-01-16 20:53:18,187 ERROR [main] org.apache.hadoop.tools.mapred.CopyMapper: Failure in copying hdfs://10.165.35.216/hdfsFolder/file.gz to s3n://s3-bucket/file.gz
> java.io.FileNotFoundException: No such file or directory 's3n://s3-bucket/file.gz'
>     at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
>     at org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
>     at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
>     at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2015-01-16 20:53:18,276 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.FileNotFoundException: No such file or directory 's3n://s3-bucket/file.gz'
>     at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
>     at org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
>     at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
>     at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> {code}
> However, when I try hadoop fs -ls s3n://s3-bucket/file.gz, the file is there.
> So the job failure is probably due to Amazon S3's eventual consistency.
> In my opinion, in order to fix this problem NativeS3FileSystem.getFileStatus
> must use the fs.s3.maxRetries property to avoid failures like this.
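For illustration, a minimal caller-side sketch of the kind of retry being proposed, driven by the existing fs.s3.maxRetries and fs.s3.sleepTimeSeconds settings; this is a sketch of the idea, not the actual NativeS3FileSystem change:
{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StatusWithRetries {
  static FileStatus getFileStatusWithRetries(FileSystem fs, Path path)
      throws IOException {
    Configuration conf = fs.getConf();
    int maxRetries = conf.getInt("fs.s3.maxRetries", 4);
    long sleepMs = conf.getLong("fs.s3.sleepTimeSeconds", 10) * 1000L;
    FileNotFoundException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return fs.getFileStatus(path);
      } catch (FileNotFoundException e) {
        last = e;  // the key may simply not be visible yet
        try {
          Thread.sleep(sleepMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw e;
        }
      }
    }
    throw last;
  }
}
{code}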
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)