Xun REN created HADOOP-17359:
--------------------------------
Summary: [Hadoop-Tools] S3A MultiObjectDeleteException after uploading a file
Key: HADOOP-17359
URL: https://issues.apache.org/jira/browse/HADOOP-17359
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.10.0
Reporter: Xun REN
Hello,
I am using org.apache.hadoop.fs.s3a.S3AFileSystem as the implementation for S3-related operations.
When I upload a file to a path, it returns an error:
{code:java}
20/11/05 11:49:13 ERROR s3a.S3AFileSystem: Partial failure of delete, 1 errors
20/11/05 11:49:13 ERROR s3a.S3AFileSystem: Partial failure of delete, 1 errors
com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 767BEC034D0B5B8A; S3 Extended Request ID: JImfJY9hCl/QvninqT9aO+jrkmyRpRcceAg7t1lO936RfOg7izIom76RtpH+5rLqvmBFRx/++g8=; Proxy: null), S3 Extended Request ID: JImfJY9hCl/QvninqT9aO+jrkmyRpRcceAg7t1lO936RfOg7izIom76RtpH+5rLqvmBFRx/++g8=
	at com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:2287)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:1137)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:1389)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteUnnecessaryFakeDirectories(S3AFileSystem.java:2304)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.finishedWrite(S3AFileSystem.java:2270)
	at org.apache.hadoop.fs.s3a.S3AFileSystem$WriteOperationHelper.writeSuccessful(S3AFileSystem.java:2768)
	at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:371)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:69)
	at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:128)
	at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:488)
	at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:410)
	at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:342)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:327)
	at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:299)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:257)
	at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:281)
	at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:265)
	at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:228)
	at org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:285)
	at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
	at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:317)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:380)
20/11/05 11:49:13 ERROR s3a.S3AFileSystem: bv/: "AccessDenied" - Access Denied
{code}
The problem is that Hadoop creates fake directories to map folders onto S3 prefixes, and it cleans them up after the operation. The cleanup walks from the parent folder up to the root folder.
If the user is not granted the corresponding delete permission on one of those paths, it runs into this problem:
[https://github.com/apache/hadoop/blob/rel/release-2.10.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2296-L2301]
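To illustrate the mechanism, here is a minimal sketch of how such a cleanup could look (illustrative only, not the actual S3AFileSystem code; the class and method names are made up): the ancestor "directory marker" keys are collected and removed with a single multi-object delete, which is exactly where an AccessDenied on an ancestor prefix surfaces as a MultiObjectDeleteException.
{code:java}
// Illustrative sketch only (not the actual S3AFileSystem code): collect every
// ancestor prefix of the written key as a "fake directory" marker key ending
// with "/", then remove them all with a single multi-object delete. If the
// credentials lack s3:DeleteObject on any of those ancestor prefixes, the
// request reports an AccessDenied entry for that marker.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion;
import com.amazonaws.services.s3.model.MultiObjectDeleteException;
import org.apache.hadoop.fs.Path;

import java.util.ArrayList;
import java.util.List;

public class FakeDirCleanupSketch {
  static void cleanupFakeDirectories(AmazonS3 s3, String bucket, Path file) {
    List<KeyVersion> keys = new ArrayList<>();
    // Walk from the file's parent up to (but not including) the root.
    for (Path p = file.getParent(); p != null && !p.isRoot(); p = p.getParent()) {
      String dirMarker = p.toUri().getPath().substring(1) + "/"; // e.g. "a/b/c/"
      keys.add(new KeyVersion(dirMarker));
    }
    if (keys.isEmpty()) {
      return;
    }
    try {
      s3.deleteObjects(new DeleteObjectsRequest(bucket).withKeys(keys));
    } catch (MultiObjectDeleteException e) {
      // This is the failure reported above: some ancestor marker could not be
      // deleted because the caller has no delete permission on that prefix.
      for (MultiObjectDeleteException.DeleteError err : e.getErrors()) {
        System.err.printf("%s: %s - %s%n", err.getKey(), err.getCode(), err.getMessage());
      }
    }
  }
}
{code}
With the reproduction scenario below, those markers would be "a/b/c/", "a/b/" and "a/", and the delete fails on the prefix(es) the restricted user is not allowed to touch (e.g. "a/").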
During the upload, I don't see any "fake" directories actually being created. Why should we clean them up if they were never really created?
The same applies to other operations, such as rename or mkdir, where the "deleteUnnecessaryFakeDirectories" method is called.
Maybe the solution is to check the delete permission before calling the deleteObjects method.
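As far as I can tell, S3 does not expose a simple call to pre-check delete permission on a key, so another option would be to tolerate the failure during this particular cleanup: if the only errors in the MultiObjectDeleteException are AccessDenied on the marker keys, log them and keep going, since the uploaded object itself is already in place. A rough sketch of that idea (the helper below is illustrative, not an existing Hadoop API):
{code:java}
// Illustrative alternative to a permission pre-check: run the cleanup's
// multi-object delete and ignore the failure when every reported error is an
// AccessDenied on a fake-directory marker. The helper name is made up.
import com.amazonaws.services.s3.model.MultiObjectDeleteException;
import com.amazonaws.services.s3.model.MultiObjectDeleteException.DeleteError;

public class CleanupErrorHandlingSketch {
  static void ignoreAccessDeniedOnCleanup(Runnable multiObjectDelete) {
    try {
      multiObjectDelete.run();
    } catch (MultiObjectDeleteException e) {
      for (DeleteError err : e.getErrors()) {
        if (!"AccessDenied".equals(err.getCode())) {
          throw e; // anything other than a permission problem stays fatal
        }
      }
      // Only AccessDenied on fake-directory markers: log and carry on,
      // the uploaded object itself is already safely in place.
      System.err.println("Ignoring AccessDenied while cleaning up fake directories");
    }
  }
}
{code}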
To reproduce the problem (a programmatic sketch follows these steps):
# With a bucket named my_bucket, we have the path s3://my_bucket/a/b/c inside it.
# The user has permissions only on the path b and the sub-paths inside it.
# We run the command "hdfs dfs -mkdir s3://my_bucket/a/b/c/d".
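The same failure can also be triggered programmatically; here is a rough sketch, assuming the restricted credentials from step 2 and the s3a:// scheme (since S3AFileSystem is the implementation in use):
{code:java}
// Minimal reproduction sketch for the scenario described in the steps above.
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReproduceFakeDirCleanupFailure {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Credentials configured here (e.g. fs.s3a.access.key / fs.s3a.secret.key)
    // must belong to the restricted user from step 2.
    FileSystem fs = FileSystem.get(URI.create("s3a://my_bucket/"), conf);
    // The mkdir of d is performed, but the subsequent cleanup of the fake
    // directory markers ("a/b/c/", "a/b/", "a/") hits a
    // MultiObjectDeleteException ("Partial failure of delete") with an
    // AccessDenied error for the prefix(es) this user cannot delete.
    fs.mkdirs(new Path("s3a://my_bucket/a/b/c/d"));
  }
}
{code}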