[
https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jordan Mendelson updated HADOOP-10400:
--------------------------------------
Attachment: HADOOP-10400-1.patch
> Incorporate new S3A FileSystem implementation
> ---------------------------------------------
>
> Key: HADOOP-10400
> URL: https://issues.apache.org/jira/browse/HADOOP-10400
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Reporter: Jordan Mendelson
> Attachments: HADOOP-10400-1.patch
>
>
> The s3native filesystem has a number of limitations (some of which were
> recently fixed by HADOOP-9454). This patch adds an s3a filesystem which uses
> the aws-sdk instead of the jets3t library. There are a number of improvements
> over s3native including:
> - Parallel copy (rename) support (dramatically speeds up commits on large
> files)
> - AWS S3 explorer compatible empty-directory files: "xyz/" instead of
> "xyz_$folder$" (reduces littering)
> - Ignores _$folder$ files created by s3native and other S3 browsing
> utilities
> - Supports multiple output buffer dirs to even out IO when uploading files
> - Supports IAM role-based authentication
> - Allows setting a default canned ACL for uploads (public, private, etc.)
> - Better error recovery handling
> - Should handle input seeks without having to download the whole file (used
> for splits a lot)
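
One way to seek without downloading the whole file is to reopen the object with an HTTP byte-range request that starts at the new position. The following is only a minimal sketch of that idea, not code from the patch; the class and method names are hypothetical.

```java
final class RangeSeekSketch {
    /**
     * Builds the value of an HTTP Range header that covers the bytes from
     * pos up to the end of an object of the given length.
     */
    static String rangeHeader(long pos, long fileLength) {
        if (pos < 0 || pos >= fileLength) {
            throw new IllegalArgumentException("seek position out of bounds: " + pos);
        }
        // HTTP byte ranges are inclusive on both ends, hence fileLength - 1.
        return "bytes=" + pos + "-" + (fileLength - 1);
    }

    public static void main(String[] args) {
        // A split that starts 128 MiB into a 1 GiB object.
        System.out.println(rangeHeader(128L << 20, 1L << 30));
    }
}
```

With a request like this, a task that reads a split in the middle of a large object transfers only the bytes from its start offset onward.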
> This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to
> various pom files to get it to build against trunk. I've been using 0.0.1 in
> production with CDH 4 for several months and CDH 5 for a few days. The
> version here is 0.0.2, which changes some keys to hopefully bring the
> key name style more in line with the rest of Hadoop 2.x.
> *Caveats*:
> Hadoop uses a standard output committer which uploads files as
> filename.COPYING before renaming them. This can cause unnecessary performance
> issues with S3 because it does not have a rename operation and S3 already
> verifies uploads against an md5 that the driver sets on the upload request.
> While this FileSystem should be significantly faster than the built-in
> s3native driver because of parallel copy support, you may want to consider
> setting a null output committer on your jobs to further improve performance.
> Because S3 requires the file length be known before a file is uploaded, all
> output is buffered to a temporary file first, similar to the s3native
> driver.
> Due to the lack of native rename() for S3, renaming extremely large files or
> directories may take a while. Unfortunately, there is no way to notify
> hadoop that progress is still being made for rename operations, so your job
> may time out unless you increase the task timeout.
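
Without a native rename, a rename has to be emulated as copy-then-delete, and the per-object copies can run in parallel. The sketch below illustrates that pattern under loud assumptions: an in-memory map stands in for the bucket, and every name here is hypothetical rather than taken from the patch or the AWS SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelRenameSketch {
    /**
     * Moves every key under srcPrefix to dstPrefix. Copies run in parallel;
     * sources are deleted only after all copies have succeeded.
     * The store must be safe for concurrent use (a stand-in for the bucket).
     */
    static int rename(Map<String, byte[]> store, String srcPrefix,
                      String dstPrefix, int threads) {
        // Snapshot the matching keys before changing anything.
        List<String> keys = new ArrayList<>();
        for (String key : store.keySet()) {
            if (key.startsWith(srcPrefix)) {
                keys.add(key);
            }
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> copies = new ArrayList<>();
            for (String key : keys) {
                String dst = dstPrefix + key.substring(srcPrefix.length());
                // Each task stands in for one server-side object copy.
                copies.add(pool.submit(() -> { store.put(dst, store.get(key)); }));
            }
            for (Future<?> copy : copies) {
                copy.get(); // wait for completion and surface any failure
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("rename failed; sources left in place", e);
        } finally {
            pool.shutdown();
        }
        for (String key : keys) {
            store.remove(key);
        }
        return keys.size();
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = new ConcurrentHashMap<>();
        store.put("tmp/part-0", new byte[] {1});
        store.put("tmp/part-1", new byte[] {2});
        System.out.println(rename(store, "tmp/", "out/", 4) + " objects moved");
    }
}
```

Even with parallel copies, the caller sees nothing until the last copy finishes, which is why a long rename can look like a hung task.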
> This driver will fully ignore _$folder$ files. This was necessary so that it
> could interoperate with repositories that have had the s3native driver used
> on them, but it means that it won't recognize empty directories created by
> s3native.
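
The two marker conventions can be told apart by key suffix alone. The following is a small sketch of that classification, assuming only the "xyz/" and "xyz_$folder$" naming described above; the class and method names are hypothetical and not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

final class DirectoryMarkerSketch {
    static final String LEGACY_SUFFIX = "_$folder$";

    /** True for legacy s3native-style markers, which are skipped entirely. */
    static boolean isIgnored(String key) {
        return key.endsWith(LEGACY_SUFFIX);
    }

    /** True for the "xyz/" style empty-directory marker. */
    static boolean isDirectoryMarker(String key) {
        return key.endsWith("/");
    }

    /** Returns the marker key that represents an empty directory. */
    static String markerFor(String dirPath) {
        return dirPath.endsWith("/") ? dirPath : dirPath + "/";
    }

    /** Filters a raw key listing down to the keys that remain visible. */
    static List<String> visibleKeys(List<String> keys) {
        List<String> visible = new ArrayList<>();
        for (String key : keys) {
            if (!isIgnored(key)) {
                visible.add(key);
            }
        }
        return visible;
    }
}
```

Under this scheme a directory that exists only as "logs_$folder$" disappears from listings, which is the limitation noted above.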
> Statistics for the filesystem may be calculated differently than the s3native
> filesystem. When uploading a file, we do not count writing the temporary file
> on the local filesystem towards the local filesystem's written bytes count.
> When renaming files, we do not count the S3->S3 copy as read or write
> operations. Unlike the s3native driver, we only count bytes written when we
> start the upload (as opposed to the write calls to the temporary local file).
> The driver also counts read and write ops, but this is done mostly to keep
> from timing out on large S3 operations.
> This is currently implemented as a FileSystem and not an AbstractFileSystem.