[
https://issues.apache.org/jira/browse/HADOOP-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320067#comment-15320067
]
Chris Nauroth commented on HADOOP-13241:
----------------------------------------
Steve, thank you for doing this. This is very helpful information.
{code}
this can make `seek()` slow on large files. It also does not
handle "/" in secret key. The reason there has been no attempt to fix this is
that every upgrade of the Jets3t library, while
{code}
According to comments on HADOOP-3733, S3A has this problem too, so it might not
be accurate to characterize it as an S3N-specific problem or a Jets3t problem.
{code}
* `amazon-core-java-SDK` jar.
{code}
I wasn't sure what this line meant. I see the aws-java-sdk-core dependency is
called out as a separate line item, so is this something else?
{code}
The files in an object store are not visible until the write has been completed.
when partitioned upload is in progress, they may be visible. Otherwise,
in-progress writes are simply saved to a local file and only copied up
{code}
Is the 'w' in "when" meant to be capitalized?
I thought with multi-part upload, the object is only visible after completion.
Am I mistaken?
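A toy model of the multi-part semantics described above may help frame the question. It assumes the object only becomes visible once the upload is completed; the class and method names are hypothetical and are not the AWS SDK or S3A API.

```python
class ToyMultipartStore:
    """In-memory stand-in for an object store (hypothetical, not an S3 client)."""

    def __init__(self):
        self.visible = {}    # key -> bytes, what a reader can see
        self.pending = {}    # upload id -> (key, list of uploaded parts)
        self.next_id = 0

    def initiate(self, key):
        # Start an upload; nothing is visible under `key` yet.
        self.next_id += 1
        self.pending[self.next_id] = (key, [])
        return self.next_id

    def upload_part(self, upload_id, data):
        # Parts are held on the store side but are not readable.
        self.pending[upload_id][1].append(data)

    def complete(self, upload_id):
        # Assumption: only completion publishes the assembled object.
        key, parts = self.pending.pop(upload_id)
        self.visible[key] = b"".join(parts)

    def exists(self, key):
        return key in self.visible


store = ToyMultipartStore()
upload = store.initiate("data/big.bin")
store.upload_part(upload, b"abc")
store.upload_part(upload, b"def")
visible_during_upload = store.exists("data/big.bin")
store.complete(upload)
visible_after_complete = store.exists("data/big.bin")
```

Under this model, `visible_during_upload` is false and `visible_after_complete` is true, which is the behaviour the question assumes.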
{code}
S3 renaming is a very expensive `O(data)` operation which may fail partway
through
{code}
Perhaps mention two specific use cases that are often flagged for poor rename
performance: the MapReduce FileOutputCommitter and DistCp's rename after copy.
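To illustrate why rename is `O(data)` and can fail partway through, here is a minimal sketch that emulates a directory rename as a copy followed by a delete for each object. The store class, its methods, and the `fail_after` parameter are hypothetical, chosen only for illustration; this is not the S3A implementation.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store (hypothetical, not an S3 client)."""

    def __init__(self):
        self.objects = {}       # key -> bytes
        self.bytes_copied = 0   # total data moved by copy operations

    def put(self, key, data):
        self.objects[key] = data

    def copy(self, src, dst):
        # The cost of a copy grows with the size of the object.
        data = self.objects[src]
        self.objects[dst] = data
        self.bytes_copied += len(data)

    def delete(self, key):
        del self.objects[key]

    def rename_prefix(self, src_prefix, dst_prefix, fail_after=None):
        """Emulate a directory rename: copy each object, then delete the source.

        fail_after is a test hook that simulates a failure after that many
        objects have been moved.
        """
        moved = 0
        for key in sorted(k for k in self.objects if k.startswith(src_prefix)):
            if fail_after is not None and moved == fail_after:
                raise IOError("simulated failure partway through rename")
            self.copy(key, dst_prefix + key[len(src_prefix):])
            self.delete(key)
            moved += 1
        return moved


store = ToyObjectStore()
store.put("tmp/part-0", b"a" * 10)
store.put("tmp/part-1", b"b" * 20)
try:
    store.rename_prefix("tmp/", "out/", fail_after=1)
except IOError:
    pass
# The "directory" is now split between the source and destination prefixes.
remaining = sorted(store.objects)
```

In this sketch a failed rename leaves some objects under the destination and the rest under the source, which is the kind of state a job committer that relies on rename has to tolerate.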
I think it would be worthwhile for the classpath section to mention the shell
profile that automatically adds hadoop-aws and its dependencies to the default
Hadoop classpath. The easiest way to get S3A on the classpath is to symlink to
that stock shell profile. However, this would require different patches for
trunk vs. branch-2. If you prefer to keep the current patch applicable to both
trunk and branch-2, then I'd be happy to pick up this part in a separate
trunk-only patch.
> document s3a better
> -------------------
>
> Key: HADOOP-13241
> URL: https://issues.apache.org/jira/browse/HADOOP-13241
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: documentation, fs/s3
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HADOOP-13241-branch-2-001.patch
>
>
> s3a can be documented better, things like classpath, troubleshooting, etc.
> sit down and do it.