[
https://issues.apache.org/jira/browse/HADOOP-14710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609969#comment-16609969
]
Kaidi Zhao edited comment on HADOOP-14710 at 9/11/18 12:57 AM:
---------------------------------------------------------------
The S3 endpoint exposed by Snowball or Snowball Edge supports only a subset of
the regular S3 API. For example, for Snowball Edge, see:
[https://docs.aws.amazon.com/snowball/latest/developer-guide/using-adapter-supported-api.html]
I was able to try it with Hadoop 2.7.5 as well as 2.8.4, and noticed that
many common commands do not work there.
1) I tried distcp from Hadoop into Snowball Edge's S3. It looks like Hadoop
tries to use "PUT Object - Copy" when moving the temporary file to the final
file, but this REST API is not supported by Snowball's S3, so it errors out.
2) I also tried things like hadoop fs -ls s3a://xyz/. The command retries a
number of times, then errors out with something like:
ls: listStatus on s3a://xyz/: com.amazonaws.AmazonClientException: Unable to
execute HTTP request: Read timed out.
2a) Strangely enough, with Hadoop debug logging on, I can clearly see that the
"ListBucketResult" object is actually returned, so I guess the error occurs
somewhere else.
2b) Also, if I use "s3a://xyz" instead (no trailing slash), then the
error is: ls: 's3a://xyz': No such file or directory.
In short, I don't see any way to copy data directly from HDFS to Snowball
Edge's S3.
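
To illustrate the failure in point 1, here is a toy Python model of a
write-to-temporary-then-rename upload against a store that lacks a copy call.
This is only a sketch, not Hadoop's code: the class and function names
(InMemoryStore, NoCopyStore, rename, upload_then_rename) and the ".tmp" suffix
are made up for illustration. It assumes that rename on an object store is
emulated as copy followed by delete.

```python
class UnsupportedOperation(Exception):
    """Raised when the store does not implement a requested call."""


class InMemoryStore:
    """Toy object store with put, copy and delete (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def copy(self, src, dst):
        self.objects[dst] = self.objects[src]

    def delete(self, key):
        del self.objects[key]


class NoCopyStore(InMemoryStore):
    """Same toy store, but without a server-side copy call."""

    def copy(self, src, dst):
        raise UnsupportedOperation("PUT Object - Copy is not supported")


def rename(store, src, dst):
    # An object store has no native rename, so emulate it as copy + delete.
    store.copy(src, dst)
    store.delete(src)


def upload_then_rename(store, final_key, data):
    # Write to a temporary key first, then move it to the final key.
    tmp_key = final_key + ".tmp"  # hypothetical naming, for illustration only
    store.put(tmp_key, data)
    rename(store, tmp_key, final_key)
```

With InMemoryStore the upload ends with the data under the final key. With
NoCopyStore the rename step raises and the data is left under the temporary
key, which matches the kind of failure described in point 1.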
> Uber-JIRA: Support AWS Snowball
> -------------------------------
>
> Key: HADOOP-14710
> URL: https://issues.apache.org/jira/browse/HADOOP-14710
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Major
>
> Support data transfer between Hadoop and [AWS
> Snowball|http://docs.aws.amazon.com/snowball/latest/ug/whatissnowball.html].