Perhaps we should revisit the implementation of NativeS3FileSystem so
that it doesn't always buffer the file on the client. We could have an
option to make it write directly to S3. Thoughts?

Regarding the problem with HADOOP-3733, you can work around it by
setting fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey in your
hadoop-site.xml.
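
For example, the relevant entries would look something like this (just a sketch; substitute your own credentials, and if you stick with s3n:// URIs the equivalent properties are fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey, if I remember correctly):

  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>

With those in place you can write the destination simply as s3://<bucket>/<dir>/, so the '/' in your secret key never has to appear in the URI.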

Cheers,
Tom

On Fri, May 8, 2009 at 1:17 AM, Andrew Hitchcock <adpow...@gmail.com> wrote:
> Hi Ken,
>
> S3N doesn't work that well with large files. When uploading a file to
> S3, S3N saves it to local disk during write() and then uploads to S3
> during the close(). Close can take a long time for large files and it
> doesn't report progress, so the call can time out.
>
> As a workaround, I'd recommend either increasing the timeout or
> uploading the files by hand. Since you only have a few large files,
> you might want to copy the files to local disk and then use something
> like s3cmd to upload them to S3.
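>
> If you go the timeout route, the setting that's probably killing the
> tasks is mapred.task.timeout, which defaults to 600000 ms (10 minutes);
> that lines up with the "failed to report status for 601 seconds"
> messages in your log below. Something like this in hadoop-site.xml
> should give the close() upload more room (the value here is just an
> example):
>
>   <property>
>     <name>mapred.task.timeout</name>
>     <!-- milliseconds; here, 60 minutes -->
>     <value>3600000</value>
>   </property>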
>
> Regards,
> Andrew
>
> On Thu, May 7, 2009 at 4:42 PM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
>> Hi all,
>>
>> I have a few large files (4 that are 1.8GB+) I'm trying to copy from HDFS to
>> S3. My micro EC2 cluster is running Hadoop 0.19.1, and has one master/two
>> slaves.
>>
>> I first tried using the hadoop fs -cp command, as in:
>>
>> hadoop fs -cp output/<dir>/ s3n://<bucket>/<dir>/
>>
>> This seemed to be working, as I could watch the network traffic spike, and
>> temp files were being created in S3 (as seen with CyberDuck).
>>
>> But then it seemed to hang. Nothing happened for 30 minutes, so I killed the
>> command.
>>
>> Then I tried using the hadoop distcp command, as in:
>>
>> hadoop distcp hdfs://<host>:50001/<path>/<dir>/ s3://<public key>:<private key>@<bucket>/<dir2>/
>>
>> This failed because my secret key has a '/' in it
>> (http://issues.apache.org/jira/browse/HADOOP-3733).
>>
>> Then I tried using hadoop distcp with the s3n URI syntax:
>>
>> hadoop distcp hdfs://<host>:50001/<path>/<dir>/ s3n://<bucket>/<dir2>/
>>
>> Similar to my first attempt, it seemed to work. Lots of network activity,
>> temp files being created, and in the terminal I got:
>>
>> 09/05/07 18:36:11 INFO mapred.JobClient: Running job: job_200905071339_0004
>> 09/05/07 18:36:12 INFO mapred.JobClient:  map 0% reduce 0%
>> 09/05/07 18:36:30 INFO mapred.JobClient:  map 9% reduce 0%
>> 09/05/07 18:36:35 INFO mapred.JobClient:  map 14% reduce 0%
>> 09/05/07 18:36:38 INFO mapred.JobClient:  map 20% reduce 0%
>>
>> But again it hung. No network traffic, and eventually it dumped out:
>>
>> 09/05/07 18:52:34 INFO mapred.JobClient: Task Id :
>> attempt_200905071339_0004_m_000001_0, Status : FAILED
>> Task attempt_200905071339_0004_m_000001_0 failed to report status for 601
>> seconds. Killing!
>> 09/05/07 18:53:02 INFO mapred.JobClient: Task Id :
>> attempt_200905071339_0004_m_000004_0, Status : FAILED
>> Task attempt_200905071339_0004_m_000004_0 failed to report status for 602
>> seconds. Killing!
>> 09/05/07 18:53:06 INFO mapred.JobClient: Task Id :
>> attempt_200905071339_0004_m_000002_0, Status : FAILED
>> Task attempt_200905071339_0004_m_000002_0 failed to report status for 602
>> seconds. Killing!
>> 09/05/07 18:53:09 INFO mapred.JobClient: Task Id :
>> attempt_200905071339_0004_m_000003_0, Status : FAILED
>> Task attempt_200905071339_0004_m_000003_0 failed to report status for 601
>> seconds. Killing!
>>
>> In the task GUI, I can see the same tasks failing, and being restarted. But
>> the restarted tasks seem to be just hanging w/o doing anything.
>>
>> Eventually one of the tasks made a bit more progress, but then it finally
>> died with:
>>
>> Copy failed: java.io.IOException: Job failed!
>>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:647)
>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:844)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:871)
>>
>> So - any thoughts on what's going wrong?
>>
>> Thanks,
>>
>> -- Ken
>> --
>> Ken Krugler
>> +1 530-210-6378
>>
>
