[ 
https://issues.apache.org/jira/browse/HADOOP-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

[EMAIL PROTECTED] updated HADOOP-882:
-------------------------------------

    Attachment: jets3t-upgrade.patch
                jets3t-0.5.0.jar

Here's a patch that makes the minor changes necessary so the S3 implementation 
can use the new jets3t 0.5.0 'retrying' lib.  It also exposes fs.s3.block.size 
in hadoop-default.xml with a note on how to set the jets3t 
RepeatableInputStream buffer size by adding a jets3t.properties file to 
${HADOOP_HOME}/conf.  Setting the latter buffer to the same value as the S3 
block size avoids failures of the kind 'Input stream is not repeatable as 
1048576 bytes have been written, exceeding the available buffer size of 131072'.
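
For example, to match a 1MB block size you'd set the value in both 
hadoop-site.xml and ${HADOOP_HOME}/conf/jets3t.properties (the jets3t property 
name below is from memory only, so please check it against the jets3t 0.5.0 
configuration docs before relying on it):

  <property>
    <name>fs.s3.block.size</name>
    <value>1048576</value>
  </property>

  # jets3t.properties -- buffer used by the RepeatableInputStream;
  # the property name is an assumption, verify against the jets3t docs
  s3service.stream-retry-buffer-size=1048576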

The downside to this patch's approach is that if you want to match the block 
and buffer sizes, you need to set the same value in two places: once in 
hadoop-site.xml and again in jets3t.properties.  This seemed to me better than 
the alternative, a tighter coupling that bubbles the main jets3t properties up 
into the hadoop-*.xml filesystem section as fs.s3.jets3t.XXX properties, with 
the init of the S3 filesystem then setting those values into 
org.jets3t.service.Jets3tProperties.
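
For the record, here is a rough sketch of what that rejected alternative might 
look like (the class, the fs.s3.jets3t.* naming, and the Jets3tProperties calls 
below are my assumptions, not something in the patch):

  import org.apache.hadoop.conf.Configuration;
  import org.jets3t.service.Jets3tProperties;

  // Hypothetical bridge: copy fs.s3.jets3t.* values from the Hadoop
  // configuration into jets3t's own property store at filesystem init.
  public class Jets3tConfigBridge {
    private static final String PREFIX = "fs.s3.jets3t.";

    public static void apply(Configuration conf, Jets3tProperties props) {
      // Only one key is shown; a real bridge would copy every
      // fs.s3.jets3t.* key present in the Hadoop configuration.
      String key = "s3service.stream-retry-buffer-size";  // illustrative
      String value = conf.get(PREFIX + key);
      if (value != null) {
        props.setProperty(key, value);
      }
    }
  }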

I didn't change the default S3 block size from 1MB.  Setting it to 64MB seems 
too far afield from the default jets3t RepeatableInputStream buffer size of 
only about 128KB (131072 bytes).

I've included the jets3t 0.5.0 lib as part of the upload (there doesn't seem to 
be a way to include binaries using svn diff).  Its license is Apache 2.0.

Tom White, thanks for pointing me at the unit test.  Also, I'd go along with 
closing this issue with the update of the jets3t lib and opening another issue 
to track the S3 filesystem implementing a general, 'traffic-level' hadoop 
retry mechanism.
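
To make concrete what I mean by a 'traffic-level' retry, here's a throwaway 
sketch (the S3Call interface, the class name, and the retry parameters are 
made up for illustration and are not part of this patch):

  import java.io.IOException;

  // A unit of work that talks to S3 and may fail with an IOException.
  interface S3Call<T> {
    T run() throws IOException;
  }

  class S3Retry {
    // Run the call, retrying up to maxAttempts times with a fixed sleep
    // between attempts; rethrow the last IOException if all attempts fail.
    static <T> T withRetries(S3Call<T> call, int maxAttempts, long sleepMillis)
        throws IOException {
      for (int attempt = 1; ; attempt++) {
        try {
          return call.run();
        } catch (IOException e) {
          if (attempt >= maxAttempts) {
            throw e;                    // give up after the final attempt
          }
          try {
            Thread.sleep(sleepMillis);  // simple fixed back-off
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw e;                    // stop retrying if interrupted
          }
        }
      }
    }
  }

Each call into the S3Service would then be wrapped in something like 
S3Retry.withRetries(...) with a small, fixed attempt count.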

> S3FileSystem should retry if there is a communication problem with S3
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-882
>                 URL: https://issues.apache.org/jira/browse/HADOOP-882
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.10.1
>            Reporter: Tom White
>         Assigned To: Tom White
>         Attachments: jets3t-0.5.0.jar, jets3t-upgrade.patch
>
>
> File system operations currently fail if there is a communication problem 
> (IOException) with S3. All operations that communicate with S3 should retry a 
> fixed number of times before failing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
