[ https://issues.apache.org/jira/browse/HADOOP-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jordan Mendelson updated HADOOP-9454:
-------------------------------------

    Attachment: HADOOP-9454-3.patch

This version includes a test that uploads three files of various sizes, renames them, and then downloads them to compare the hashes. It exercises both normal and multipart uploads as well as multipart copies. It will not run unless your test core-site.xml file has valid AWS credentials and test.fs.s3n.name is filled out properly (otherwise it just skips the tests). Also, since the only way to test a multipart copy is to upload a 5 GB file, it takes quite a while to run on a non-network-optimized instance (the test runner seems to kill it if it takes over ~10 minutes). I've included a test jets3t.properties which increases the upload thread count so the upload finishes in a reasonable amount of time. Downloading actually takes significantly longer than multipart uploading (which we could possibly fix with parallel downloading in the future?).

> Support multipart uploads for s3native
> --------------------------------------
>
>                 Key: HADOOP-9454
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9454
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Jordan Mendelson
>         Attachments: HADOOP-9454-3.patch
>
>
> The s3native filesystem is limited to 5 GB file uploads to S3; however, the
> newest version of jets3t supports multipart uploads, which allow storing
> multi-TB files. While the s3 filesystem lets you bypass this restriction by
> uploading blocks, we need to output our data into Amazon's publicdatasets
> bucket, which is shared with others.
> Amazon has added a similar feature to their distribution of Hadoop, as has MapR.
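
For reference, the parts of the test core-site.xml mentioned above would look roughly like this (fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey are the standard s3n credential keys; the bucket name is just a placeholder):

{code:xml}
<configuration>
  <!-- Bucket used by the s3n tests; the tests skip themselves if this is unset. -->
  <property>
    <name>test.fs.s3n.name</name>
    <value>s3n://your-test-bucket/</value>
  </property>
  <!-- Standard s3n credential properties. -->
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
{code}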
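
And a simplified sketch of the upload/rename/download/compare round trip the test performs, written against the plain FileSystem API (the class, paths, sizes and skip check here are illustrative, not the exact code in the patch):

{code:java}
import java.net.URI;
import java.security.MessageDigest;
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import static org.junit.Assert.assertArrayEquals;
import static org.junit.Assert.assertTrue;

public class S3NRoundTripSketch {

  // Write a file of the given size, rename it (a copy + delete on S3),
  // read it back and compare the MD5 hashes of what was written vs. read.
  public void roundTrip(Configuration conf, int sizeBytes) throws Exception {
    String testUri = conf.get("test.fs.s3n.name");
    if (testUri == null || testUri.isEmpty()) {
      return;  // no test bucket configured: skip, as the patch's test does
    }
    FileSystem fs = FileSystem.get(URI.create(testUri), conf);

    Path src = new Path("/multipart-test/original");
    Path dst = new Path("/multipart-test/renamed");

    // Write pseudo-random bytes and remember their hash.
    byte[] data = new byte[sizeBytes];
    new Random(0).nextBytes(data);
    FSDataOutputStream out = fs.create(src);
    out.write(data);
    out.close();
    byte[] expected = MessageDigest.getInstance("MD5").digest(data);

    // Rename, then download and hash what comes back.
    assertTrue(fs.rename(src, dst));
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    FSDataInputStream in = fs.open(dst);
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) > 0) {
      md5.update(buf, 0, n);
    }
    in.close();
    assertArrayEquals(expected, md5.digest());
  }
}
{code}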