[ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13786:
------------------------------------
    Attachment: HADOOP-13786-HADOOP-13345-001.patch

Patch 001

# Modified FileOutputFormat to allow the committer to be chosen (as MRv1 did). 
There's actually a factory which can be defined, so that you can have a 
committer factory which chooses the committer based on destination FS.
# There's an S3 factory, and a committer.
# The committer has some special methods in the FS API allowing it to bypass a 
lot of the checks before operations which the full Hadoop FS API requires. Here 
we can assume that the committer knows that the destination isn't a directory, 
doesn't want a mock parent directory created after deleting a path, etc.
# {{ITestS3AOutputCommitter}} is a clone of {{TestFileOutputCommitter}}, 
reworked to be against S3. Note it could be used as a basis for testing commits 
to other filesystems; the basic one assumes local FS.
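
To illustrate the factory idea in the first item, here is a minimal, self-contained sketch of choosing a committer from the scheme of the destination URI. It is not the code in the patch and uses no Hadoop classes: {{CommitterSelection}}, {{Committer}}, {{CommitterFactory}}, {{SchemeBasedFactory}} and the committer names are all hypothetical, and keying the choice on the {{s3a}} scheme is an assumption made for the example.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch only: these names are illustrative, not the Hadoop API. */
final class CommitterSelection {

  /** Stand-in for an output committer; it only reports its own name. */
  interface Committer {
    String name();
  }

  /** Creates a committer for a given destination. */
  interface CommitterFactory {
    Committer create(URI destination);
  }

  /** Delegates to a per-scheme factory, falling back to a default. */
  static final class SchemeBasedFactory implements CommitterFactory {
    private final Map<String, CommitterFactory> byScheme = new HashMap<>();
    private final CommitterFactory fallback;

    SchemeBasedFactory(CommitterFactory fallback) {
      this.fallback = fallback;
    }

    /** Registers the factory to use for one URI scheme (case-insensitive). */
    SchemeBasedFactory register(String scheme, CommitterFactory factory) {
      byScheme.put(scheme.toLowerCase(Locale.ROOT), factory);
      return this;
    }

    @Override
    public Committer create(URI destination) {
      String scheme = destination.getScheme();
      // A URI without a scheme (a relative path) gets the fallback committer.
      CommitterFactory factory = scheme == null
          ? fallback
          : byScheme.getOrDefault(scheme.toLowerCase(Locale.ROOT), fallback);
      return factory.create(destination);
    }
  }

  /** Assumed wiring: an S3-specific committer for s3a://, the default elsewhere. */
  static CommitterFactory defaultSelector() {
    return new SchemeBasedFactory(dest -> () -> "file-output-committer")
        .register("s3a", dest -> () -> "s3a-committer");
  }
}
```

With this wiring, a destination such as {{s3a://bucket/out}} would get the S3-specific committer, while {{file:///tmp/out}} or any unregistered scheme would get the default one.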

Some of the new tests are failing; I haven't completely weaned the new tests 
off file:// and onto being able to simulate different failures of (a subclass 
of) s3.




> add output committer which uses s3guard for consistent commits to S3
> --------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly).
> I consider us free to expose the blobstore-ness of the s3 output 
> streams (i.e. not visible until the close()), if we need to use that to allow 
> us to abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

