[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755057#comment-15755057 ]
Steve Loughran commented on HADOOP-13786:
-----------------------------------------

A few of us (Thomas, Pieter, Mingliang, Aaron and Sean) had a quick conf call earlier this week, where Thomas and Pieter outlined their proposed algorithm for implementing zero-rename commits to any consistent S3 endpoint.

I've written up [my interpretation of the algorithm|https://github.com/steveloughran/hadoop/blob/s3guard/HADOOP-13786-committer/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/delayed-put-commit.md] for review. I looked through the Hadoop and Spark commit code to see what appears to be going on, though I still need to document the various algorithms better.

Comments welcome, especially those containing proofs of correctness.

> add output committer which uses s3guard for consistent commits to S3
> --------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> A goal of this code is "support O(1) commits to S3 repositories in the
> presence of failures". Implement it, including whatever is needed to
> demonstrate the correctness of the algorithm. (That is, assuming that s3guard
> provides a consistent view of the presence/absence of blobs, show that we can
> commit directly.)
> I consider us free to expose the blobstore-ness of the s3 output
> streams (i.e. data is not visible until the close()), if we need to use that
> to allow us to abort commit operations.