[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876077#comment-15876077 ]
Steve Loughran commented on HADOOP-13786:
-----------------------------------------

I've realised we can also do a delayed-commit object COPY, because multipart PUT lets you declare each part to be a copy of a subset of another object. This would allow delayed-commit rename() operations, which committers could use.

good
* can take any blob and rename it anywhere in the FS
* with parallelised PUT calls, copy time is ~ {{filesize/(parts * 6e6)}} seconds, same as parallel renames today.
* could be retrofitted to existing committers which rely on renames

bad
* still need a story for propagating commit metadata from tasks to job committer.
* still need a story for cleaning up failed tasks and jobs
* still not O(1) in the commit

This could be used in the bit of the spark committer which can move files to absolute locations; tasks would do the copy and send the serialized commit data to the job committer. If the path {{_temporary/$attemptId/$taskId}} is used for the work output, then the cleanup mechanism of all work *prior to the copy* on task/job failure is the same as today's output committer. That doesn't cover the pending commit though, unless again something is saved to the FS somewhere. Again, something could be explicitly PUT into {{_pending/$jobId/$taskId}} for this: just save the pending data to a file with a UUID, and both commit and abort can find and use it.
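To make the multipart-copy idea concrete, here is a minimal sketch of how a committer might split a source object into the byte ranges that each copied part would cover, plus the copy-time estimate quoted above. Everything named here ({{MultipartCopyPlan}}, {{PartRange}}, {{plan}}, {{estimatedCopySeconds}}) is hypothetical and not part of any patch on this JIRA. The {{bytes=first-last}} string follows the inclusive, zero-based form S3 uses for the copy source range of a part copy. The sketch assumes the caller picks a part size that meets S3's limits (at least 5 MiB for every part except the last), and it reads the 6e6 figure as an approximate copy rate in bytes per second.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: plans the byte ranges for a multipart copy of one object. */
final class MultipartCopyPlan {

    /** One part of the copy: an inclusive, zero-based byte range of the source. */
    static final class PartRange {
        final int partNumber;   // S3 part numbers start at 1
        final long firstByte;
        final long lastByte;

        PartRange(int partNumber, long firstByte, long lastByte) {
            this.partNumber = partNumber;
            this.firstByte = firstByte;
            this.lastByte = lastByte;
        }

        /** Range string in the form used for a part copy's source range. */
        String copySourceRange() {
            return "bytes=" + firstByte + "-" + lastByte;
        }
    }

    /** Split objectSize bytes into consecutive parts of at most partSize bytes. */
    static List<PartRange> plan(long objectSize, long partSize) {
        if (objectSize <= 0 || partSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        List<PartRange> parts = new ArrayList<>();
        int partNumber = 1;
        for (long first = 0; first < objectSize; first += partSize) {
            // The last part may be shorter than partSize.
            long last = Math.min(first + partSize, objectSize) - 1;
            parts.add(new PartRange(partNumber++, first, last));
        }
        return parts;
    }

    /** The estimate from the comment: filesize / (parts * 6e6) seconds. */
    static double estimatedCopySeconds(long objectSize, int parts) {
        return objectSize / (parts * 6e6);
    }

    public static void main(String[] args) {
        long size = 25_000_000L;
        List<PartRange> parts = plan(size, 10_000_000L);
        for (PartRange p : parts) {
            System.out.println("part " + p.partNumber + ": " + p.copySourceRange());
        }
        System.out.println("estimated seconds: "
            + estimatedCopySeconds(size, parts.size()));
    }
}
```

Each planned range could be issued as an independent part-copy request in parallel. The upload would only become visible once the job committer completes it, which is what makes the copy a delayed commit.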
Proposed: add this once the core commit mech is done

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch,
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch,
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch,
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the
> presence of failures". Implement it, including whatever is needed to
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard
> provides a consistent view of the presence/absence of blobs, show that we can
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output
> streams (ie. not visible until the close()), if we need to use that to allow
> us to abort commit operations.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)