[ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876077#comment-15876077
 ] 

Steve Loughran commented on HADOOP-13786:
-----------------------------------------

I've realised we can also do delayed-commit object COPY, because multipart 
PUT lets you declare each part to be a copy of a byte range of another object. This 
would allow delayed commit rename() operations; committers could 

Good:
* can take any blob and rename it anywhere in the FS
* with parallelised PUT calls, copy time is ~ {{filesize/(parts * 6e6)}} 
seconds, same as parallel renames today.
* could be retrofitted to existing committers which rely on renames

Bad:
* still need a story for propagating commit metadata from tasks to job 
committer. 
* still need a story for cleaning up failed tasks and jobs
* still not O(1) in the commit
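
To make the copy-time estimate above concrete, here is a minimal sketch in Python (chosen for brevity; it is not Hadoop code). It splits a source object into the byte ranges that each copied part would declare, and applies the {{filesize/(parts * 6e6)}} formula from the list. The function names are hypothetical, the {{bytes=first-last}} range form follows the S3 copy-part range header, and the 6e6 bytes/second per-copy rate is simply the figure quoted above, not a measured value.

```python
def plan_copy_parts(file_size, part_size):
    """Split [0, file_size) into inclusive byte ranges, one per copied part.

    Hypothetical helper: each string is the range a part would declare
    when it is defined as a copy of a slice of the source object.
    """
    if file_size <= 0 or part_size <= 0:
        raise ValueError("file_size and part_size must be positive")
    ranges = []
    start = 0
    while start < file_size:
        # Ranges are inclusive, so the last byte is one before the next start.
        end = min(start + part_size, file_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


def estimated_copy_seconds(file_size, parts, bytes_per_second=6e6):
    """Estimate from the comment: filesize / (parts * 6e6) seconds.

    Assumes all parts are copied in parallel at the same per-copy rate.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    return file_size / (parts * bytes_per_second)


if __name__ == "__main__":
    size = 6_000_000_000  # a 6 GB object, purely illustrative
    parts = plan_copy_parts(size, 600_000_000)
    print(len(parts), "parts, first range:", parts[0])
    print("estimated seconds:", estimated_copy_seconds(size, len(parts)))
```

The estimate still grows with file size for a fixed number of parts, which is why the commit is not O(1).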

This could be used in the part of the Spark committer which can move files to 
absolute locations; tasks would do the copy and send the serialized commit data 
to the job committer. If the path {{_temporary/$attemptId/$taskId}} is used for 
the work output, then the cleanup of all work *prior to the copy* on 
task/job failure is the same as in today's output committer. That doesn't cover 
the pending commit, though, unless something is again saved to the FS somewhere. 
Something could be explicitly PUT into {{_pending/$jobId/$taskId}} for 
this: save the pending data to a file with a UUID, and both commit and 
abort can find and use it.
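
A rough sketch of that pending-data idea, in Python for brevity. Everything here is an assumption for illustration: the {{PendingCommit}} fields, the JSON encoding, the choice to use the UUID as the file name under {{_pending/$jobId/$taskId}}, and the plain dict standing in for the filesystem. None of it is the committer's actual format.

```python
import json
import uuid
from dataclasses import asdict, dataclass


@dataclass
class PendingCommit:
    """Hypothetical record of one uncommitted multipart upload."""
    destination: str   # final path of the object
    upload_id: str     # ID of the multipart upload awaiting completion
    part_etags: list   # etags of the parts uploaded or copied so far


def pending_path(job_id, task_id):
    # Assumed layout: _pending/$jobId/$taskId/<uuid>
    return f"_pending/{job_id}/{task_id}/{uuid.uuid4().hex}"


def save_pending(store, job_id, task_id, commit):
    """Task side: persist the pending data; `store` is a dict standing in for the FS."""
    path = pending_path(job_id, task_id)
    store[path] = json.dumps(asdict(commit))
    return path


def load_pending(store, job_id):
    """Job side: commit and abort both list everything saved under the job's prefix."""
    prefix = f"_pending/{job_id}/"
    return [
        PendingCommit(**json.loads(data))
        for path, data in sorted(store.items())
        if path.startswith(prefix)
    ]


if __name__ == "__main__":
    fs = {}
    save_pending(fs, "job1", "task1",
                 PendingCommit("out/part-0000", "upload-1", ["etag-a", "etag-b"]))
    for pending in load_pending(fs, "job1"):
        print(pending.destination, pending.upload_id, len(pending.part_etags))
```

Because both job commit and job abort start from the same listing, a failed task whose data was saved can still be found and aborted.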

Proposed: add this once the core commit mechanism is done.


> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, 
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, 
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output 
> streams (ie. not visible until the close()), if we need to use that to allow 
> us to abort commit operations.



