[
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409534#comment-15409534
]
Thomas Demoor commented on HADOOP-9565:
---------------------------------------
Steve, the "avoid data write" thing you mention is exactly why these direct
output committers (and what I did for the FileOutputCommitter) work on object
stores. Multiple writers can write to the same object concurrently. At any
point, the last-started successfully-completed write is what is visible.
Regular put:
* Content length (=N) communicated at start of request.
* Once N bytes hit S3, the object becomes visible.
* If the Hadoop task aborts before writing N bytes, the upload will time out and
the object version is garbage collected by S3.
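
To make the regular put rule concrete, here is a toy in-memory model. It is
not the S3 API: the class and method names are made up for illustration, and it
does not model the timeout, the garbage collection, or the ordering between
concurrent writers. It only shows that nothing is visible until all N declared
bytes have arrived.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of the regular put visibility rule; NOT the S3 API.
// All class and method names here are hypothetical.
final class PutVisibilityModel {
    // Objects that readers can currently see, by key.
    private final Map<String, byte[]> visible = new HashMap<>();

    // One in-flight put. The content length N is fixed when the request starts.
    final class PendingPut {
        private final String key;
        private final int contentLength;
        private final ByteArrayOutputStream received = new ByteArrayOutputStream();

        private PendingPut(String key, int contentLength) {
            this.key = key;
            this.contentLength = contentLength;
        }

        // Deliver more bytes; the object is published only once all N have arrived.
        void send(byte[] chunk) {
            if (received.size() + chunk.length > contentLength) {
                throw new IllegalArgumentException("more bytes than the declared content length");
            }
            received.write(chunk, 0, chunk.length);
            if (received.size() == contentLength) {
                visible.put(key, received.toByteArray());
            }
        }
    }

    // Start a put with a declared content length of N bytes.
    PendingPut startPut(String key, int contentLength) {
        return new PendingPut(key, contentLength);
    }

    // Returns the visible bytes for the key, or null if nothing is visible.
    byte[] get(String key) {
        return visible.get(key);
    }
}
```

A task that dies mid-write simply never calls send() again, so get() keeps
returning null (or the previously visible object) for that key.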
MultipartUpload:
* Requires an explicit API call to complete (or abort).
* The object becomes visible only once the complete API call is made.
* If the Hadoop task fails, the upload remains active (s3a has the purge
functionality to automatically clean these up after a certain period) but the
object is NOT visible.
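
The multipart case can be sketched the same way. Again this is a toy
in-memory model, not the S3 or s3a API, and every name in it is hypothetical.
Parts are plain strings to keep it short; abort() stands in for both an
explicit abort and whatever a later purge would do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of multipart upload visibility; NOT the S3 API.
// All class and method names here are hypothetical.
final class MultipartVisibilityModel {
    // Objects that readers can currently see, by key.
    private final Map<String, String> visible = new HashMap<>();
    // Uploads that were initiated but neither completed nor aborted.
    private final Map<Integer, String> uploadKeys = new HashMap<>();
    private final Map<Integer, List<String>> uploadParts = new HashMap<>();
    private int nextUploadId = 1;

    // Start an upload for a key and return its upload id.
    int initiate(String key) {
        int id = nextUploadId++;
        uploadKeys.put(id, key);
        uploadParts.put(id, new ArrayList<>());
        return id;
    }

    // Store a part; this never makes anything visible.
    void uploadPart(int uploadId, String part) {
        requireActive(uploadId);
        uploadParts.get(uploadId).add(part);
    }

    // The explicit complete call is the only step that publishes the object.
    void complete(int uploadId) {
        requireActive(uploadId);
        String key = uploadKeys.remove(uploadId);
        visible.put(key, String.join("", uploadParts.remove(uploadId)));
    }

    // Discard an upload without publishing it (explicit abort or later purge).
    void abort(int uploadId) {
        uploadKeys.remove(uploadId);
        uploadParts.remove(uploadId);
    }

    // Number of uploads still lingering in the active state.
    int activeUploads() {
        return uploadKeys.size();
    }

    // Returns the visible content for the key, or null if nothing is visible.
    String get(String key) {
        return visible.get(key);
    }

    private void requireActive(int uploadId) {
        if (!uploadKeys.containsKey(uploadId)) {
            throw new IllegalStateException("unknown or finished upload: " + uploadId);
        }
    }
}
```

A failed task corresponds to an upload id that never reaches complete(): it
keeps counting in activeUploads() until something aborts it, and get() never
returns its data.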
The interesting case to think about is network partitions.
> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs, fs/s3, fs/swift
> Affects Versions: 2.6.0
> Reporter: Steve Loughran
> Assignee: Pieter Reuse
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch,
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch,
> HADOOP-9565-006.patch, HADOOP-9565-branch-2-007.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are really
> blobstores, with different atomicity and consistency guarantees, by adding a
> {{Blobstore}} interface to them.
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that
> all blobstores implement a server-side copy operation as a substitute for
> rename.