[
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208754#comment-16208754
]
Aaron Fabbri commented on HADOOP-13786:
---------------------------------------
Feeling another wave coming on..
Documentation is awesome, good work. Funny that the thorough treatment of
output commit is living here. We may want to link from MR site docs.
{noformat}
+or it is at the destination, -in which case the rename actually successed.
{noformat}
/successed/succeeded/. I do like "successed" though, colloquially.
{noformat}
+1. Workers execute "Tasks", which are fractiona of the job, a job whose
{noformat}
/fractiona/fractions/
{noformat}
+This then is the problem which the S3Guard committers address: how to safely
...
+## Meet the S3Guard Commmitters
{noformat}
/S3Guard committers/S3A committers/ ? If so search/replace other spots.
{noformat}
+attempts to write to specifci "magic" paths are interpreted as writes
{noformat}
/specifci/specific/
{noformat}
+committed, VMs using the committer do not need to so much storage
+But the committer needs a "consistent" S3 store (via S3Guard when working with
{noformat}
Capitalization / punctuation.
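While we're here, it might be worth showing the one-liner that buys you that consistency. A sketch (assuming the DynamoDB metadata store is the one people will actually use):
{noformat}
// sketch: turn on S3Guard's DynamoDB metadata store for consistent listings
Configuration conf = new Configuration();
conf.set("fs.s3a.metadatastore.impl",
    "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore");
{noformat}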
{noformat}
+does meanthat the subsequent readers of work *will* need a consistent listing
{noformat}
/meanthat/mean that/
{noformat}
+All of the S3A committers revert to provide a classic FileOutputCommitter
instance
{noformat}
/provide/providing/
{noformat}
+factories to be used to create a committer, based on the specific value of
theoption `fs.s3a.committer.name`:
{noformat}
/theoption/the option/
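A tiny example of the switch might help readers here, e.g. (my sketch; the value names are guesses at what the factories register, use whatever the code actually accepts):
{noformat}
// sketch: select the committer factory by name ("directory"/"partitioned"/"magic"?)
Configuration conf = new Configuration();
conf.set("fs.s3a.committer.name", "directory");
{noformat}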
{noformat}
+This committer an extension of the "Directory" committer which has a special
conflict resolution
{noformat}
/an/is an/
{noformat}
+The Jjb is configured to use the magic committer, but the S3A bucket has not
been explicitly
{noformat}
/Jjb/job/
{noformat}
+This can be done for those buckets which are known to be consistent, either
+because the [S3Guard](s3guard.html) is used to provide consistency,
{noformat}
append: "or because the S3-compatible filesystem is known to be strongly
consistent"
{noformat}
+protocol for the files listing the pending write operations. Tt uses
{noformat}
/Tt/It/
{noformat}
+One way to find out what is happening (i.e. get a stack trace of where the
committer
+is being created is to set the option
`mapreduce.fileoutputcommitter.algorithm.version`
+to a value such as "10".
{noformat}
Nice trick. :-)
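For anyone else reading: as I understand the trick, an unsupported version makes the classic committer's constructor throw, so the resulting stack trace shows exactly where the committer was instantiated. Roughly:
{noformat}
// sketch: force FileOutputCommitter's constructor to fail fast
Configuration conf = new Configuration();
conf.set("mapreduce.fileoutputcommitter.algorithm.version", "10");
// expect something like: java.io.IOException: Only 1 or 2 algorithm version is supported
{noformat}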
{noformat}
+There is an expectation that the Job Driver and tasks can communicate: if a
task
+perform any operations itself during the task commit phase, it shall only do
{noformat}
/perform/performs/
{noformat}
+* For each attempt, and attempt ID is created, to build the job attempt ID.
{noformat}
/and attempt ID/an attempt ID/
{noformat}
+during task the task commit algorithm. When working with object stores which
{noformat}
remove first "task"
{noformat}
+ Historically then, it is the "v2 algorithm"
{noformat}
/v2 algorithm/second version of an algorithm/.. found it confusing on first
read. Think this helps.
{noformat}
+Resource Manager with in `yarn.app.mapreduce.am.ob.committer.commit-window`
milliseconds
+(default: 10,000). It does this by waiting until the next heartbeat it
received.
+There's a s possible bug here: if the interval is set too small the thread may
{noformat}
/a s/a/
{noformat}
+permanently spin waiting a callback within the window. Ignoring that, this
algorithm
{noformat}
Interesting stuff. Also, besides not being able to begin commit within the
window, on non-realtime OS / storage there is no bound on time between
beginning the commit and actually doing it, technically.
{noformat}
+The AM may call the `OutputCommitter.taskAbort` on with a task attempt context,
{noformat}
/on with/?/
{noformat}
+1. There is a PUT operation, capable of uploading 5GB of data in one HTTP
request.
+1. The PUT operation is atomic, but there is no PUT-no-overwrite option.
{noformat}
And multiple clients racing to write the same path cannot detect which one actually
succeeded via return code (IIRC this is determined by S3 service-side timestamp
and not visible to clients). Am I right or mistaken here?
{noformat}
+metadata. Other implementations of the S3 protocol are fully consistent; the
{noformat}
clarity nit: /implementations of the S3 protocol/S3 protocol-compatible storage
systems/
{noformat}
+The simplicity of the Stocator committer is somethign to appreciate.
{noformat}
/somethign/something/
{noformat}
+the server's local filesystem can add/update an entry the index table of the
{noformat}
/can/and/
{noformat}
+Ryan Blue, of Netflix, has submitted an alternate committer, one which has a
{noformat}
/Ryan Blue/Ryan Blue, the Duke of Netflix and quite a fine gentleman whom we
admire deeply/
(Optional change.)
{noformat}
+**Failure during task execution**
+
+All data is written to local temporary files; these need to be cleaned up.
+
+The job must ensure that the local (pending) data is purged. *TODO*: test this
{noformat}
TODO here. Possible subtask for someone to help with once we have a feature
branch?
{noformat}
+**Failure to communicate with S3 during data upload**
+
+If an upload fails, tasks will
+* attempt to abort PUT requests already uploaded to S3
{noformat}
Mention retry policy here? dunno.
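If we do mention it, probably just point at the existing knob, e.g. (sketch; there may be newer retry options in the patch I'm not tracking):
{noformat}
// sketch: bound how many times the S3A client retries transient failures
Configuration conf = new Configuration();
conf.setInt("fs.s3a.attempts.maximum", 5);
{noformat}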
{noformat}
+1. It requires a path element, such as `__magic` which cannot be used
+for any purpose other than for the storage of pending commit data.
{noformat}
Aside: We could make this string configurable in the future, no?
{noformat}
+
+### Changes to `S3ABlockOutputStream`
+
+We can avoid having to copy and past the `S3ABlockOutputStream` by
{noformat}
/past/paste/
Also, this sounds outdated vs. the code. May want to make a pass on some of
the tentative-sounding stuff you wrote that has since been resolved.
{noformat}
+Update: There is a cost to this: MRv1 support is lost wihtout
{noformat}
Typo at end of line: /wihtout/without/.
{noformat}
+Thr committers can only be tested against an S3-compatible object store.
{noformat}
/Thr/The/
{noformat}
+It has since been extended to collet metrics and other values, and has proven
{noformat}
/collet/collect/
{noformat}
+The short term solution of a dynamic wrapper committer could postpone the need
for this.
{noformat}
Section sounds like it needs updating.
{noformat}
+The S3A client can ba configured to rety those operations which are considered
{noformat}
/rety/retry/
{noformat}
+paths can interfere with each other
{noformat}
add period.
{noformat}
+ skipDuringFaultInjection(fs);
{noformat}
Nice.
{noformat}
+log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
{noformat}
Seems a little heavy.
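Maybe scope it down once there is a dedicated commit package, e.g. (package name is a guess):
{noformat}
log4j.logger.org.apache.hadoop.fs.s3a=INFO
log4j.logger.org.apache.hadoop.fs.s3a.commit=DEBUG
{noformat}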
> Add S3Guard committer for zero-rename commits to S3 endpoints
> -------------------------------------------------------------
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13786-036.patch, HADOOP-13786-037.patch,
> HADOOP-13786-038.patch, HADOOP-13786-039.patch,
> HADOOP-13786-HADOOP-13345-001.patch, HADOOP-13786-HADOOP-13345-002.patch,
> HADOOP-13786-HADOOP-13345-003.patch, HADOOP-13786-HADOOP-13345-004.patch,
> HADOOP-13786-HADOOP-13345-005.patch, HADOOP-13786-HADOOP-13345-006.patch,
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-007.patch,
> HADOOP-13786-HADOOP-13345-009.patch, HADOOP-13786-HADOOP-13345-010.patch,
> HADOOP-13786-HADOOP-13345-011.patch, HADOOP-13786-HADOOP-13345-012.patch,
> HADOOP-13786-HADOOP-13345-013.patch, HADOOP-13786-HADOOP-13345-015.patch,
> HADOOP-13786-HADOOP-13345-016.patch, HADOOP-13786-HADOOP-13345-017.patch,
> HADOOP-13786-HADOOP-13345-018.patch, HADOOP-13786-HADOOP-13345-019.patch,
> HADOOP-13786-HADOOP-13345-020.patch, HADOOP-13786-HADOOP-13345-021.patch,
> HADOOP-13786-HADOOP-13345-022.patch, HADOOP-13786-HADOOP-13345-023.patch,
> HADOOP-13786-HADOOP-13345-024.patch, HADOOP-13786-HADOOP-13345-025.patch,
> HADOOP-13786-HADOOP-13345-026.patch, HADOOP-13786-HADOOP-13345-027.patch,
> HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-028.patch,
> HADOOP-13786-HADOOP-13345-029.patch, HADOOP-13786-HADOOP-13345-030.patch,
> HADOOP-13786-HADOOP-13345-031.patch, HADOOP-13786-HADOOP-13345-032.patch,
> HADOOP-13786-HADOOP-13345-033.patch, HADOOP-13786-HADOOP-13345-035.patch,
> cloud-intergration-test-failure.log, objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the
> presence of failures". Implement it, including whatever is needed to
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard
> provides a consistent view of the presence/absence of blobs, show that we can
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output
> streams (ie. not visible until the close()), if we need to use that to allow
> us to abort commit operations.