[
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208754#comment-16208754
]
Aaron Fabbri commented on HADOOP-13786:
---------------------------------------
Feeling another wave coming on..
Documentation is awesome, good work. Funny that the thorough treatment of
output commit is living here. We may want to link from MR site docs.
{noformat}
+or it is at the destination, -in which case the rename actually successed.
{noformat}
/successed/succeeded/. I do like "successed" though, colloquially.
{noformat}
+1. Workers execute "Tasks", which are fractiona of the job, a job whose
{noformat}
/fractiona/fractions/
{noformat}
+This then is the problem which the S3Guard committers address: how to safely
...
+## Meet the S3Guard Commmitters
{noformat}
/S3Guard committers/S3A committers/ ? If so search/replace other spots.
{noformat}
+attempts to write to specifci "magic" paths are interpreted as writes
{noformat}
/specifci/specific/
{noformat}
+committed, VMs using the committer do not need to so much storage
+But the committer needs a "consistent" S3 store (via S3Guard when working with
{noformat}
Capitalization / punctuation.
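While we're here, it might be worth showing the one-liner that buys you that consistency. A sketch (assuming the DynamoDB metadata store is the one people will actually use):
{noformat}
// sketch: turn on S3Guard's DynamoDB metadata store for consistent listings
Configuration conf = new Configuration();
conf.set("fs.s3a.metadatastore.impl",
    "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore");
{noformat}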
{noformat}
+does meanthat the subsequent readers of work *will* need a consistent listing
{noformat}
/meanthat/mean that/
{noformat}
+All of the S3A committers revert to provide a classic FileOutputCommitter
instance
{noformat}
/provide/providing/
{noformat}
+factories to be used to create a committer, based on the specific value of
theoption `fs.s3a.committer.name`:
{noformat}
/theoption/the option/
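A tiny example of the switch might help readers here, e.g. (my sketch; the value names are guesses at what the factories register, use whatever the code actually accepts):
{noformat}
// sketch: select the committer factory by name ("directory"/"partitioned"/"magic"?)
Configuration conf = new Configuration();
conf.set("fs.s3a.committer.name", "directory");
{noformat}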
{noformat}
+This committer an extension of the "Directory" committer which has a special
conflict resolution
{noformat}
/an/is an/
{noformat}
+The Jjb is configured to use the magic committer, but the S3A bucket has not
been explicitly
{noformat}
/Jjb/job/
{noformat}
+This can be done for those buckets which are known to be consistent, either
+because the [S3Guard](s3guard.html) is used to provide consistency,
{noformat}
append: "or because the S3-compatible filesystem is known to be strongly
consistent"
{noformat}
+protocol for the files listing the pending write operations. Tt uses
{noformat}
/Tt/It/
{noformat}
+One way to find out what is happening (i.e. get a stack trace of where the
committer
+is being created is to set the option
`mapreduce.fileoutputcommitter.algorithm.version`
+to a value such as "10".
{noformat}
Nice trick. :-)
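For anyone else reading: as I understand the trick, an unsupported version makes the classic committer's constructor throw, so the resulting stack trace shows exactly where the committer was instantiated. Roughly:
{noformat}
// sketch: force FileOutputCommitter's constructor to fail fast
Configuration conf = new Configuration();
conf.set("mapreduce.fileoutputcommitter.algorithm.version", "10");
// expect something like: java.io.IOException: Only 1 or 2 algorithm version is supported
{noformat}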
{noformat}
+There is an expectation that the Job Driver and tasks can communicate: if a
task
+perform any operations itself during the task commit phase, it shall only do
{noformat}
/perform/performs/
{noformat}
+* For each attempt, and attempt ID is created, to build the job attempt ID.
{noformat}
/and attempt ID/an attempt ID/
{noformat}
+during task the task commit algorithm. When working with object stores which
{noformat}
remove first "task"
{noformat}
+ Historically then, it is the "v2 algorithm"
{noformat}
/v2 algorithm/second version of an algorithm/.. found it confusing on first
read. Think this helps.
{noformat}
+Resource Manager with in `yarn.app.mapreduce.am.ob.committer.commit-window`
milliseconds
+(default: 10,000). It does this by waiting until the next heartbeat it
received.
+There's a s possible bug here: if the interval is set too small the thread may
{noformat}
/a s/a/
{noformat}
+permanently spin waiting a callback within the window. Ignoring that, this
algorithm
{noformat}
Interesting stuff. Also, besides not being able to begin commit within the
window, on non-realtime OS / storage there is no bound on time between
beginning the commit and actually doing it, technically.
{noformat}
+The AM may call the `OutputCommitter.taskAbort` on with a task attempt context,
{noformat}
/on with/?/
{noformat}
+1. There is a PUT operation, capable of uploading 5GB of data in one HTTP
request.
+1. The PUT operation is atomic, but there is no PUT-no-overwrite option.
{noformat}
And multiple clients racing to write the same path cannot detect which one actually
succeeded via return code (IIRC this is determined by S3 service-side timestamp
and not visible to clients). Am I right or mistaken here?
{noformat}
+metadata. Other implementations of the S3 protocol are fully consistent; the
{noformat}
clarity nit: /implementations of the S3 protocol/S3 protocol-compatible storage
systems/
{noformat}
+The simplicity of the Stocator committer is somethign to appreciate.
{noformat}
/somethign/something/
{noformat}
+the server's local filesystem can add/update an entry the index table of the
{noformat}
/can/and/
{noformat}
+Ryan Blue, of Netflix, has submitted an alternate committer, one which has a
{noformat}
/Ryan Blue/Ryan Blue, the Duke of Netflix and quite a fine gentleman whom we
admire deeply/
(Optional change.)
{noformat}
+**Failure during task execution**
+
+All data is written to local temporary files; these need to be cleaned up.
+
+The job must ensure that the local (pending) data is purged. *TODO*: test this
{noformat}
TODO here. Possible subtask for someone to help with once we have a feature
branch?
{noformat}
+**Failure to communicate with S3 during data upload**
+
+If an upload fails, tasks will
+* attempt to abort PUT requests already uploaded to S3
{noformat}
Mention retry policy here? dunno.
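If we do mention it, probably just point at the existing knob, e.g. (sketch; there may be newer retry options in the patch I'm not tracking):
{noformat}
// sketch: bound how many times the S3A client retries transient failures
Configuration conf = new Configuration();
conf.setInt("fs.s3a.attempts.maximum", 5);
{noformat}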
{noformat}
+1. It requires a path element, such as `__magic` which cannot be used
+for any purpose other than for the storage of pending commit data.
{noformat}
Aside: We could make this string configurable in the future, no?
{noformat}
+
+### Changes to `S3ABlockOutputStream`
+
+We can avoid having to copy and past the `S3ABlockOutputStream` by
{noformat}
/past/paste/
Also, this sounds outdated vs. the code. May want to make a pass on some of
the tentative-sounding stuff you wrote that has since been resolved.
{noformat}
+Update: There is a cost to this: MRv1 support is lost wihtout
{noformat}
Typo at end of line: /wihtout/without/.
{noformat}
+Thr committers can only be tested against an S3-compatible object store.
{noformat}
/Thr/The/
{noformat}
+It has since been extended to collet metrics and other values, and has proven
{noformat}
/collet/collect/
{noformat}
+The short term solution of a dynamic wrapper committer could postpone the need
for this.
{noformat}
Section sounds like it needs updating.
{noformat}
+The S3A client can ba configured to rety those operations which are considered
{noformat}
/rety/retry/
{noformat}
+paths can interfere with each other
{noformat}
add period.
{noformat}
+ skipDuringFaultInjection(fs);
{noformat}
Nice.
{noformat}
+log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
{noformat}
Seems a little heavy.
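Maybe scope it down once there is a dedicated commit package, e.g. (package name is a guess):
{noformat}
log4j.logger.org.apache.hadoop.fs.s3a=INFO
log4j.logger.org.apache.hadoop.fs.s3a.commit=DEBUG
{noformat}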
> Add S3Guard committer for zero-rename commits to S3 endpoints
> -------------------------------------------------------------
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13786-036.patch, HADOOP-13786-037.patch,
> HADOOP-13786-038.patch, HADOOP-13786-039.patch,
> HADOOP-13786-HADOOP-13345-001.patch, HADOOP-13786-HADOOP-13345-002.patch,
> HADOOP-13786-HADOOP-13345-003.patch, HADOOP-13786-HADOOP-13345-004.patch,
> HADOOP-13786-HADOOP-13345-005.patch, HADOOP-13786-HADOOP-13345-006.patch,
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-007.patch,
> HADOOP-13786-HADOOP-13345-009.patch, HADOOP-13786-HADOOP-13345-010.patch,
> HADOOP-13786-HADOOP-13345-011.patch, HADOOP-13786-HADOOP-13345-012.patch,
> HADOOP-13786-HADOOP-13345-013.patch, HADOOP-13786-HADOOP-13345-015.patch,
> HADOOP-13786-HADOOP-13345-016.patch, HADOOP-13786-HADOOP-13345-017.patch,
> HADOOP-13786-HADOOP-13345-018.patch, HADOOP-13786-HADOOP-13345-019.patch,
> HADOOP-13786-HADOOP-13345-020.patch, HADOOP-13786-HADOOP-13345-021.patch,
> HADOOP-13786-HADOOP-13345-022.patch, HADOOP-13786-HADOOP-13345-023.patch,
> HADOOP-13786-HADOOP-13345-024.patch, HADOOP-13786-HADOOP-13345-025.patch,
> HADOOP-13786-HADOOP-13345-026.patch, HADOOP-13786-HADOOP-13345-027.patch,
> HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-028.patch,
> HADOOP-13786-HADOOP-13345-029.patch, HADOOP-13786-HADOOP-13345-030.patch,
> HADOOP-13786-HADOOP-13345-031.patch, HADOOP-13786-HADOOP-13345-032.patch,
> HADOOP-13786-HADOOP-13345-033.patch, HADOOP-13786-HADOOP-13345-035.patch,
> cloud-intergration-test-failure.log, objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the
> presence of failures". Implement it, including whatever is needed to
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard
> provides a consistent view of the presence/absence of blobs, show that we can
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output
> streams (ie. not visible until the close()), if we need to use that to allow
> us to abort commit operations.