[
https://issues.apache.org/jira/browse/HADOOP-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17613977#comment-17613977
]
ASF GitHub Bot commented on HADOOP-18304:
-----------------------------------------
dannycjones commented on code in PR #4478:
URL: https://github.com/apache/hadoop/pull/4478#discussion_r989873742
##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md:
##########
@@ -180,20 +180,20 @@ and restarting the job.
whose output is in the job attempt directory, *and only rerunning all
uncommitted tasks*.
-This algorithm does not works safely or swiftly with AWS S3 storage because
-tenames go from being fast, atomic operations to slow operations which can
fail partway through.
+This algorithm does not work safely or swiftly with AWS S3 storage because
+renames go from being fast, atomic operations to slow operations which can
fail partway through.
This then is the problem which the S3A committers address:
-*How to safely and reliably commit work to Amazon S3 or compatible object
store*
+>*How to safely and reliably commit work to Amazon S3 or compatible object
store.*
Review Comment:
Fair point, it is not quoting anyone (or maybe Steve). I will let it join
the previous sentence as I think that makes more sense.
##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md:
##########
@@ -356,19 +351,20 @@ task commit.
However, it has extra requirements of the filesystem
-1. [Obsolete] It requires a consistent object store.
+1. The object store must be consistent.
1. The S3A client must be configured to recognize interactions
-with the magic directories and treat them specially.
+with the magic directories and treat them as a special case.
-Now that Amazon S3 is consistent, the magic committer is enabled by default.
+Now that [Amazon S3 is consistent](https://aws.amazon.com/s3/consistency/),
+the magic committer is enabled by default.
Review Comment:
Yes, I believe that's correct.
I believe what we're saying here is that S3A's "magic path rewriting", where
writes to `__magic/` directories are only staged rather than made visible,
is now enabled by default.
I will update this to be clearer, something like:
```suggestion
the magic directory path rewriting is enabled by default.
```
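For readers unfamiliar with the term, here is a rough sketch of what "path rewriting" means. It is illustrative only and is not the S3A implementation. The function names are hypothetical, and the `__base` marker and the "keep only the filename" rule are assumptions about how a path under `__magic/` maps to its final destination.

```python
from pathlib import PurePosixPath

MAGIC = "__magic"  # directory name mentioned in the review comment
BASE = "__base"    # assumed marker: path elements after it are preserved


def is_magic_path(path: str) -> bool:
    """Return True if any element of the path is the magic directory."""
    return MAGIC in PurePosixPath(path).parts


def final_destination(path: str) -> str:
    """Map a path under the magic directory to where the data would
    finally appear; paths without a magic element are returned as-is."""
    parts = PurePosixPath(path).parts
    if MAGIC not in parts:
        return path
    idx = parts.index(MAGIC)
    dest_dir = parts[:idx]       # everything above __magic
    children = parts[idx + 1:]   # everything below __magic
    if BASE in children:
        # Assumption: keep the relative path that follows __base.
        tail = children[children.index(BASE) + 1:]
    else:
        # Assumption: otherwise keep only the final filename.
        tail = children[-1:]
    if not tail:
        raise ValueError(f"no file name under magic path: {path}")
    return str(PurePosixPath(*dest_dir, *tail))
```

Under these assumptions, a write to `/out/__magic/job/task/__base/year=2022/part-0000` would be staged and later materialize at `/out/year=2022/part-0000`.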
> Improve S3A committers documentation clarity
> --------------------------------------------
>
> Key: HADOOP-18304
> URL: https://issues.apache.org/jira/browse/HADOOP-18304
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: documentation
> Reporter: Daniel Carl Jones
> Assignee: Daniel Carl Jones
> Priority: Trivial
> Labels: pull-request-available
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> I was recently learning about the S3A committers. I'm hoping to provide
> some improvements as someone who has recently read [this
> documentation|https://github.com/apache/hadoop/blob/1f157f802d2d6142d21482eaa86baf1bef458ed4/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md#L495]
> without fully understanding the committers beforehand.
> For instance, referencing different components more explicitly and adding
> pre-requisite info.