[ https://issues.apache.org/jira/browse/HADOOP-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17613977#comment-17613977 ]
ASF GitHub Bot commented on HADOOP-18304:
-----------------------------------------

dannycjones commented on code in PR #4478:
URL: https://github.com/apache/hadoop/pull/4478#discussion_r989873742

##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md:
##########
@@ -180,20 +180,20 @@ and restarting the job.
 whose output is in the job attempt directory, *and only rerunning all uncommitted tasks*.

-This algorithm does not works safely or swiftly with AWS S3 storage because
-tenames go from being fast, atomic operations to slow operations which can fail partway through.
+This algorithm does not work safely or swiftly with AWS S3 storage because
+renames go from being fast, atomic operations to slow operations which can fail partway through.

 This then is the problem which the S3A committers address:

-*How to safely and reliably commit work to Amazon S3 or compatible object store*
+>*How to safely and reliably commit work to Amazon S3 or compatible object store.*

Review Comment:
Fair point, it is not quoting anyone (or maybe Steve). I will let it join the previous sentence as I think that makes more sense.

##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md:
##########
@@ -356,19 +351,20 @@ task commit.

 However, it has extra requirements of the filesystem

-1. [Obsolete] It requires a consistent object store.
+1. The object store must be consistent.
 1. The S3A client must be configured to recognize interactions
-with the magic directories and treat them specially.
+with the magic directories and treat them as a special case.

-Now that Amazon S3 is consistent, the magic committer is enabled by default.
+Now that [Amazon S3 is consistent](https://aws.amazon.com/s3/consistency/),
+the magic committer is enabled by default.

Review Comment:
Yes, I believe that's correct.
I believe what we're saying here is that S3A's "magic path rewriting" where it only stages the writes to `__magic/` directories is now enabled by default. I will update this to be clearer, something like:

```suggestion
the magic directory path rewriting is enabled by default.
```

> Improve S3A committers documentation clarity
> --------------------------------------------
>
>                 Key: HADOOP-18304
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18304
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: documentation
>            Reporter: Daniel Carl Jones
>            Assignee: Daniel Carl Jones
>            Priority: Trivial
>              Labels: pull-request-available
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> I recently was learning more about the S3A committers. I'm hoping to provide
> some improvements as someone who has recently read [this
> documentation|https://github.com/apache/hadoop/blob/1f157f802d2d6142d21482eaa86baf1bef458ed4/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md#L495]
> without fully understanding prior.
> For instance, referencing different components more explicitly and adding
> pre-requisite info.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org