[
https://issues.apache.org/jira/browse/HADOOP-18304?focusedWorklogId=784138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784138
]
ASF GitHub Bot logged work on HADOOP-18304:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 23/Jun/22 11:11
Start Date: 23/Jun/22 11:11
Worklog Time Spent: 10m
Work Description: dannycjones commented on code in PR #4478:
URL: https://github.com/apache/hadoop/pull/4478#discussion_r904893346
##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md:
##########
@@ -492,18 +484,19 @@ was written. With the policy of `append`, the new file would be added to
the existing set of files.
-### Notes
+### Notes on using Staging Committers
1. A deep partition tree can itself be a performance problem in S3 and the s3a client,
-or, more specifically. a problem with applications which use recursive directory tree
+or more specifically a problem with applications which use recursive directory tree
walks to work with data.
1. The outcome if you have more than one job trying simultaneously to write data
to the same destination with any policy other than "append" is undefined.
1. In the `append` operation, there is no check for conflict with file names.
-If, in the example above, the file `log-20170228.avro` already existed,
-it would be overridden. Set `fs.s3a.committer.staging.unique-filenames` to `true`
+If the file `log-20170228.avro` in the example above already existed, it would be overwritten.
+
+ Set `fs.s3a.committer.staging.unique-filenames` to `true`
Review Comment:
I believe indenting the line like this lets you put the sentence on a new
line while keeping it part of the previous list item.
That said, this is not obvious from the Markdown source, and I cannot test the
output HTML, so I'll revert.
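For reference, a minimal sketch of the list-continuation behaviour discussed in this comment. It assumes the site generator follows CommonMark-style list rules, which has not been verified against the Hadoop site build. The item text is placeholder wording, not taken from the PR. A paragraph that follows a blank line and is indented to the column where the item text starts (three spaces for a `1. ` marker) stays inside that list item; an unindented paragraph would end the list instead.

```markdown
1. First sentence of the list item.
   A wrapped line of the same paragraph.

   A second paragraph, indented three spaces, still inside item 1.
1. The next list item.

This unindented paragraph is outside the list.
```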
Issue Time Tracking
-------------------
Worklog Id: (was: 784138)
Time Spent: 1.5h (was: 1h 20m)
> Improve S3A committers documentation clarity
> --------------------------------------------
>
> Key: HADOOP-18304
> URL: https://issues.apache.org/jira/browse/HADOOP-18304
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: documentation
> Reporter: Daniel Carl Jones
> Assignee: Daniel Carl Jones
> Priority: Trivial
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> I was recently learning more about the S3A committers. As someone who has
> just read [this
> documentation|https://github.com/apache/hadoop/blob/1f157f802d2d6142d21482eaa86baf1bef458ed4/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md#L495]
> without much prior understanding, I'm hoping to contribute some improvements:
> for instance, referencing different components more explicitly and adding
> prerequisite information.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)