steveloughran commented on code in PR #3289:
URL: https://github.com/apache/hadoop/pull/3289#discussion_r892330907
##########
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdataoutputstreambuilder.md:
##########
@@ -182,3 +182,50 @@ see `FileSystem#create(path, ...)` and
`FileSystem#append()`.
result = FSDataOutputStream
The result is `FSDataOutputStream` to be used to write data to filesystem.
+
+
+## <a name="s3a"></a> S3A-specific options
+
+Here are the custom options which the S3A Connector supports.
+
+| Name                        | Type      | Meaning                                |
+|-----------------------------|-----------|----------------------------------------|
+| `fs.s3a.create.performance` | `boolean` | create a file with maximum performance |
+| `fs.s3a.create.header`      | `string`  | prefix for user supplied headers       |
+
+### `fs.s3a.create.performance`
+
+Prioritize file creation performance over safety checks for filesystem consistency.
+
+This
+1. Skips the `LIST` call which makes sure a file is being created over a directory.
+ Risk: a file is created over a directory.
+1. Ignores the overwrite flag.
+1. Never issues a `DELETE` call to delete parent directory markers.
+ Risk: if directory markers have not already been cleaned up, such as when creating the
Review Comment:
Things like rename and delete on the file may not find the subdirectory.
https://github.com/steveloughran/hadoop/blob/s3/HADOOP-17833-magic-committer-performance/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/performance/ITestCreateFileCost.java#L213
Nothing is lost; it's just that operations such as treewalking may not find the
files underneath. This is what that LIST is designed to prevent. Here you say
"I know what I am doing", and if you didn't, well, too bad. The price of
performance is safety, as BMW and Porsche like to point out.
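
To make the failure mode concrete, here is a minimal sketch of a flat object store in plain Java. It is a toy model, not S3A code: the class `CreateOverDirectoryDemo`, its `store` map and the `create`, `isDirectory` and `treeWalk` helpers are all hypothetical names made up for illustration. The `performance` flag stands in for `fs.s3a.create.performance` and only controls whether the LIST-style check runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy in-memory model of a flat object store; hypothetical, not S3A code. */
final class CreateOverDirectoryDemo {

    /** Object keys mapped to contents. There are no real directories, only keys. */
    static final TreeMap<String, String> store = new TreeMap<>();

    /** True if any object exists under key + "/", i.e. the key acts as a directory. */
    static boolean isDirectory(String key) {
        String prefix = key + "/";
        String next = store.ceilingKey(prefix);
        return next != null && next.startsWith(prefix);
    }

    /** Create a file; the LIST-style safety check is skipped when performance is true. */
    static void create(String key, String data, boolean performance) {
        if (!performance && isDirectory(key)) {
            throw new IllegalStateException(key + " is a directory");
        }
        store.put(key, data);
    }

    /** Walk the tree as a filesystem client would: a file is a leaf and is never descended into. */
    static List<String> treeWalk(String dir) {
        String prefix = dir.isEmpty() ? "" : dir + "/";
        TreeSet<String> children = new TreeSet<>();
        for (String key : store.keySet()) {
            if (!key.startsWith(prefix) || key.length() == prefix.length()) {
                continue;
            }
            String rest = key.substring(prefix.length());
            int slash = rest.indexOf('/');
            children.add(prefix + (slash < 0 ? rest : rest.substring(0, slash)));
        }
        List<String> found = new ArrayList<>();
        for (String child : children) {
            if (store.containsKey(child)) {
                found.add(child);               // a file entry hides anything beneath it
            } else {
                found.addAll(treeWalk(child));  // a pure prefix is treated as a directory
            }
        }
        return found;
    }

    public static void main(String[] args) {
        store.put("data/part-0000", "rows");
        try {
            create("data", "oops", false);      // safe path: the check rejects this
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        create("data", "oops", true);           // performance path: no check
        System.out.println("walk sees: " + treeWalk(""));
        System.out.println("still stored: " + store.containsKey("data/part-0000"));
    }
}
```

In this model the object under `data/` is still in the store after the unchecked create, but the walk stops at the file `data` and never reports it, which is the "nothing is lost, but treewalking may not find it" situation described above.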
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.