steveloughran commented on code in PR #3289:
URL: https://github.com/apache/hadoop/pull/3289#discussion_r892330907


##########
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdataoutputstreambuilder.md:
##########
@@ -182,3 +182,50 @@ see `FileSystem#create(path, ...)` and `FileSystem#append()`.
     result = FSDataOutputStream
 
 The result is `FSDataOutputStream` to be used to write data to filesystem.
+
+
+## <a name="s3a"></a> S3A-specific options
+
+Here are the custom options which the S3A Connector supports.
+
+| Name                        | Type      | Meaning                                |
+|-----------------------------|-----------|----------------------------------------|
+| `fs.s3a.create.performance` | `boolean` | create a file with maximum performance |
+| `fs.s3a.create.header`      | `string`  | prefix for user supplied headers       |
+
+### `fs.s3a.create.performance`
+
+Prioritize file creation performance over safety checks for filesystem consistency.
+
+This
+1. Skips the `LIST` call which makes sure a file is being created over a directory.
+   Risk: a file is created over a directory.
+1. Ignores the overwrite flag.
+1. Never issues a `DELETE` call to delete parent directory markers.
+   Risk: if directory markers have not already been cleaned up, such as when creating the

Review Comment:
   Things like rename and delete on the file may not find the subdirectory.
   
https://github.com/steveloughran/hadoop/blob/s3/HADOOP-17833-magic-committer-performance/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/performance/ITestCreateFileCost.java#L213
   
   Nothing is lost; it's just that things like tree walking may not find them.
   This is what that LIST is designed to prevent. Here you say "I know what I am
   doing", and if you didn't, well, too bad. The price of performance is safety, as
   BMW and Porsche like to point out.
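
To make the trade-off concrete, here is a minimal sketch in plain Java. It is a toy in-memory model of a flat key space, not the S3A connector: `ToyObjectStore`, `hasChildren`, `create` and `treeWalk` are hypothetical names invented for illustration, and the unchecked exception is used only for brevity. It shows a file being created over a directory when the `LIST` probe is skipped, after which the existing child object is still stored but a simplified tree walk no longer reaches it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of a flat object store. Hypothetical; NOT the S3A implementation. */
final class ToyObjectStore {
  // Flat, sorted key space: "directories" exist only as key prefixes.
  private final TreeMap<String, byte[]> objects = new TreeMap<>();

  /** Simulated LIST probe: is there any object under key + "/"? */
  boolean hasChildren(String key) {
    String prefix = key + "/";
    String next = objects.ceilingKey(prefix);
    return next != null && next.startsWith(prefix);
  }

  /** Create a file; when performance is true the LIST probe is skipped. */
  void create(String key, byte[] data, boolean performance) {
    if (!performance && hasChildren(key)) {
      throw new IllegalStateException(key + " is a directory");
    }
    objects.put(key, data);
  }

  /** Simplified tree walk: a path that exists as a file is a leaf and is never descended into. */
  List<String> treeWalk(String dir) {
    String prefix = dir + "/";
    TreeSet<String> children = new TreeSet<>();
    for (String key : objects.tailMap(prefix, true).keySet()) {
      if (!key.startsWith(prefix)) {
        break; // keys are sorted, so everything under the prefix has been seen
      }
      String rest = key.substring(prefix.length());
      if (rest.isEmpty()) {
        continue; // ignore a marker-style key equal to the prefix itself
      }
      int slash = rest.indexOf('/');
      children.add(slash < 0 ? rest : rest.substring(0, slash));
    }
    List<String> found = new ArrayList<>();
    for (String child : children) {
      String path = prefix + child;
      if (objects.containsKey(path)) {
        found.add(path);                 // file: reported, not descended into
      } else {
        found.addAll(treeWalk(path));    // implied directory: recurse
      }
    }
    return found;
  }

  public static void main(String[] args) {
    ToyObjectStore store = new ToyObjectStore();
    store.create("data/out/part-0000", new byte[0], false);
    try {
      store.create("data/out", new byte[0], false);   // safe path: probe rejects this
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    store.create("data/out", new byte[0], true);       // performance path: no probe
    System.out.println("walk sees: " + store.treeWalk("data"));
    System.out.println("child still stored: " + store.hasChildren("data/out"));
  }
}
```

In this model the child object is never deleted; it only becomes unreachable to the walk, because `data/out` now resolves as a file. How the real connector's listing, rename and delete calls behave in that state is not modelled here.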



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

