[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833617#comment-17833617 ]
ASF GitHub Bot commented on HADOOP-19072:
-----------------------------------------

steveloughran commented on code in PR #6543:
URL: https://github.com/apache/hadoop/pull/6543#discussion_r1549880873


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MkdirOperation.java:
##########
@@ -124,7 +132,32 @@ public Boolean execute() throws IOException {
       return true;
     }

-    // Walk path to root, ensuring closest ancestor is a directory, not file
+    // if performance creation mode is set, no need to check

Review Comment:
   I was wrong; now that the patch is in, I see where I was mistaken. It's the versioned buckets where problems surface. Sorry!


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MkdirOperation.java:
##########
@@ -98,14 +107,13 @@ public Boolean execute() throws IOException {
     }

     // get the file status of the path.
-    // this is done even for a magic path, to avoid always issuing PUT
-    // requests. Doing that without a check wouild seem to be an
-    // optimization, but it is not because
-    // 1. PUT is slower than HEAD
-    // 2. Write capacity is less than read capacity on a shard
-    // 3. It adds needless entries in versioned buckets, slowing
-    //    down subsequent operations.
-    FileStatus fileStatus = getPathStatusExpectingDir(dir);
+    // this is not done for magic path i.e. performanceCreation mode.
+    // For performanceCreation mode, we would probe for HEAD only.
+    // For non-performance or regular mode, the probe for both HEAD and LIST would
+    // be done.
+    S3AFileStatus fileStatus = performanceCreation
+        ? probePathStatusOrNull(dir, StatusProbeEnum.HEAD_ONLY)

Review Comment:
   This will trigger a needless PUT if there isn't a marker, just children, which hits all the problems listed in the comments of the existing code. Afraid we need to revert to the old code.

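To illustrate the concern in the second review comment, here is a minimal sketch using a toy in-memory bucket. It is not the S3A implementation: the class `MkdirProbeSketch` and all of its methods are hypothetical names invented for this example, and the "bucket" is just a sorted set of keys. It assumes a directory can exist implicitly (children under a prefix, no marker object), which is the case the comment describes.

```java
import java.util.TreeSet;

/** Toy in-memory object store; a hypothetical model, not the S3A code. */
class MkdirProbeSketch {
    /** Object keys in the pretend bucket, kept sorted for prefix scans. */
    private final TreeSet<String> keys = new TreeSet<>();

    /** Number of directory-marker PUTs issued by the mkdir variants below. */
    public int markerPuts = 0;

    /** Seed the bucket with an object, as an earlier file upload would. */
    public void addObject(String key) {
        keys.add(key);
    }

    /** HEAD: true only when an object exists at exactly this key. */
    public boolean head(String key) {
        return keys.contains(key);
    }

    /** LIST: true when at least one object key starts with the prefix. */
    public boolean list(String prefix) {
        String first = keys.ceiling(prefix);
        return first != null && first.startsWith(prefix);
    }

    /** Write a directory marker and count the PUT. */
    private void putMarker(String dir) {
        keys.add(dir + "/");
        markerPuts++;
    }

    /** mkdir that probes with HEAD only: it cannot see an implicit directory. */
    public void mkdirHeadOnly(String dir) {
        if (head(dir)) {
            throw new IllegalStateException("file exists at " + dir);
        }
        if (!head(dir + "/")) {
            putMarker(dir); // needless when children already exist
        }
    }

    /** mkdir that probes with HEAD and LIST: children count as "directory exists". */
    public void mkdirHeadAndList(String dir) {
        if (head(dir)) {
            throw new IllegalStateException("file exists at " + dir);
        }
        if (!head(dir + "/") && !list(dir + "/")) {
            putMarker(dir);
        }
    }

    public static void main(String[] args) {
        MkdirProbeSketch headOnly = new MkdirProbeSketch();
        headOnly.addObject("data/part-0000"); // child exists, no marker
        headOnly.mkdirHeadOnly("data");
        System.out.println("HEAD only PUTs: " + headOnly.markerPuts); // prints 1

        MkdirProbeSketch headAndList = new MkdirProbeSketch();
        headAndList.addObject("data/part-0000");
        headAndList.mkdirHeadAndList("data");
        System.out.println("HEAD+LIST PUTs: " + headAndList.markerPuts); // prints 0
    }
}
```

In this model the HEAD-only variant writes a marker even though `data/` already has a child, while the HEAD plus LIST variant issues no PUT. On a versioned bucket each such marker write would add another entry, which is the cost the removed code comment warns about.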
> S3A: expand optimisations on stores with "fs.s3a.create.performance"
> --------------------------------------------------------------------
>
>                 Key: HADOOP-19072
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19072
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> * mkdir to skip parent directory check: just do a HEAD to see if there's a
> file at the target location