[ 
https://issues.apache.org/jira/browse/HADOOP-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551616#comment-17551616
 ] 

Sam Kramer edited comment on HADOOP-18278 at 6/8/22 1:18 PM:
-------------------------------------------------------------

Hey [[email protected]],

Thank you for your detailed comment; this is exactly what we need :)

I took a quick look at your PR, and from my perspective it will do
exactly what we need. Our workflow is write-once, delete-once,
read-many-times, so we're able to make very strong assumptions (i.e. we do
not need to check whether the file already exists or whether it's a
directory).

Edit: We also know ahead of time what _is_ a directory and what _isn't_.

Any idea when this might be committed / released?

Cheers,

Sam


> Do not perform a LIST call when creating a file
> -----------------------------------------------
>
>                 Key: HADOOP-18278
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18278
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.3
>            Reporter: Sam Kramer
>            Priority: Minor
>
> Hello,
> We've noticed that when creating a file that does not yet exist in S3, an
> extra LIST call is issued to check whether the key is a directory (i.e. if
> key = "bar", an object list request is issued for "bar/").
> Is this really necessary? Shouldn't a HEAD request be sufficient to determine
> whether the object actually exists? As we're creating 1000s of files, this is
> quite expensive, as we're effectively doubling our costs for file creation.
> We're curious whether others have experienced similar or identical issues, or
> whether there are any workarounds.
> [https://github.com/apache/hadoop/blob/516a2a8e440378c868ddb02cb3ad14d0d879037f/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L3359-L3369]
>  
> Thanks,
> Sam
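
The request pattern described in the issue above can be sketched in Java as follows. This is a toy model under stated assumptions, not S3A code: `CountingStore`, `CreateProbeSketch`, `probeBeforeCreate`, and the `skipDirectoryCheck` flag are hypothetical names invented for illustration and do not come from the Hadoop API or from the PR mentioned in the comment. The sketch only models a HEAD probe for the key followed by a LIST probe for `key + "/"`, and counts the requests issued.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for an object store that counts requests. */
final class CountingStore {
    private final Set<String> keys = new HashSet<>();
    int headRequests = 0;
    int listRequests = 0;

    /** Stores an object under the given key (not counted as a probe). */
    void put(String key) {
        keys.add(key);
    }

    /** Models a HEAD request: does an object with exactly this key exist? */
    boolean head(String key) {
        headRequests++;
        return keys.contains(key);
    }

    /** Models a LIST request: is there any object under this prefix? */
    boolean listHasEntries(String prefix) {
        listRequests++;
        for (String k : keys) {
            if (k.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}

final class CreateProbeSketch {
    /**
     * Runs the pre-create probes for a key and returns how many requests
     * were issued. skipDirectoryCheck is an illustrative flag, not a real
     * S3A option.
     */
    static int probeBeforeCreate(CountingStore store, String key,
                                 boolean skipDirectoryCheck) {
        int before = store.headRequests + store.listRequests;
        if (store.head(key)) {
            throw new IllegalStateException("file already exists: " + key);
        }
        // The extra probe from the issue: key "bar" triggers a LIST on "bar/".
        if (!skipDirectoryCheck && store.listHasEntries(key + "/")) {
            throw new IllegalStateException("path is a directory: " + key);
        }
        return store.headRequests + store.listRequests - before;
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        System.out.println(probeBeforeCreate(store, "bar", false)); // prints 2
        System.out.println(probeBeforeCreate(store, "bar", true));  // prints 1
    }
}
```

In this model, creating a new key costs two probe requests with the directory check and one without it. Skipping the check is only safe for workloads like the one described in the comment, where it is known ahead of time which keys are directories.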



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
