karuppayya commented on pull request #1471:
URL: https://github.com/apache/iceberg/pull/1471#issuecomment-947224930


   In the case of `S3FileIO`, the default scheme is `s3://`.
   Writes from different clients will use schemes based on their 
`io-impl` property, so the manifests might contain a mix of `s3://`, `s3a://`, etc.
   But the file listing (in DeleteOrphanFiles) will have only a single 
prefix (which is determined by the client's Hadoop configuration). This will 
result in orphan files not being cleaned.
   When the user is aware that the scheme can be ignored, I think we should 
provide a configuration to do that.
   
   I am not able to come up with a concrete case for the authority (maybe HDFS 
with and without an authority), but that could also be a configuration.
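
   To illustrate the idea, here is a rough sketch of the comparison with the scheme (and optionally the authority) ignored. It is written in Python only for brevity; `normalize`, `find_orphans`, `ignore_scheme`, and `ignore_authority` are hypothetical names for this example, not existing Iceberg APIs or properties, and the paths are made up.

```python
from urllib.parse import urlsplit


def normalize(location, ignore_scheme=True, ignore_authority=False):
    """Reduce a file location to the parts that take part in the comparison.

    Hypothetical helper: the two flags stand in for the proposed configuration.
    """
    parts = urlsplit(location)
    scheme = "" if ignore_scheme else parts.scheme
    authority = "" if ignore_authority else parts.netloc
    return (scheme, authority, parts.path)


def find_orphans(listed, referenced, **flags):
    """Return listed locations that have no match among the referenced ones."""
    valid = {normalize(path, **flags) for path in referenced}
    return [path for path in listed if normalize(path, **flags) not in valid]


# The listing uses the client's scheme (s3a), the manifest entry was written as s3.
listed = ["s3a://bucket/data/a.parquet", "s3a://bucket/data/b.parquet"]
referenced = ["s3://bucket/data/a.parquet"]

# Scheme ignored: only b.parquet is unreferenced.
orphans_ignoring_scheme = find_orphans(listed, referenced)

# Scheme compared strictly: nothing in the listing matches the manifest entry.
orphans_strict = find_orphans(listed, referenced, ignore_scheme=False)

# Possible authority case: the same HDFS path with and without a namenode address.
orphans_hdfs = find_orphans(
    ["hdfs://nn:8020/warehouse/t/f.parquet"],
    ["hdfs:///warehouse/t/f.parquet"],
    ignore_authority=True,
)
```

   With the scheme compared strictly, the listed and referenced locations never line up, which is the mismatch described above. Both flags would default to a strict comparison unless the user opts in.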
   
   @RussellSpitzer @aokolnychyi @rdblue @flyrain @raptond Your thoughts on this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


