singhpk234 commented on issue #4839: URL: https://github.com/apache/iceberg/issues/4839#issuecomment-1134554058
> There's no differentiation between which FileIO implementation should be used for data and metadata files.

Agree with you. As per my understanding, wouldn't this differentiation be done automatically by the `location` produced for the metadata and data paths? For example:

- metadata path: `file://mnt/meta-data`
- data path: `s3://<bucket>/prefix`

(This can be controlled via confs such as `write.metadata.path`.)

Now, when a user uses ResolvingFileIO, it will implicitly use `HadoopFileIO` for metadata (i.e. paths like `file://mnt/meta-data`) and S3FileIO for data (i.e. paths like `s3://<bucket>/prefix`).

> I think the idea behind this ticket is that users would have the capability to explicitly specify which FileIO implementation should be used for data files and which one for metadata files.

This sounds like an interesting use case: why would we want two different FileIOs when, say, data and metadata share the same scheme? One case that comes to mind is wanting separately tuned SDK clients: one S3FileIO object tuned for reading heavy data files, and another for the lightweight metadata files.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
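To make the scheme-based selection described in the comment concrete, here is a minimal Python sketch of the idea. It is not Iceberg's actual `ResolvingFileIO` code: the mapping table, the fallback constant, and the function name `resolve_io_impl` are all hypothetical, and the example bucket name is made up. It only illustrates how a location's URI scheme could decide which FileIO implementation handles it.

```python
from urllib.parse import urlparse

# Hypothetical table for illustration only; not copied from Iceberg.
# It maps a URI scheme to the name of the FileIO that would handle it.
SCHEME_TO_IMPL = {
    "s3": "S3FileIO",
}

# Assumption: any scheme not listed above falls back to HadoopFileIO.
FALLBACK_IMPL = "HadoopFileIO"


def resolve_io_impl(location: str) -> str:
    """Return the FileIO name chosen for a location, based on its scheme."""
    scheme = urlparse(location).scheme
    return SCHEME_TO_IMPL.get(scheme, FALLBACK_IMPL)


if __name__ == "__main__":
    # Metadata written under a file:// path and data under an s3:// path
    # end up with different implementations without any explicit setting.
    for location in ("file://mnt/meta-data", "s3://my-bucket/prefix"):
        print(location, "->", resolve_io_impl(location))
```

Under these assumptions the split between metadata and data follows purely from the configured paths. When both paths share a scheme, a lookup like this returns the same implementation for both, which is the case the second quoted paragraph is about.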
