rymurr commented on pull request #1843: URL: https://github.com/apache/iceberg/pull/1843#issuecomment-737921384
Hey @jackye1995 thanks for the comments! The first approach in this patch was trying to tackle the problem with a new catalog. You can see it [here](https://github.com/apache/iceberg/pull/1843/commits/9cf2bede01a9194e730f78888914bd9c6570aee3). On the plus side it keeps all the filesystem handling in 1 class but it does have the downside of creating a lot of duplicate code and wasn't super nice. @RussellSpitzer suggested the approach in this PR and I think I prefer it now too, it keeps the scope if this change very small too. What I am struggling w/ at the moment is: * how to robustly identify a path in an `Identifier` (eg not rely solely on the presence of `/`) * how to correctly handle edge cases like ```catalog.namespace.`file://path/to/table` ```. What does a namespace mean in this case and how do we represent a path as an Identifier or TableIdentifier. For the second point I think we should reject any namespace elements and treat a path as a namespaceless table with the full table path as the table/identifier name. For the first part I think the only reliable way is to try to do IO with the path: can we get an FS from the path, is it the local fs, is it a relative path etc. Thoughts? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
