anuragmantri commented on issue #4346: URL: https://github.com/apache/iceberg/issues/4346#issuecomment-1070352022
Thanks for reviving this issue, @aokolnychyi. I've been wondering whether relative paths would help in this case. In my [relative paths change](https://github.com/apache/iceberg/pull/2658), I have separated the table location into a prefix and a location. For example:

```
{
  "format-version" : 2,
  "table-uuid" : "24f86bb2-3473-4352-b8fb-375a55b0267b",
  "location" : "tbl",
  "location-prefix" : "file:/var/folders/wr/q_40znsn3_b0n0hx08v15dzr0000gn/T/hive5023759931126715387/hivedb.db/",
  "last-sequence-number" : 2,
  "last-updated-ms" : 1647492090039,
  "last-column-id" : 1,
  "current-schema-id" : 0,
  ...
}
```

Similarly, the location is also split into a prefix and a location in the catalog, with the ability to switch the prefix through an API.

Would this change help with the problem described here if we always pass a relative path in `location => '...'` in the `remove_orphan_files()` call and let the scheme and authority come from the `location-prefix` property above?

In the meantime, having `ignore-*` makes sense to me. I also agree with @rdblue that we need to find consensus on which defaults apply to which schemes.
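
To make the idea concrete, here is a minimal sketch of how a relative location could be resolved against the prefix, so that the scheme and authority always come from `location-prefix`. It is only an illustration under my own assumptions: `resolve_location` and the `s3://bucket/...` and `s3a://bucket/...` prefixes are hypothetical, and none of this is Iceberg's API or the code in the PR above.

```python
from urllib.parse import urlparse


def resolve_location(location_prefix: str, location: str) -> str:
    """Join a relative table location onto a prefix.

    Hypothetical helper for illustration only; not part of Iceberg.
    """
    # Reject anything that carries its own scheme (s3://, file:/, ...) or is
    # an absolute path: the scheme and authority must come from the prefix.
    if urlparse(location).scheme or location.startswith("/"):
        raise ValueError(f"expected a relative location, got: {location!r}")
    return location_prefix.rstrip("/") + "/" + location


# Hypothetical prefixes: switching the prefix changes the scheme while the
# relative table location stays the same.
original = resolve_location("s3://bucket/warehouse/hivedb.db/", "tbl")
switched = resolve_location("s3a://bucket/warehouse/hivedb.db/", "tbl")
```

With this shape, a caller passing `location => 'tbl'` could never introduce a scheme or authority that differs from the one recorded in the table metadata.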
