szehon-ho commented on issue #4346: URL: https://github.com/apache/iceberg/issues/4346#issuecomment-1076850852
Yeah, actually the idea is the same as your original, just that we:

- rename the flag
- add the error mode (to inform users that they were about to delete valid files)
- combine the authority/scheme into 'prefix'

Maybe `delete` => `force-delete`, to make it clear they should not do this unless they are sure.

For the algorithm, yeah, that works. If we want to optimize further, we could even conditionally skip the prefix-less comparison for non-error modes, like in the original algorithm.

I'm open to reconsidering if error mode proves too cumbersome to be useful. Initially I was thinking of it as a safety that users must turn off if they are trying to delete on a different absolute location than the one the files were written with. Users could go back to running RemoveOrphan with the default 'error' mode once the table is fixed with all locations pointing to the new prefix. Maybe that can be done via RepairManifests with an option to rewrite the prefix, or, once relative paths are there, we can change the root location.

Yeah, I didn't initially think to distinguish scheme/authority, and just thought of the prefix as different if either is different. When the bucket is different, we should throw an exception, right? (The user tries to clean a different bucket than the one the table initially wrote to.) Though I can see that the other case, where only the scheme is different (s3 => s3a), is debatable. I was thinking to avoid too many details in the config and just have them set prefix-mismatch-mode='ignore' in this case, but we could add another flag if we really need to. Again, the user could fix s3 to s3a in the paths, using some of these to-be-developed features.
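To make the proposed modes concrete, here is a rough Python sketch of how the prefix-less comparison could behave. It is only an illustration of the idea discussed above, not Iceberg's implementation: the function names `split_prefix` and `find_orphans` are hypothetical, and the mode strings 'error', 'ignore', and 'delete' are assumed from the proposal (with 'delete' possibly being renamed to 'force-delete').

```python
from urllib.parse import urlsplit

MODES = ("error", "ignore", "delete")  # assumed mode names from the proposal


def split_prefix(location):
    """Split a location into (prefix, path); prefix is scheme plus authority."""
    parts = urlsplit(location)
    return f"{parts.scheme}://{parts.netloc}", parts.path


def find_orphans(actual_files, valid_files, prefix_mismatch_mode="error"):
    """Return the files in actual_files that are not referenced by valid_files.

    Hypothetical helper: files are first matched on their prefix-less path,
    and the mode decides what happens when the path matches but the prefix
    (scheme/authority) does not.
    """
    if prefix_mismatch_mode not in MODES:
        raise ValueError(f"Unknown prefix mismatch mode: {prefix_mismatch_mode}")

    # Index the table's valid files by prefix-less path.
    valid_by_path = {}
    for location in valid_files:
        prefix, path = split_prefix(location)
        valid_by_path.setdefault(path, set()).add(prefix)

    orphans = []
    for location in actual_files:
        prefix, path = split_prefix(location)
        known_prefixes = valid_by_path.get(path)
        if known_prefixes is None:
            # No valid file has this path at all: a true orphan.
            orphans.append(location)
        elif prefix in known_prefixes:
            # Exact match: the file is referenced by the table.
            continue
        elif prefix_mismatch_mode == "error":
            # Safety default: the user may be about to delete a valid file.
            raise ValueError(
                f"Prefix mismatch for {location}: table uses {sorted(known_prefixes)}"
            )
        elif prefix_mismatch_mode == "delete":
            # User explicitly accepts deleting files under a different prefix.
            orphans.append(location)
        # 'ignore': treat the file as valid and keep it.
    return orphans
```

For example, with a table written as `s3://bucket/...` and a listing done through `s3a://bucket/...`, 'ignore' keeps the matching file, 'delete' reports it as an orphan, and the default 'error' raises so the user notices the mismatch before anything is removed.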
