jordepic commented on PR #14501: URL: https://github.com/apache/iceberg/pull/14501#issuecomment-3549221656
> It could be potentially dangerous to allow users to configure this on a per-table basis because cleanup may not be configured, which may result in data that should be deleted, persisting in the file system. There's also nothing that appears to prevent the configuration from being applied to other file-system implementations (like S3A), which would be bad (data copy, no cleanup), but I feel like we should discourage that. @jordepic Is there anything we can do to prevent this?

When you call `trash.isEnabled()`, it delegates to `TrashPolicy.isEnabled()`, and in `TrashPolicyDefault`, `isEnabled()` returns true only when the deletion interval is > 0. So I think this may be a non-issue. It could become an issue if people override their trash policy class with something else.

For reference:

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Trash.java#L62
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicy.java#L142
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicyDefault.java#L126
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Trash.java#L130
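
The delegation chain described in this comment can be modeled with a minimal, self-contained sketch. It deliberately does not use the Hadoop classes: `Policy`, `IntervalPolicy`, `AlwaysOnPolicy`, and `TrashFacade` are hypothetical stand-ins for `TrashPolicy`, `TrashPolicyDefault`, a user-supplied policy, and `Trash`, and reading `fs.trash.interval` (in minutes) from a plain `Map` is an assumption made only for illustration.

```java
import java.util.Map;

public class TrashEnabledSketch {

    /** Hypothetical stand-in for a trash policy; not Hadoop's real class. */
    interface Policy {
        boolean isEnabled();
    }

    /** Models the default behaviour: enabled only when the interval is positive. */
    static final class IntervalPolicy implements Policy {
        private final long deletionIntervalMs;

        IntervalPolicy(Map<String, String> conf) {
            // Assumption: the interval is given in minutes and defaults to 0 (disabled).
            long minutes = Long.parseLong(conf.getOrDefault("fs.trash.interval", "0"));
            this.deletionIntervalMs = minutes * 60_000L;
        }

        @Override
        public boolean isEnabled() {
            return deletionIntervalMs > 0;
        }
    }

    /** Models a custom policy that ignores the interval: the risky case. */
    static final class AlwaysOnPolicy implements Policy {
        @Override
        public boolean isEnabled() {
            return true;
        }
    }

    /** Models the facade whose isEnabled() simply delegates to its policy. */
    static final class TrashFacade {
        private final Policy policy;

        TrashFacade(Policy policy) {
            this.policy = policy;
        }

        boolean isEnabled() {
            return policy.isEnabled();
        }
    }

    public static void main(String[] args) {
        // No interval configured: the default-style policy reports disabled.
        System.out.println(new TrashFacade(new IntervalPolicy(Map.of())).isEnabled());
        // Interval of one day configured: enabled.
        System.out.println(new TrashFacade(
            new IntervalPolicy(Map.of("fs.trash.interval", "1440"))).isEnabled());
        // Custom policy: enabled even though no interval (and so no cleanup) is configured.
        System.out.println(new TrashFacade(new AlwaysOnPolicy()).isEnabled());
    }
}
```

In this model, a file system with no trash interval never reports trash as enabled, which is why the default case looks safe; only a policy that overrides `isEnabled()` independently of the interval reproduces the "data copy, no cleanup" concern.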
