bodduv commented on PR #14500: URL: https://github.com/apache/iceberg/pull/14500#issuecomment-4612756568
Before the PR was closed, the open issue was how `StrictMetricsEvaluator` should handle UUID comparisons. `StrictMetricsEvaluator` can be seen as an optimization: data file metadata is used to make decisions where possible, which avoids checking each row of the file. When it returns `ROWS_MIGHT_NOT_MATCH` (that is, `false`), the intended optimization cannot be applied safely to the entire file without reading each of its rows. In the commits pushed after the PR was opened, `StrictMetricsEvaluator` returns `ROWS_MIGHT_NOT_MATCH` for UUID comparisons (apart from `isNull` and `notNull` expressions), because its callers treat `true` as proof that every row in a file matches the predicate. Returning `false` keeps `ManifestFilterManager`, overwrite validation, and Spark metadata-delete planning conservative: they respectively avoid whole-file deletion, reject validation, or avoid a metadata-only delete, so that none of them relies on potentially non-RFC-compliant UUID bounds.
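
To make the conservative behavior concrete, here is a minimal, self-contained Java sketch. It is not Iceberg's actual `StrictMetricsEvaluator`: the class `StrictUuidSketch`, the `ColumnStats` holder, the `Op` enum, and `planDelete` are hypothetical names used only for illustration, and the null-count checks are a simplified assumption about how `isNull` and `notNull` can still be answered from metadata.

```java
import java.util.UUID;

// Hypothetical sketch; none of these types are Iceberg classes.
final class StrictUuidSketch {
  static final boolean ROWS_MUST_MATCH = true;
  static final boolean ROWS_MIGHT_NOT_MATCH = false;

  // Illustrative subset of predicate operations on a UUID column.
  enum Op { IS_NULL, NOT_NULL, LT, EQ }

  // Simplified per-file statistics for one UUID column (assumed shape).
  static final class ColumnStats {
    final long valueCount;
    final long nullCount;
    final UUID lowerBound;
    final UUID upperBound;

    ColumnStats(long valueCount, long nullCount, UUID lowerBound, UUID upperBound) {
      this.valueCount = valueCount;
      this.nullCount = nullCount;
      this.lowerBound = lowerBound;
      this.upperBound = upperBound;
    }
  }

  // Returns ROWS_MUST_MATCH only when metadata proves every row matches.
  static boolean strictEval(Op op, ColumnStats stats, UUID literal) {
    switch (op) {
      case IS_NULL:
        // Null counts do not depend on UUID ordering, so they remain usable.
        return stats.nullCount == stats.valueCount ? ROWS_MUST_MATCH : ROWS_MIGHT_NOT_MATCH;
      case NOT_NULL:
        return stats.nullCount == 0 ? ROWS_MUST_MATCH : ROWS_MIGHT_NOT_MATCH;
      default:
        // The literal and the bounds are deliberately ignored: the bounds may
        // have been written with a non-RFC-compliant ordering, so they cannot
        // prove that every row matches.
        return ROWS_MIGHT_NOT_MATCH;
    }
  }

  // Hypothetical caller: drops the whole file only on a strict match.
  static String planDelete(Op op, ColumnStats stats, UUID literal) {
    return strictEval(op, stats, literal) ? "DROP_WHOLE_FILE" : "READ_ROWS";
  }

  public static void main(String[] args) {
    ColumnStats stats = new ColumnStats(
        100, 0,
        UUID.fromString("00000000-0000-0000-0000-000000000001"),
        UUID.fromString("00000000-0000-0000-0000-0000000000ff"));
    UUID literal = UUID.fromString("ffffffff-ffff-ffff-ffff-ffffffffffff");

    System.out.println(planDelete(Op.LT, stats, literal));    // READ_ROWS
    System.out.println(planDelete(Op.NOT_NULL, stats, null)); // DROP_WHOLE_FILE
  }
}
```

In this sketch the `LT` predicate falls back to reading rows even though the bounds appear to sit well below the literal, which mirrors the trade-off described above: a missed optimization is preferred over a whole-file decision based on bounds that might not be trustworthy.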
