Fokko commented on code in PR #13398:
URL: https://github.com/apache/iceberg/pull/13398#discussion_r2180800745
##########
api/src/main/java/org/apache/iceberg/expressions/StrictMetricsEvaluator.java:
##########
@@ -69,13 +71,26 @@ public StrictMetricsEvaluator(Schema schema, Expression unbound, boolean caseSen
* otherwise.
*/
public boolean eval(ContentFile<?> file) {
- // TODO: detect the case where a column is missing from the file using file's max field id.
+ if (file.valueCounts() != null) {
+ int maxFieldId = file.valueCounts().keySet().stream().mapToInt(i -> i).max().orElse(0);
Review Comment:
Thanks for taking the suggestion into consideration. Since you already have
the schema, you could also build a set of its field IDs and use that to check
whether a column is missing from the file.
Keep in mind that not all Parquet files have field IDs set. If you convert an
existing Hive table into an Iceberg table, Iceberg uses
[name-mapping](https://iceberg.apache.org/spec/#column-projection) to map the
column names to Iceberg field IDs.
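A rough sketch of the set-based check, in case it helps. It uses plain Java
collections instead of the Iceberg types so it stands alone: the class
`MissingColumnSketch` and the method `missingFieldIds` are hypothetical names,
`schemaFieldIds` stands in for the IDs collected from the schema, and
`valueCounts` stands in for `file.valueCounts()`. It also assumes that an ID
absent from the value counts means the column may be missing, which does not
hold if metrics were simply not collected for that column.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Iceberg.
public class MissingColumnSketch {

  /**
   * Returns the schema field IDs that have no entry in the file's value counts.
   * An empty result means nothing can be shown to be missing.
   */
  public static Set<Integer> missingFieldIds(
      Set<Integer> schemaFieldIds, Map<Integer, Long> valueCounts) {
    if (valueCounts == null) {
      // No metrics at all: we cannot conclude anything about missing columns.
      return Collections.emptySet();
    }

    // Start from every ID the schema knows about, then drop the ones the file reports.
    Set<Integer> missing = new HashSet<>(schemaFieldIds);
    missing.removeAll(valueCounts.keySet());
    return missing;
  }

  /** Convenience check for a single field ID (assumption: absent count = possibly missing). */
  public static boolean mayBeMissing(
      int fieldId, Set<Integer> schemaFieldIds, Map<Integer, Long> valueCounts) {
    return missingFieldIds(schemaFieldIds, valueCounts).contains(fieldId);
  }
}
```

Compared with taking the max field ID, this also covers a column whose ID is
below the max but still absent from the file, for example one that was added
and then had older files left untouched.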
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]