kevinjqliu opened a new issue, #732: URL: https://github.com/apache/iceberg-cpp/issues/732
`StrictMetricsEvaluator::Evaluate` currently returns `ROWS_MUST_MATCH` when `data_file.record_count <= 0`:

https://github.com/apache/iceberg-cpp/blob/c0c6b01393070b0813f49b0d6220c98256379cef/src/iceberg/expression/strict_metrics_evaluator.cc#L534-L537

This is correct for `record_count == 0`, but not for `record_count == -1`.

iceberg-cpp uses `-1` as an unknown row-count sentinel when writer metrics do not include `row_count`:

- data writer: https://github.com/apache/iceberg-cpp/blob/c0c6b01393070b0813f49b0d6220c98256379cef/src/iceberg/data/data_writer.cc#L80-L86
- equality delete writer: https://github.com/apache/iceberg-cpp/blob/c0c6b01393070b0813f49b0d6220c98256379cef/src/iceberg/data/equality_delete_writer.cc#L80-L86
- position delete writer: https://github.com/apache/iceberg-cpp/blob/c0c6b01393070b0813f49b0d6220c98256379cef/src/iceberg/data/position_delete_writer.cc#L140-L146

Other code already treats negative record count as unknown/missing:

- inclusive metrics: https://github.com/apache/iceberg-cpp/blob/c0c6b01393070b0813f49b0d6220c98256379cef/src/iceberg/expression/inclusive_metrics_evaluator.cc#L507-L516
- count aggregate: https://github.com/apache/iceberg-cpp/blob/c0c6b01393070b0813f49b0d6220c98256379cef/src/iceberg/expression/aggregate.cc#L379-L388

Impact: a file with unknown row count can be treated as if every row must match, even for predicates like `AlwaysFalse`.

Suggested fix: only special-case `record_count == 0`; let negative/unknown counts fall through to normal strict metrics evaluation.

A focused regression test with `record_count = -1` and `AlwaysFalse` fails before changing `<= 0` to `== 0`, and passes after.
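
For illustration only, here is a minimal self-contained sketch of the behavior difference. It does not use the real iceberg-cpp types: `FileStub`, `EvalAlwaysFalse`, `StrictEvalBefore`, `StrictEvalAfter`, and the two boolean constants are hypothetical stand-ins, and the `AlwaysFalse` result is an assumption about what normal strict evaluation would return.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration; not the actual iceberg-cpp API.
constexpr bool kRowsMustMatch = true;
constexpr bool kRowsMightNotMatch = false;

struct FileStub {
  int64_t record_count;  // -1 is the "unknown row count" sentinel
};

// Assumed result of normal strict evaluation for an AlwaysFalse predicate:
// no row can satisfy it, so the file cannot be reported as "must match".
inline bool EvalAlwaysFalse(const FileStub& /*file*/) {
  return kRowsMightNotMatch;
}

// Behavior described in this issue: negative (unknown) counts take the
// empty-file shortcut and are reported as "all rows must match".
inline bool StrictEvalBefore(const FileStub& file) {
  if (file.record_count <= 0) {
    return kRowsMustMatch;
  }
  return EvalAlwaysFalse(file);
}

// Suggested behavior: only a file known to be empty takes the shortcut;
// unknown counts fall through to normal evaluation.
inline bool StrictEvalAfter(const FileStub& file) {
  if (file.record_count == 0) {
    return kRowsMustMatch;
  }
  return EvalAlwaysFalse(file);
}
```

With `record_count = -1`, `StrictEvalBefore` reports that all rows must match while `StrictEvalAfter` does not; both agree for `record_count == 0` and for positive counts.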
