wombatu-kun opened a new pull request, #16609: URL: https://github.com/apache/iceberg/pull/16609
## What

Pushes Iceberg `timestamp_ns` and `timestamptz_ns` predicates down into the ORC reader. Before this change, any filter on a nanosecond-timestamp column threw `UnsupportedOperationException: Type timestamp_ns not supported in ORC SearchArguments` and failed the read.

## Why

`ExpressionToSearchArgument` maps Iceberg types to an ORC `PredicateLeaf.Type` in `type()` and to predicate values in `literal()`, but `TIMESTAMP_NANO` was handled in neither switch, and it was also missing from `UNSUPPORTED_TYPES` (the set of types that degrade gracefully to `YES_NO_NULL`). So a nanosecond-timestamp predicate fell through to the `default:` branch and threw, crashing any filtered read of such a column. This affects both `timestamp_ns` and `timestamptz_ns` (both have type id `TIMESTAMP_NANO`) and every predicate kind, since they all call `type()`.

ORC 1.9.8 represents timestamp predicates with `java.sql.Timestamp`, which carries full nanosecond precision, and evaluates row-level filters at that precision, so the predicate can be pushed down exactly rather than skipped.

## Changes

`ExpressionToSearchArgument` now maps `TIMESTAMP_NANO` to `PredicateLeaf.Type.TIMESTAMP` and converts the nanos-from-epoch literal to a `java.sql.Timestamp`, mirroring the existing micros `TIMESTAMP` handling.

## Tests

- `TestExpressionToSearchArgument#testTimestampNanoTypes` asserts the converted `SearchArgument` for both `timestamp_ns` and `timestamptz_ns`.
- `TestOrcDataReader#testTimestampNanoFilterPushdownRespectsNanoseconds` writes rows that differ only by sub-microsecond nanoseconds and verifies that row-level SARG filtering returns exactly the rows past a sub-microsecond boundary, proving nanosecond precision is honored rather than truncated to micros.
- `TestOrcDataReader#testTimestampTzNanoFilterAcrossTimezones` writes `timestamptz_ns` values in several different zone offsets and filters with a boundary expressed in yet another zone, verifying the comparison is by instant at nanosecond precision.
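
For readers unfamiliar with the conversion described under Changes, the following is a minimal sketch of how a nanos-from-epoch `long` could be turned into a `java.sql.Timestamp` without losing sub-microsecond digits. It is not the code from this PR: the class name `NanoTimestampSketch` and the method `nanosToTimestamp` are hypothetical, and only standard JDK calls are used. Floor division is assumed so that instants before the epoch keep a non-negative nanos field.

```java
import java.sql.Timestamp;

// Hypothetical illustration; not the ExpressionToSearchArgument implementation.
final class NanoTimestampSketch {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Converts nanoseconds since the Unix epoch to a java.sql.Timestamp,
    // keeping all nine fractional-second digits.
    static Timestamp nanosToTimestamp(long nanosFromEpoch) {
        // floorDiv/floorMod keep the nanos field in [0, 999_999_999]
        // even for negative (pre-1970) values.
        long seconds = Math.floorDiv(nanosFromEpoch, NANOS_PER_SECOND);
        int nanoOfSecond = (int) Math.floorMod(nanosFromEpoch, NANOS_PER_SECOND);

        // The constructor takes milliseconds; whole seconds leave nanos at 0,
        // and setNanos then supplies the full fractional part.
        Timestamp ts = new Timestamp(seconds * 1000L);
        ts.setNanos(nanoOfSecond);
        return ts;
    }

    public static void main(String[] args) {
        // Two instants that differ only below microsecond resolution.
        Timestamp boundary = nanosToTimestamp(1_700_000_000_000_000_000L);
        Timestamp later = nanosToTimestamp(1_700_000_000_000_000_500L);

        // A micros-truncating conversion would make these equal;
        // here the comparison still distinguishes them.
        System.out.println("later > boundary: " + (later.compareTo(boundary) > 0));
        System.out.println("nanos field of later: " + later.getNanos());
    }
}
```

The sub-microsecond pair in `main` mirrors the idea behind `testTimestampNanoFilterPushdownRespectsNanoseconds`: if the conversion truncated to micros, the two values would compare equal and a `>` predicate at the boundary would drop the second row.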
