vaultah opened a new issue, #15502:
URL: https://github.com/apache/iceberg/issues/15502
### Apache Iceberg version
1.10.1 (latest release)
### Query engine
None
### Please describe the bug 🐞
When an Iceberg table has manifests written using a partition spec with an
`identity`-transformed timestamp field, queries that filter on that field using
a temporal transform like `hours` (or even `bucket`) fail with:
```
ValidationException: Invalid value for conversion to type timestamptz
```
Example verifiable in Iceberg 1.9.2 and 1.10.1:
```
spark-sql ()> CREATE TABLE iceberg.default.identity_projection_bug (
> id BIGINT,
> ts TIMESTAMP
> ) USING iceberg
> PARTITIONED BY (ts);
Time taken: 1.287 seconds
spark-sql ()> INSERT INTO iceberg.default.identity_projection_bug
> VALUES (1, TIMESTAMP '2026-01-06 18:00:00');
Time taken: 2.469 seconds
spark-sql ()> SELECT * FROM iceberg.default.identity_projection_bug
> WHERE system.hours(ts) = 490674;
26/03/03 13:43:40 WARN SparkScanBuilder: Failed to check if hours(ts) = 490674 can be pushed down: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
26/03/03 13:43:40 ERROR SparkSQLDriver: Failed in [SELECT * FROM iceberg.default.identity_projection_bug
WHERE system.hours(ts) = 490674]
org.apache.iceberg.exceptions.ValidationException: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
    at org.apache.iceberg.expressions.UnboundPredicate.bindLiteralOperation(UnboundPredicate.java:176)
    at org.apache.iceberg.expressions.UnboundPredicate.bind(UnboundPredicate.java:122)
    at org.apache.iceberg.expressions.Binder$BindVisitor.predicate(Binder.java:159)
    at org.apache.iceberg.expressions.Binder$BindVisitor.predicate(Binder.java:118)
    at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:347)
    at org.apache.iceberg.expressions.Binder.bind(Binder.java:60)
    at org.apache.iceberg.expressions.ManifestEvaluator.<init>(ManifestEvaluator.java:67)
    at org.apache.iceberg.expressions.ManifestEvaluator.forPartitionFilter(ManifestEvaluator.java:63)
    at org.apache.iceberg.BaseDistributedDataScan.newManifestEvaluator(BaseDistributedDataScan.java:386)
    at org.apache.iceberg.BaseDistributedDataScan.lambda$specCache$6(BaseDistributedDataScan.java:395)
    at org.apache.iceberg.relocated.com.google.common.collect.SingletonImmutableBiMap.forEach(SingletonImmutableBiMap.java:69)
    at org.apache.iceberg.BaseDistributedDataScan.specCache(BaseDistributedDataScan.java:395)
    at org.apache.iceberg.BaseDistributedDataScan.filterManifests(BaseDistributedDataScan.java:219)
    at org.apache.iceberg.BaseDistributedDataScan.findMatchingDeleteManifests(BaseDistributedDataScan.java:211)
    at org.apache.iceberg.BaseDistributedDataScan.doPlanFiles(BaseDistributedDataScan.java:149)
    at org.apache.iceberg.SnapshotScan.planFiles(SnapshotScan.java:139)
    ...
Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
org.apache.iceberg.exceptions.ValidationException: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
    ...
...
spark-sql ()>
```
`BaseDistributedDataScan.newManifestEvaluator` calls
`ManifestEvaluator.forPartitionFilter` with
`Projections.inclusive(...).project(...)`, which uses `Transform.project`,
which is [supposed to create an inclusive
predicate](https://github.com/apache/iceberg/blob/39ed7e445361b2ff83887795e7e8d9f92ed45abc/api/src/main/java/org/apache/iceberg/transforms/Transform.java#L104-L115).
However, for the `Identity` transform, the [`project` method simply calls
`projectStrict`](https://github.com/apache/iceberg/blob/39ed7e445361b2ff83887795e7e8d9f92ed45abc/api/src/main/java/org/apache/iceberg/transforms/Identity.java#L142-L145),
which in the example above tries to bind the integer result of the temporal
transform, 490674, as a filter value for the timestamp field `ts`.
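To make the type mismatch concrete, here is a minimal, dependency-free sketch of the two value domains involved. It is not Iceberg code: the class `HourOrdinalSketch` and its methods are hypothetical names introduced for illustration, and it assumes the session time zone is UTC. It relies only on the table spec's definitions, namely that the `hour` transform yields an integer count of hours since the Unix epoch, while timestamp values are stored as microseconds since the epoch.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Iceberg: contrasts the value domain of
// hours(ts) with the value domain of an identity-partitioned timestamp.
public class HourOrdinalSketch {

    // Whole hours since 1970-01-01T00:00:00Z. For instants at or after the
    // epoch this matches the hour ordinal described in the Iceberg spec.
    public static long hoursSinceEpoch(Instant ts) {
        return ChronoUnit.HOURS.between(Instant.EPOCH, ts);
    }

    // Microseconds since the epoch, the unit used for timestamp values.
    public static long microsSinceEpoch(Instant ts) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, ts);
    }

    // First instant of the hour identified by an hour ordinal.
    public static Instant startOfHour(long hourOrdinal) {
        return Instant.EPOCH.plus(hourOrdinal, ChronoUnit.HOURS);
    }

    public static void main(String[] args) {
        // Row inserted in the reproduction, read as UTC (an assumption).
        Instant row = Instant.parse("2026-01-06T18:00:00Z");

        System.out.println(hoursSinceEpoch(row));   // 491034
        System.out.println(microsSinceEpoch(row));  // 1767722400000000
        System.out.println(startOfHour(490674L));   // 2025-12-22T18:00:00Z
    }
}
```

The literal `490674` is an hour ordinal, several orders of magnitude away from the microsecond values that an identity partition on `ts` holds, so a predicate that carries it over unchanged cannot be bound to a `timestamptz` field. Under the UTC assumption the inserted row falls in hour 491034 rather than 490674, but the stack trace shows the failure happening while the manifest filter is bound, before any partition values are compared.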
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time