vaultah opened a new issue, #15502:
URL: https://github.com/apache/iceberg/issues/15502

   ### Apache Iceberg version
   
   1.10.1 (latest release)
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   When an Iceberg table has manifests written using a partition spec with an 
`identity`-transformed timestamp field, queries that filter on that field 
through a transform such as `hours` (or even `bucket`) fail with
   
   ```
   ValidationException: Invalid value for conversion to type timestamptz
   ```
   
   The following example reproduces the failure in Iceberg 1.9.2 and 1.10.1:
   
   ```
   spark-sql ()> CREATE TABLE iceberg.default.identity_projection_bug (
               >     id BIGINT,
               >     ts TIMESTAMP
               > ) USING iceberg
               > PARTITIONED BY (ts);
   Time taken: 1.287 seconds
   spark-sql ()> INSERT INTO iceberg.default.identity_projection_bug
               > VALUES (1, TIMESTAMP '2026-01-06 18:00:00');
   Time taken: 2.469 seconds
   spark-sql ()> SELECT * FROM iceberg.default.identity_projection_bug
               > WHERE system.hours(ts) = 490674;
   26/03/03 13:43:40 WARN SparkScanBuilder: Failed to check if hours(ts) = 490674 can be pushed down: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
   26/03/03 13:43:40 ERROR SparkSQLDriver: Failed in [SELECT * FROM iceberg.default.identity_projection_bug
   WHERE system.hours(ts) = 490674]
   org.apache.iceberg.exceptions.ValidationException: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
       at org.apache.iceberg.expressions.UnboundPredicate.bindLiteralOperation(UnboundPredicate.java:176)
       at org.apache.iceberg.expressions.UnboundPredicate.bind(UnboundPredicate.java:122)
       at org.apache.iceberg.expressions.Binder$BindVisitor.predicate(Binder.java:159)
       at org.apache.iceberg.expressions.Binder$BindVisitor.predicate(Binder.java:118)
       at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:347)
       at org.apache.iceberg.expressions.Binder.bind(Binder.java:60)
       at org.apache.iceberg.expressions.ManifestEvaluator.<init>(ManifestEvaluator.java:67)
       at org.apache.iceberg.expressions.ManifestEvaluator.forPartitionFilter(ManifestEvaluator.java:63)
       at org.apache.iceberg.BaseDistributedDataScan.newManifestEvaluator(BaseDistributedDataScan.java:386)
       at org.apache.iceberg.BaseDistributedDataScan.lambda$specCache$6(BaseDistributedDataScan.java:395)
       at org.apache.iceberg.relocated.com.google.common.collect.SingletonImmutableBiMap.forEach(SingletonImmutableBiMap.java:69)
       at org.apache.iceberg.BaseDistributedDataScan.specCache(BaseDistributedDataScan.java:395)
       at org.apache.iceberg.BaseDistributedDataScan.filterManifests(BaseDistributedDataScan.java:219)
       at org.apache.iceberg.BaseDistributedDataScan.findMatchingDeleteManifests(BaseDistributedDataScan.java:211)
       at org.apache.iceberg.BaseDistributedDataScan.doPlanFiles(BaseDistributedDataScan.java:149)
       at org.apache.iceberg.SnapshotScan.planFiles(SnapshotScan.java:139)
       ...
   Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
   org.apache.iceberg.exceptions.ValidationException: Invalid value for conversion to type timestamptz: 490674 (java.lang.Integer)
       ...
   spark-sql ()>
   ```
   
   `BaseDistributedDataScan.newManifestEvaluator` calls 
`ManifestEvaluator.forPartitionFilter` with 
`Projections.inclusive(...).project(...)`, which uses `Transform.project`, 
which is [supposed to create an inclusive 
predicate](https://github.com/apache/iceberg/blob/39ed7e445361b2ff83887795e7e8d9f92ed45abc/api/src/main/java/org/apache/iceberg/transforms/Transform.java#L104-L115).
 However, for the `Identity` transform, the [`project` method simply calls 
`projectStrict`](https://github.com/apache/iceberg/blob/39ed7e445361b2ff83887795e7e8d9f92ed45abc/api/src/main/java/org/apache/iceberg/transforms/Identity.java#L142-L145),
 which in the example above results in an attempt to bind the integer result of 
the temporal transform, 490674, as a filter value for the timestamp field `ts`.
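
To make the type mismatch concrete, here is a small Python sketch (not Iceberg code) of the two value domains involved. It assumes, following the Iceberg table spec's description of the `hours` transform, that the transform result is the number of whole hours since 1970-01-01 00:00 UTC, and that timestamp values are microseconds since the same epoch. The helper names `hours_since_epoch`, `hour_ordinal_to_datetime` and `micros_since_epoch` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both value domains are measured from the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hours_since_epoch(ts: datetime) -> int:
    """Hour ordinal: whole hours elapsed since the epoch (floor division)."""
    return (ts - EPOCH) // timedelta(hours=1)


def hour_ordinal_to_datetime(ordinal: int) -> datetime:
    """Start of the hour that a given hour ordinal denotes."""
    return EPOCH + timedelta(hours=ordinal)


def micros_since_epoch(ts: datetime) -> int:
    """Microsecond count, the domain a timestamp column value lives in."""
    return (ts - EPOCH) // timedelta(microseconds=1)


if __name__ == "__main__":
    literal = 490674  # the integer from the failing predicate
    hour_start = hour_ordinal_to_datetime(literal)
    print("hour ordinal :", literal)
    print("denotes hour :", hour_start.isoformat())
    print("as micros    :", micros_since_epoch(hour_start))
```

Under these assumptions the literal `490674` names a single hour, while a value directly comparable to `ts` would be a microsecond count many orders of magnitude larger. That is consistent with the binder rejecting the integer when the predicate reaches it with the transform stripped and `ts` as the bare term.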
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time

