tjdean opened a new issue #2253:
URL: https://github.com/apache/iceberg/issues/2253


   ## Issue
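   `BaseParquetReaders` reads dates, times, and timestamps into memory as `Temporals`, but the corresponding Iceberg types map to `Numbers` (for example, `java.lang.Integer` for a date column, per the error below). When a residual is evaluated on a column of one of those types, the accessor tries to cast the temporal value to a number and fails with an `IllegalStateException`.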
   
   
https://github.com/apache/iceberg/blob/6bf58e5037c22e4c6255c4bf01109fb45651ce6b/parquet/src/main/java/org/apache/iceberg/data/parquet/BaseParquetReaders.java#L326-L346
   
   
https://github.com/apache/iceberg/blob/6bf58e5037c22e4c6255c4bf01109fb45651ce6b/api/src/main/java/org/apache/iceberg/types/Type.java#L35-L37
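
The mismatch can be reproduced in isolation. The sketch below is not Iceberg code: `TemporalCastDemo` and its `get` helper are hypothetical stand-ins for a typed record getter that checks the stored value against the Java class the column type maps to. It assumes, as the error message in this report suggests, that a date column is expected to hold an `Integer` while the reader has stored a `LocalDate`.

```java
import java.time.LocalDate;

public class TemporalCastDemo {
    // Hypothetical stand-in for a typed getter: returns the value only if it
    // is an instance of the class that the column type maps to.
    static <T> T get(Object value, Class<T> javaClass) {
        if (value == null || javaClass.isInstance(value)) {
            return javaClass.cast(value);
        }
        throw new IllegalStateException(
            "Not an instance of " + javaClass.getName() + ": " + value);
    }

    public static void main(String[] args) {
        // What the reader placed in memory for the date column (assumption).
        Object stored = LocalDate.of(1900, 1, 2);
        try {
            // What the residual evaluator asks for when the type maps to Integer.
            get(stored, Integer.class);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running `main` prints a message of the same shape as the one in the stack trace, because a `LocalDate` can never pass an `Integer` instance check.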
   
   ## Example
   Note that I am using Hive 3, which [according to the docs is not supported](https://iceberg.apache.org/hive/#hive-read-support), but this error occurs outside of the Hive-version-specific object inspectors.
   ```
   0: jdbc:hive2://> SELECT d_date FROM TPCDS_PARQUET_2_ICE.DATE_DIM WHERE d_date = '2000-08-05';
   INFO  : Compiling command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e): SELECT d_date FROM TPCDS_PARQUET_2_ICE.DATE_DIM WHERE d_date = '2000-08-05'
   INFO  : No Stats for tpcds_parquet_2_ice@date_dim, Columns: d_date
   INFO  : Semantic Analysis Completed (retrial = false)
   INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:d_date, type:date, comment:null)], properties:null)
   INFO  : Completed compiling command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e); Time taken: 7.886 seconds
   INFO  : Executing command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e): SELECT d_date FROM TPCDS_PARQUET_2_ICE.DATE_DIM WHERE d_date = '2000-08-05'
   INFO  : Completed executing command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e); Time taken: 0.12 seconds
   INFO  : OK
   Error: java.io.IOException: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1900-01-02 (state=,code=0)
   ```
   
   ## Stack
   ```
   Caused by: java.io.IOException: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1900-01-02
           at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:640) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:547) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:880) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:243) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:471) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           ... 13 more
   Caused by: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1900-01-02
           at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:57) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:44) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.BoundReference.eval(BoundReference.java:39) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eq(Evaluator.java:121) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eq(Evaluator.java:51) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.ExpressionVisitors$BoundVisitor.predicate(ExpressionVisitors.java:229) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.ExpressionVisitors.visitEvaluator(ExpressionVisitors.java:322) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eval(Evaluator.java:56) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.access$100(Evaluator.java:51) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.expressions.Evaluator.eval(Evaluator.java:48) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.lambda$applyResidualFiltering$0(IcebergInputFormat.java:283) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.io.CloseableIterable$3.shouldKeep(CloseableIterable.java:82) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.io.FilterIterator.advance(FilterIterator.java:67) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.io.FilterIterator.hasNext(FilterIterator.java:50) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextKeyValue(IcebergInputFormat.java:197) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.next(MapredIcebergInputFormat.java:104) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.next(MapredIcebergInputFormat.java:81) ~[iceberg-hive-runtime-0.10.0.jar:?]
           at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:607) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:547) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:880) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:243) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:471) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
           ... 13 more
   ```
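
One possible direction, offered only as a sketch: convert each temporal value to the numeric form its type expects before the residual is evaluated. The example below assumes the representation described in the Iceberg spec (dates as days from 1970-01-01, times as microseconds from midnight, timestamps as microseconds from the epoch). `TemporalToInternal` and `toInternal` are hypothetical names, not Iceberg API, and this is not necessarily how the project should resolve the issue.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class TemporalToInternal {
    private static final OffsetDateTime EPOCH = Instant.EPOCH.atOffset(ZoneOffset.UTC);

    // Hypothetical helper: maps java.time values to the numeric classes the
    // residual evaluator expects; all other values pass through unchanged.
    static Object toInternal(Object value) {
        if (value instanceof LocalDate) {
            // Days from 1970-01-01, boxed as Integer.
            return Math.toIntExact(((LocalDate) value).toEpochDay());
        }
        if (value instanceof LocalTime) {
            // Microseconds from midnight, boxed as Long.
            return ((LocalTime) value).toNanoOfDay() / 1_000L;
        }
        if (value instanceof LocalDateTime) {
            // Timestamp without zone: microseconds from 1970-01-01T00:00.
            return ChronoUnit.MICROS.between(EPOCH.toLocalDateTime(), (LocalDateTime) value);
        }
        if (value instanceof OffsetDateTime) {
            // Timestamp with zone: microseconds from the UTC epoch.
            return ChronoUnit.MICROS.between(EPOCH, (OffsetDateTime) value);
        }
        return value;
    }

    public static void main(String[] args) {
        Object converted = toInternal(LocalDate.of(2000, 8, 5));
        System.out.println(converted.getClass().getName() + ": " + converted);
    }
}
```

With a conversion like this applied to each field before evaluation, a date column would reach the accessor as an `Integer` and the instance check seen in the stack trace would pass.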




