tjdean opened a new issue #2253: URL: https://github.com/apache/iceberg/issues/2253
## Issue

`BaseParquetReaders` reads dates/times/timestamps into memory as `Temporals`, but their types map to `Numbers`. When residuals are evaluated on those types, the accessor tries to cast the temporal to a number and fails with an illegal state.

https://github.com/apache/iceberg/blob/6bf58e5037c22e4c6255c4bf01109fb45651ce6b/parquet/src/main/java/org/apache/iceberg/data/parquet/BaseParquetReaders.java#L326-L346

https://github.com/apache/iceberg/blob/6bf58e5037c22e4c6255c4bf01109fb45651ce6b/api/src/main/java/org/apache/iceberg/types/Type.java#L35-L37

## Example

Note that I am using Hive 3 which, [according to the docs, is not supported](https://iceberg.apache.org/hive/#hive-read-support), but this error occurs outside of the Hive version specific object inspectors.

```
0: jdbc:hive2://> SELECT d_date FROM TPCDS_PARQUET_2_ICE.DATE_DIM WHERE d_date = '2000-08-05';
INFO : Compiling command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e): SELECT d_date FROM TPCDS_PARQUET_2_ICE.DATE_DIM WHERE d_date = '2000-08-05'
INFO : No Stats for tpcds_parquet_2_ice@date_dim, Columns: d_date
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:d_date, type:date, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e); Time taken: 7.886 seconds
INFO : Executing command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e): SELECT d_date FROM TPCDS_PARQUET_2_ICE.DATE_DIM WHERE d_date = '2000-08-05'
INFO : Completed executing command(queryId=hive_20210218150052_eb87c99c-d57d-41da-8c78-ccb07556bb1e); Time taken: 0.12 seconds
INFO : OK
Error: java.io.IOException: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1900-01-02 (state=,code=0)
```

## Stack

```
Caused by: java.io.IOException: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1900-01-02
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:640) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:547) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:880) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:243) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:471) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	... 13 more
Caused by: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1900-01-02
	at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:57) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:44) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.BoundReference.eval(BoundReference.java:39) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eq(Evaluator.java:121) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eq(Evaluator.java:51) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.ExpressionVisitors$BoundVisitor.predicate(ExpressionVisitors.java:229) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.ExpressionVisitors.visitEvaluator(ExpressionVisitors.java:322) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eval(Evaluator.java:56) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.Evaluator$EvalVisitor.access$100(Evaluator.java:51) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.expressions.Evaluator.eval(Evaluator.java:48) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.lambda$applyResidualFiltering$0(IcebergInputFormat.java:283) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.io.CloseableIterable$3.shouldKeep(CloseableIterable.java:82) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.io.FilterIterator.advance(FilterIterator.java:67) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.io.FilterIterator.hasNext(FilterIterator.java:50) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextKeyValue(IcebergInputFormat.java:197) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.next(MapredIcebergInputFormat.java:104) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.next(MapredIcebergInputFormat.java:81) ~[iceberg-hive-runtime-0.10.0.jar:?]
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:607) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:547) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:880) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:243) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:471) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
	... 13 more
```
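
For illustration, the mismatch described under "Issue" can be reproduced without Iceberg on the classpath. The following is a minimal sketch, not Iceberg's actual code: `TemporalAccessSketch`, `get`, and `daysFromEpoch` are hypothetical names, and the typed getter only imitates the kind of class check that the stack trace suggests `GenericRecord.get` performs. It assumes the date column's Java class is `Integer` (days from 1970-01-01) while the reader hands back a `LocalDate`.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical stand-in for the failing accessor path; not Iceberg's actual code.
final class TemporalAccessSketch {
  private static final LocalDate EPOCH = LocalDate.of(1970, 1, 1);

  // Imitates a typed getter that rejects values of an unexpected class.
  static <T> T get(Object value, Class<T> javaClass) {
    if (value == null || javaClass.isInstance(value)) {
      return javaClass.cast(value);
    }
    throw new IllegalStateException(
        "Not an instance of " + javaClass.getName() + ": " + value);
  }

  // Converts a date to days since 1970-01-01, the form an Integer-typed
  // date column would expect (an assumption based on the issue text).
  static int daysFromEpoch(LocalDate date) {
    return (int) ChronoUnit.DAYS.between(EPOCH, date);
  }

  public static void main(String[] args) {
    LocalDate fromReader = LocalDate.parse("1900-01-02");

    // Asking for an Integer while holding a LocalDate fails the class check.
    try {
      get(fromReader, Integer.class);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }

    // Converting to the numeric representation first satisfies the same check.
    Integer days = get(daysFromEpoch(fromReader), Integer.class);
    System.out.println(days);
  }
}
```

Whether the conversion belongs in the reader, the accessor, or the residual evaluation is a design question this sketch does not try to answer; it only shows that the value and the expected class disagree.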
