massdosage commented on pull request #1267:
URL: https://github.com/apache/iceberg/pull/1267#issuecomment-665760682
@guilload I've taken the `iceberg-mr-all.jar` that the above produces and
added it to a Hive client's classpath by doing the following:
```
hive> add jar /home/hadoop/iceberg/0.9.0-SNAPSHOT/iceberg-mr-all.jar;
Added [/home/hadoop/iceberg/0.9.0-SNAPSHOT/iceberg-mr-all.jar] to class path
```
I've then created a Hive table on top of an existing Iceberg table by doing
the following:
```
CREATE EXTERNAL TABLE default.iceberg_table_a
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
LOCATION 'hdfs://host:port/hiveberg/table_a';
```
I can successfully run a `SELECT *` against this table, but if I add an `ORDER BY`
clause to force a MapReduce job to execute, it fails with the following error:
```
Error: java.io.IOException: java.lang.NullPointerException: Table cannot be null
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:379)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:678)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:170)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.NullPointerException: Table cannot be null
    at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
    at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.forwardConfigSettings(HiveIcebergInputFormat.java:76)
    at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.getRecordReader(HiveIcebergInputFormat.java:63)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376)
    ... 9 more
```
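For what it's worth, the shape of the failure looks like a config setting that exists on the client but never reaches the task. Here's a minimal sketch of that pattern (not Iceberg code — the property name and class are made up purely to illustrate; the real guard is the `Preconditions.checkNotNull` in `forwardConfigSettings` above):

```java
import java.util.Objects;
import java.util.Properties;

// Illustrative only: a serialized table is stored in the client-side conf,
// but the mapper reads from a conf where it was never forwarded, so the
// checkNotNull-style guard fires with "Table cannot be null".
public class MissingTableConf {
    static String readTable(Properties jobConf) {
        // mirrors the guard that throws in the stack trace above
        return Objects.requireNonNull(
                jobConf.getProperty("iceberg.mr.serialized.table"), // hypothetical key
                "Table cannot be null");
    }

    public static void main(String[] args) {
        Properties clientConf = new Properties();
        clientConf.setProperty("iceberg.mr.serialized.table", "serialized-table-json");
        System.out.println(readTable(clientConf)); // fine on the client side

        Properties mapperConf = new Properties(); // settings not forwarded to the task
        try {
            readTable(mapperConf);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```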
I haven't had time to look into this in depth, but I know we had it working
in Hiveberg, so something in the new InputFormat is failing. I tried removing
the null checks and adding "if not null" guards in the code below them, but
the NPE then just moves further down:
```
Caused by: java.lang.NullPointerException
    at java.util.Objects.requireNonNull(Objects.java:203)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2296)
    at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:111)
    at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:54)
    at org.apache.iceberg.SchemaParser.fromJson(SchemaParser.java:247)
    at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:181)
    at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.<init>(MapredIcebergInputFormat.java:92)
    at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat.getRecordReader(MapredIcebergInputFormat.java:78)
    at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.getRecordReader(HiveIcebergInputFormat.java:64)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376)
    ... 26 more
```
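This second trace looks like the same root cause one layer deeper: the schema JSON pulled from the config is null, and the cache that `SchemaParser.fromJson` sits on rejects null keys via `Objects.requireNonNull`. A minimal illustration of that behaviour, using a JDK map in place of Caffeine:

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: computeIfAbsent with a null key throws an NPE from
// inside the cache machinery, matching the shape of the trace above.
public class NullCacheKey {
    static final ConcurrentHashMap<String, Object> CACHE = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        String schemaJson = null; // what a missing config entry yields
        try {
            CACHE.computeIfAbsent(schemaJson, json -> new Object());
        } catch (NullPointerException e) {
            System.out.println("NPE on null cache key, as in the trace");
        }
    }
}
```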
You said you had this working via a Hive client; what have I done
differently that I'm running into all these exceptions?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.