ideal opened a new issue, #7739:
URL: https://github.com/apache/iceberg/issues/7739
### Apache Iceberg version
1.2.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
With Spark 3.2.4 in standalone mode, and the table:
```
> desc extended my_test_table;
23/05/30 20:21:47 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
23/05/30 20:21:47 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
a string a
b string b
c string c
# Detailed Table Information
Database my_test_iceberg
Table my_test_table
Owner root
Created Time Tue May 30 16:28:17 CST 2023
Last Access UNKNOWN
Created By Spark 2.2 or prior
Type EXTERNAL
Provider hive
Comment my_test_table
Table Properties
[current-schema={"type":"struct","schema-id":0,"fields":[{"id":1,"name":"a","required":false,"type":"string","doc":"a"},{"id":2,"name":"b","required":false,"type":"string","doc":"b"},{"id":3,"name":"c","required":false,"type":"string","doc":"c"}]},
current-snapshot-id=1142796867698349657,
current-snapshot-summary={"spark.app.id":"app-20230530113600-0013","added-data-files":"1","added-records":"1","added-files-size":"860","changed-partition-count":"1","total-records":"24","total-files-size":"20654","total-data-files":"24","total-delete-files":"0","total-position-deletes":"0","total-equality-deletes":"0"},
current-snapshot-timestamp-ms=1685438806641,
default-partition-spec={"spec-id":0,"fields":[{"name":"c","transform":"identity","source-id":3,"field-id":1000}]},
engine.hive.enabled=true, external.table.purge=TRUE,
metadata_location=hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table/metadata/00024-665493b0-a47d-4861-8ebd-767f868f8fda.metadata.json,
previous_metadata_location=hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table/metadata/00023-87b8cb31-15ab-45cb-9b73-d8085549e2c1.metadata.json,
snapshot-count=24,
storage_handler=org.apache.iceberg.mr.hive.HiveIcebergStorageHandler,
table_type=ICEBERG, transient_lastDdlTime=1685435297,
uuid=5d212398-0457-4058-b400-936e0533fcd6]
Statistics 20654 bytes, 24 rows
Location
hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table
Serde Library org.apache.iceberg.mr.hive.HiveIcebergSerDe
InputFormat org.apache.iceberg.mr.hive.HiveIcebergInputFormat
OutputFormat org.apache.iceberg.mr.hive.HiveIcebergOutputFormat
Partition Provider Catalog
```
And running:
```
bin/spark-sql --master spark://${spark-master}:7077 \
  --conf spark.driver.host=${current-host-ip} \
  --conf spark.hive.metastore.uris=thrift://${metastore-service}:9083 \
  --conf spark.sql.catalog.hive_prod=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.hive_prod.type=hive \
  --conf spark.sql.catalog.hive_prod.warehouse=hdfs://xxxx/hivewarehouse/iceberg \
  --jars iceberg-hive-runtime-1.2.1.jar,iceberg-spark-runtime-3.2_2.12-1.2.1.jar
> insert into my_test_table values ('a1','b1','c1');
```
The insert fails with this exception:
```
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:274)
    at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:132)
    at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:105)
    at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:161)
    at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:146)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:290)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$16(FileFormatWriter.scala:229)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.NullPointerException
    at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper.<init>(TezUtil.java:105)
    at org.apache.iceberg.mr.hive.TezUtil.taskAttemptWrapper(TezUtil.java:78)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.writer(HiveIcebergOutputFormat.java:73)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.getHiveRecordWriter(HiveIcebergOutputFormat.java:58)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:286)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:271)
    ... 14 more
```
Has anyone had this problem before? Thanks.