lk-1984 opened a new issue, #12648:
URL: https://github.com/apache/iceberg/issues/12648
### Apache Iceberg version
1.8.1 (latest release)
### Query engine
None
### Please describe the bug 🐞
When I create the Iceberg table with Trino and then deploy the Iceberg
Kafka Connect sink, it reads the partitions of a Kafka topic, fetches the
corresponding Avro schema from the Schema Registry, deserialises the data,
and inserts it into the Iceberg table.
But when I enable `iceberg.tables.auto-create-enabled`, things go wrong.
The Iceberg Kafka Connect sink then talks to the Hive metastore over Thrift
and tries to create the table. However, it attempts to write the table
metadata into the local file system of the Hive metastore, and obviously
there are no write permissions at the root of the Hive metastore's file
system.
The path being used is `/user/hive/warehouse/...`, which is the old
default Hadoop FS warehouse location for Hive tables, not Iceberg tables.
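For context, that path is Hive's shipped default for `hive.metastore.warehouse.dir`; the fragment below is the stock `hive-site.xml` default, not my actual configuration:

```xml
<!-- Hive's shipped default. When a catalog has no explicit warehouse
     location configured, table paths are derived from this directory. -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
```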
I have tested this with a very simple Avro schema that has just a single
field. When I create the Iceberg table in advance, everything works: metadata
and data are written into the object storage. When I enable
`iceberg.tables.auto-create-enabled`, it goes wrong.
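For reference, a minimal sketch of the connector configuration that triggers this; the topic, table, and metastore URI below are placeholders, not my actual values:

```json
{
  "connector.class": "org.apache.iceberg.connect.IcebergSinkConnector",
  "topics": "foobar",
  "iceberg.tables": "default.foobar",
  "iceberg.tables.auto-create-enabled": "true",
  "iceberg.catalog.type": "hive",
  "iceberg.catalog.uri": "thrift://hive-metastore:9083"
}
```

My guess is that when `iceberg.catalog.warehouse` is left unset, the Hive catalog falls back to the metastore's default warehouse directory when auto-creating the table, which would explain the `/user/hive/warehouse/...` path in the stack trace.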
Below is an example of such an error from the Iceberg Kafka Connect sink.
Note that the actual error comes via the Thrift protocol from the Hive
metastore.
Somehow that flag affects where the table metadata is being written.
```
2025-03-25 19:45:19 [2025-03-25 17:45:19,138] ERROR
[iceberg-sink-connector-foobar|task-2]
WorkerSinkTask{id=iceberg-sink-connector-foobar-2} Task threw an uncaught and
unrecoverable exception. Task is being killed and will not recover until
manually restarted (org.apache.kafka.connect.runtime.WorkerTask:234)
2025-03-25 19:45:19 org.apache.kafka.connect.errors.ConnectException:
Exiting WorkerSinkTask due to unrecoverable exception.
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:636)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:345)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:247)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:216)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:226)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:281)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:238)
2025-03-25 19:45:19 at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
2025-03-25 19:45:19 at
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
2025-03-25 19:45:19 at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
2025-03-25 19:45:19 at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
2025-03-25 19:45:19 at java.base/java.lang.Thread.run(Unknown Source)
2025-03-25 19:45:19 Caused by:
org.apache.iceberg.exceptions.RuntimeIOException: Failed to create file:
file:/user/hive/warehouse/foobar/metadata/00000-31ce7c33-c1fd-425f-a7e4-f90c8e829f96.metadata.json
2025-03-25 19:45:19 at
org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:87)
2025-03-25 19:45:19 at
org.apache.iceberg.TableMetadataParser.internalWrite(TableMetadataParser.java:125)
2025-03-25 19:45:19 at
org.apache.iceberg.TableMetadataParser.overwrite(TableMetadataParser.java:115)
2025-03-25 19:45:19 at
org.apache.iceberg.BaseMetastoreTableOperations.writeNewMetadata(BaseMetastoreTableOperations.java:160)
2025-03-25 19:45:19 at
org.apache.iceberg.BaseMetastoreTableOperations.writeNewMetadataIfRequired(BaseMetastoreTableOperations.java:150)
2025-03-25 19:45:19 at
org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:174)
2025-03-25 19:45:19 at
org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:125)
2025-03-25 19:45:19 at
org.apache.iceberg.BaseMetastoreCatalog$BaseMetastoreCatalogTableBuilder.create(BaseMetastoreCatalog.java:201)
2025-03-25 19:45:19 at
org.apache.iceberg.hive.HiveCatalog$ViewAwareTableBuilder.create(HiveCatalog.java:858)
2025-03-25 19:45:19 at
org.apache.iceberg.catalog.Catalog.createTable(Catalog.java:75)
2025-03-25 19:45:19 at
org.apache.iceberg.catalog.Catalog.createTable(Catalog.java:93)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.IcebergWriterFactory.lambda$autoCreateTable$0(IcebergWriterFactory.java:112)
2025-03-25 19:45:19 at
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
2025-03-25 19:45:19 at
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
2025-03-25 19:45:19 at
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
2025-03-25 19:45:19 at
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.IcebergWriterFactory.autoCreateTable(IcebergWriterFactory.java:108)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.IcebergWriterFactory.createWriter(IcebergWriterFactory.java:62)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.SinkWriter.lambda$writerForTable$3(SinkWriter.java:139)
2025-03-25 19:45:19 at
java.base/java.util.HashMap.computeIfAbsent(Unknown Source)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.SinkWriter.writerForTable(SinkWriter.java:138)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.SinkWriter.lambda$routeRecordStatically$1(SinkWriter.java:98)
2025-03-25 19:45:19 at
java.base/java.util.Arrays$ArrayList.forEach(Unknown Source)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.SinkWriter.routeRecordStatically(SinkWriter.java:96)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:85)
2025-03-25 19:45:19 at java.base/java.util.ArrayList.forEach(Unknown
Source)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:68)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.channel.Worker.save(Worker.java:124)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.channel.CommitterImpl.save(CommitterImpl.java:88)
2025-03-25 19:45:19 at
org.apache.iceberg.connect.IcebergSinkTask.put(IcebergSinkTask.java:87)
2025-03-25 19:45:19 at
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:606)
2025-03-25 19:45:19 ... 11 more
2025-03-25 19:45:19 Caused by: java.io.IOException: Mkdirs failed to create
file:/user/hive/warehouse/foobar/metadata (exists=false, cwd=file:/home/appuser)
2025-03-25 19:45:19 at
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:715)
2025-03-25 19:45:19 at
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:700)
2025-03-25 19:45:19 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
2025-03-25 19:45:19 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1210)
2025-03-25 19:45:19 at
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1091)
2025-03-25 19:45:19 at
org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:85)
2025-03-25 19:45:19 ... 41 more
```
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]