lk-1984 opened a new issue, #12648:
URL: https://github.com/apache/iceberg/issues/12648

   ### Apache Iceberg version
   
   1.8.1 (latest release)
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   When I create the Iceberg table with Trino in advance and then deploy the Iceberg 
Kafka Connect sink, everything works: the sink reads the partitions of the Kafka 
topic, fetches the corresponding Avro schema from the Schema Registry, deserialises 
the data, and inserts it into the Iceberg table.
   
   But when I enable "iceberg.tables.auto-create-enabled" things go wrong.
   
   The Iceberg Kafka Connect sink then communicates with the Hive metastore over 
Thrift and tries to create the table. But it tries to write the table metadata into 
the local file system of the Hive metastore, and obviously there are no write 
permissions at the root of the Hive metastore's file system.
   
   The path that is being used is "/user/hive/warehouse/...", which is the old 
default Hadoop FS warehouse location for Hive tables, not Iceberg tables.
   
   I have tested this with a very simple Avro schema that has just a single field. 
When I create the Iceberg table in advance, everything works: metadata and data are 
written into the object storage. When I enable 
"iceberg.tables.auto-create-enabled", it goes wrong.
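
   For reference, a minimal connector configuration along these lines reproduces the 
problem. This is an illustrative sketch: the topic, table, and host names are 
placeholders, not my actual values.

   ```json
   {
     "name": "iceberg-sink-connector-foobar",
     "config": {
       "connector.class": "org.apache.iceberg.connect.IcebergSinkConnector",
       "topics": "foobar",
       "iceberg.tables": "default.foobar",
       "iceberg.tables.auto-create-enabled": "true",
       "iceberg.catalog.type": "hive",
       "iceberg.catalog.uri": "thrift://hive-metastore:9083",
       "value.converter": "io.confluent.connect.avro.AvroConverter",
       "value.converter.schema.registry.url": "http://schema-registry:8081"
     }
   }
   ```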
   
   Below is an example of such an error from the Iceberg Kafka Connect sink. Note 
that the actual error comes through the Thrift protocol, from the Hive metastore.
   
   Somehow that flag now affects where the Hive metastore tries to write the 
metadata.
   
   ```
   2025-03-25 19:45:19 [2025-03-25 17:45:19,138] ERROR 
[iceberg-sink-connector-foobar|task-2] 
WorkerSinkTask{id=iceberg-sink-connector-foobar-2} Task threw an uncaught and 
unrecoverable exception. Task is being killed and will not recover until 
manually restarted (org.apache.kafka.connect.runtime.WorkerTask:234)
   2025-03-25 19:45:19 org.apache.kafka.connect.errors.ConnectException: 
Exiting WorkerSinkTask due to unrecoverable exception.
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:636)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:345)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:247)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:216)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:226)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:281)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:238)
   2025-03-25 19:45:19     at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
   2025-03-25 19:45:19     at 
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
   2025-03-25 19:45:19     at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
   2025-03-25 19:45:19     at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   2025-03-25 19:45:19     at java.base/java.lang.Thread.run(Unknown Source)
   2025-03-25 19:45:19 Caused by: 
org.apache.iceberg.exceptions.RuntimeIOException: Failed to create file: 
file:/user/hive/warehouse/foobar/metadata/00000-31ce7c33-c1fd-425f-a7e4-f90c8e829f96.metadata.json
   2025-03-25 19:45:19     at 
org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:87)
   2025-03-25 19:45:19     at 
org.apache.iceberg.TableMetadataParser.internalWrite(TableMetadataParser.java:125)
   2025-03-25 19:45:19     at 
org.apache.iceberg.TableMetadataParser.overwrite(TableMetadataParser.java:115)
   2025-03-25 19:45:19     at 
org.apache.iceberg.BaseMetastoreTableOperations.writeNewMetadata(BaseMetastoreTableOperations.java:160)
   2025-03-25 19:45:19     at 
org.apache.iceberg.BaseMetastoreTableOperations.writeNewMetadataIfRequired(BaseMetastoreTableOperations.java:150)
   2025-03-25 19:45:19     at 
org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:174)
   2025-03-25 19:45:19     at 
org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:125)
   2025-03-25 19:45:19     at 
org.apache.iceberg.BaseMetastoreCatalog$BaseMetastoreCatalogTableBuilder.create(BaseMetastoreCatalog.java:201)
   2025-03-25 19:45:19     at 
org.apache.iceberg.hive.HiveCatalog$ViewAwareTableBuilder.create(HiveCatalog.java:858)
   2025-03-25 19:45:19     at 
org.apache.iceberg.catalog.Catalog.createTable(Catalog.java:75)
   2025-03-25 19:45:19     at 
org.apache.iceberg.catalog.Catalog.createTable(Catalog.java:93)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.IcebergWriterFactory.lambda$autoCreateTable$0(IcebergWriterFactory.java:112)
   2025-03-25 19:45:19     at 
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
   2025-03-25 19:45:19     at 
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
   2025-03-25 19:45:19     at 
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
   2025-03-25 19:45:19     at 
org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.IcebergWriterFactory.autoCreateTable(IcebergWriterFactory.java:108)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.IcebergWriterFactory.createWriter(IcebergWriterFactory.java:62)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.SinkWriter.lambda$writerForTable$3(SinkWriter.java:139)
   2025-03-25 19:45:19     at 
java.base/java.util.HashMap.computeIfAbsent(Unknown Source)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.SinkWriter.writerForTable(SinkWriter.java:138)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.SinkWriter.lambda$routeRecordStatically$1(SinkWriter.java:98)
   2025-03-25 19:45:19     at 
java.base/java.util.Arrays$ArrayList.forEach(Unknown Source)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.SinkWriter.routeRecordStatically(SinkWriter.java:96)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:85)
   2025-03-25 19:45:19     at java.base/java.util.ArrayList.forEach(Unknown 
Source)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.data.SinkWriter.save(SinkWriter.java:68)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.channel.Worker.save(Worker.java:124)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.channel.CommitterImpl.save(CommitterImpl.java:88)
   2025-03-25 19:45:19     at 
org.apache.iceberg.connect.IcebergSinkTask.put(IcebergSinkTask.java:87)
   2025-03-25 19:45:19     at 
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:606)
   2025-03-25 19:45:19     ... 11 more
   2025-03-25 19:45:19 Caused by: java.io.IOException: Mkdirs failed to create 
file:/user/hive/warehouse/foobar/metadata (exists=false, cwd=file:/home/appuser)
   2025-03-25 19:45:19     at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:715)
   2025-03-25 19:45:19     at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:700)
   2025-03-25 19:45:19     at 
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
   2025-03-25 19:45:19     at 
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1210)
   2025-03-25 19:45:19     at 
org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1091)
   2025-03-25 19:45:19     at 
org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:85)
   2025-03-25 19:45:19     ... 41 more
   ```
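
   As a possible workaround (this is an assumption on my part, not something I have 
verified): the "file:/user/hive/warehouse" path suggests the warehouse location is 
falling back to the old Hive default and being resolved against the local file 
system. Properties prefixed with "iceberg.catalog." are passed through to the 
catalog, so pointing the warehouse explicitly at the object storage, for example:

   ```json
   {
     "iceberg.catalog.warehouse": "s3a://my-bucket/warehouse"
   }
   ```

   might make auto-created tables land in the same location as the pre-created 
ones, depending on whether the Hive database already has its own location set.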
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

