NeQuissimus opened a new issue, #7034: URL: https://github.com/apache/iceberg/issues/7034
### Apache Iceberg version

1.1.0 (latest release)

### Query engine

Hive

### Please describe the bug 🐞

# Context

For context, this conversation started on Slack but we decided to move it here for better visibility. cc @SinghAsDev

I believe this to be related to https://github.com/apache/iceberg/pull/5036

# Code

We have code using the API directly, no Spark, no Trickle etc. The code looks a little something like this and is driven by a Kafka consumer:

```scala
// This is an object we keep around
def catalog: HiveCatalog

// for each message from Kafka, do the following
val table: Table = catalog.loadTable()
val transaction: Transaction = table.newTransaction()
val append = transaction.newFastAppend()
// add all data files from the Kafka message to `append`
...
transaction.commitTransaction()
```

# Observations

With Iceberg 0.14 (and 1.0.0, but we have not tested this extensively; enough to say the issue is not present there), we have a pretty steady state of threads running. Fetching a heap dump gives us maybe 150 threads across everything inside our application.

Once we update to Iceberg 1.1.0, we not only see the number of threads steadily increasing but also increasing with no obvious bound. (After a few hours, we see about 30,000 extra threads :D)

All of these threads are named `iceberg-hive-lock-heartbeat-0`, which is why I was looking at #5036 immediately and it is also a new change in Iceberg 1.1.0.

My understanding is that the `Transaction` essentially relates back to `HiveTableOperations.doCommit`. I do not see anything in `HiveTableOperations` shutting down the scheduler for the Hive pings. But I am not sure whether that would even be necessary. There are no new `close()` methods I could find on any of the objects we create either.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
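The thread growth described under Observations can also be watched from inside the JVM, without taking a dump. Below is a minimal sketch, not Iceberg code: `HeartbeatThreadCheck`, `countThreads`, and `startHeartbeat` are hypothetical names introduced for illustration. `startHeartbeat` only stands in for the suspected pattern (one scheduler created per commit and never shut down) and assumes nothing about how `HiveTableOperations` is actually implemented. Only the thread name prefix comes from the report above.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit}

object HeartbeatThreadCheck {
  // Thread name prefix taken from the issue report.
  val HeartbeatPrefix = "iceberg-hive-lock-heartbeat"

  /** Counts live JVM threads whose name starts with `prefix`. */
  def countThreads(prefix: String): Int =
    Thread.getAllStackTraces.keySet
      .toArray(new Array[Thread](0))
      .count(t => t.isAlive && t.getName.startsWith(prefix))

  /** Hypothetical stand-in for a per-commit heartbeat scheduler (NOT Iceberg's code). */
  def startHeartbeat(): ScheduledExecutorService = {
    val factory = new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, HeartbeatPrefix + "-0")
        t.setDaemon(true) // daemon so this demo never blocks JVM exit
        t
      }
    }
    val pool = Executors.newSingleThreadScheduledExecutor(factory)
    // A periodic no-op task; scheduling it starts the worker thread.
    val noop = new Runnable { override def run(): Unit = () }
    pool.scheduleAtFixedRate(noop, 0L, 1L, TimeUnit.SECONDS)
    pool
  }

  def main(args: Array[String]): Unit = {
    // Simulate five "commits" that each create a scheduler and never close it.
    val pools = (1 to 5).map(_ => startHeartbeat())
    println(s"heartbeat threads while schedulers are open: ${countThreads(HeartbeatPrefix)}")

    // Shutting the schedulers down lets their worker threads exit.
    pools.foreach(_.shutdownNow())
    pools.foreach(_.awaitTermination(5, TimeUnit.SECONDS))
    println(s"heartbeat threads shortly after shutdown: ${countThreads(HeartbeatPrefix)}")
  }
}
```

In a real application, only `countThreads("iceberg-hive-lock-heartbeat")` would be needed, logged once per Kafka message or exposed as a metric, to compare 1.0.0 against 1.1.0. The second count in `main` may briefly be non-zero, since a worker thread can still be exiting just after its executor reports termination.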
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
