[ https://issues.apache.org/jira/browse/HIVE-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-28759:
--------------------------------
Fix Version/s: 4.1.0
> Hive Query History - records are failed to be written due to iceberg worker
> pools shut down
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-28759
> URL: https://issues.apache.org/jira/browse/HIVE-28759
> Project: Hive
> Issue Type: Sub-task
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> On HS2 shutdown, the query history service attempts to write the leftover
> records, but by that time shutdown() has already been called on the Iceberg
> worker pools, so they no longer accept new tasks and the commit fails.
> The Iceberg worker pools (used by the
> [SnapshotProducer|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java])
> are wrapped as exiting executor services:
> https://github.com/apache/iceberg/blob/8839c9bf1f1d8c9b718f9766302ff8a2018e515f/core/src/main/java/org/apache/iceberg/util/ThreadPools.java#L133
> An exiting executor service registers the following shutdown hook (this is
> Guava's MoreExecutors, which is also shaded in Iceberg):
> {code}
> final void addDelayedShutdownHook(
>     final ExecutorService service, final long terminationTimeout, final TimeUnit timeUnit) {
>   Preconditions.checkNotNull(service);
>   Preconditions.checkNotNull(timeUnit);
>   addShutdownHook(MoreExecutors.newThread("DelayedShutdownHook-for-" + service, new Runnable() {
>     @Override
>     public void run() {
>       try {
>         // After shutdown() the pool rejects every newly submitted task.
>         service.shutdown();
>         service.awaitTermination(terminationTimeout, timeUnit);
>       } catch (InterruptedException ignored) {
>         // The JVM is shutting down anyway, so the interruption is ignored.
>       }
>     }
>   }));
> }
> {code}
> Hive cannot control this hook, so a workaround is needed on the Hive side.
> That won't be easy, because Iceberg does everything through its Tasks
> utility, which uses these pools.
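The failure mode described above can be reproduced in isolation. The following is a minimal sketch using only java.util.concurrent, not Hive or Iceberg code; the class and method names are made up for illustration. It makes the same two calls as the delayed shutdown hook and then submits a task, as the late flush does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

public class ShutdownRejectionDemo {

    /** Returns true if the pool rejects a task submitted after shutdown(). */
    static boolean rejectsAfterShutdown() {
        // Stand-in for an Iceberg worker pool (hypothetical, not the real one).
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // The same two calls the delayed shutdown hook makes.
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        try {
            // Stand-in for the commit work submitted during the late flush.
            pool.submit(() -> System.out.println("late flush"));
            return false;
        } catch (RejectedExecutionException e) {
            // The default AbortPolicy throws, as seen in the stack trace.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + rejectsAfterShutdown());
    }
}
```

Which of the two JVM shutdown hooks runs first is not guaranteed, so in the real service the rejection only occurs when the pool's hook wins the race.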
> {code}
> hiveserver2 <11>1 2025-02-12T10:03:06.020Z hiveserver2-0 hiveserver2 1 d96973ce-8fc5-4cd3-8a60-a68352116ae1 [mdc@38374 class="server.HiveServer2" level="ERROR" thread="shutdown-hook-0"] Error stopping queryHistoryService
> java.util.concurrent.CompletionException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
>     at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1739)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
>     at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService$QueryHistoryThread.run(QueryHistoryService.java:107)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
>     at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:137)
>     at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService.lambda$doFlush$1(QueryHistoryService.java:205)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
>     ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
>     at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:841)
>     at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:134)
>     ... 6 more
> Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1b1377cb[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@27c48ae8[Wrapped task = org.apache.iceberg.util.Tasks$Builder$1@34a69dc9]] rejected from java.util.concurrent.ThreadPoolExecutor@7cf253ca[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
>     at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
>     at java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
>     at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:307)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:201)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
>     at org.apache.iceberg.ManifestMergeManager.mergeGroup(ManifestMergeManager.java:134)
>     at org.apache.iceberg.ManifestMergeManager.mergeManifests(ManifestMergeManager.java:83)
>     at org.apache.iceberg.MergingSnapshotProducer.apply(MergingSnapshotProducer.java:853)
>     at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:234)
>     at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:384)
>     at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
>     at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
>     at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:382)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitWrite(HiveIcebergOutputCommitter.java:560)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:494)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJobs$4(HiveIcebergOutputCommitter.java:292)
>     at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
>     at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJobs(HiveIcebergOutputCommitter.java:286)
>     at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:828)
>     ... 7 more
> {code}
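The innermost cause in the trace is thrown by ThreadPoolExecutor$AbortPolicy, the default rejection policy. For a pool that the caller constructs itself, a custom RejectedExecutionHandler can run a rejected task on the submitting thread instead. The sketch below illustrates that general pattern only: it is not the fix adopted for this issue, it does not apply directly to the Iceberg pools (which Hive does not construct), and the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RunInlineOnRejectDemo {

    // Runs a rejected task on the submitting thread, even after shutdown().
    // (The built-in CallerRunsPolicy discards tasks once the pool is shut down.)
    static final RejectedExecutionHandler RUN_INLINE = (task, executor) -> task.run();

    /** Submits a task to an already shut-down pool and returns its result. */
    static String submitAfterShutdown() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), RUN_INLINE);
        pool.shutdown();

        // With the default AbortPolicy this call would throw
        // RejectedExecutionException; here the task runs inline instead.
        Future<String> result =
            pool.submit(() -> "flushed on " + Thread.currentThread().getName());
        try {
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdown());
    }
}
```

The trade-off is that work submitted after shutdown loses its parallelism and blocks the submitting thread, which may be acceptable for a final flush but not for regular traffic.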