[ 
https://issues.apache.org/jira/browse/HIVE-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-28759:
--------------------------------
    Fix Version/s: 4.1.0

> Hive Query History - records are failed to be written due to iceberg worker 
> pools shut down
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-28759
>                 URL: https://issues.apache.org/jira/browse/HIVE-28759
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> On HS2 shutdown, the query history service attempts to write the leftover 
> records, but by that time shutdown() has already been called on the Iceberg 
> worker pools. The pools no longer accept new tasks, so the commit fails.
> The Iceberg worker pools (used by the 
> [SnapshotProducer|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java])
>  are wrapped as exiting executor services:
> https://github.com/apache/iceberg/blob/8839c9bf1f1d8c9b718f9766302ff8a2018e515f/core/src/main/java/org/apache/iceberg/util/ThreadPools.java#L133
> An exiting executor service registers a JVM shutdown hook like the following 
> (this is Guava's MoreExecutors, which is also shaded in Iceberg):
> {code}
>     final void addDelayedShutdownHook(final ExecutorService service,
>         final long terminationTimeout, final TimeUnit timeUnit) {
>       Preconditions.checkNotNull(service);
>       Preconditions.checkNotNull(timeUnit);
>       // Registers a JVM shutdown hook for the wrapped pool.
>       this.addShutdownHook(MoreExecutors.newThread(
>           "DelayedShutdownHook-for-" + service, new Runnable() {
>         public void run() {
>           try {
>             // After this call the pool rejects any newly submitted task.
>             service.shutdown();
>             service.awaitTermination(terminationTimeout, timeUnit);
>           } catch (InterruptedException ignored) {
>             // The JVM is shutting down anyway, so the interrupt is ignored.
>           }
>         }
>       }));
>     }
> {code}
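The effect of the hook above can be reproduced with a plain JDK pool. The following is a minimal sketch, not Hive or Iceberg code: the class name `ShutdownRejectionDemo` and its method are made up for illustration, and the pool is an ordinary fixed thread pool rather than Iceberg's wrapped worker pool. It makes the same two calls as the hook and then tries to submit a task.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Illustrative class name; not part of Hive, Iceberg, or Guava.
final class ShutdownRejectionDemo {

  /** Returns true if a task submitted after shutdown() is rejected. */
  static boolean isRejectedAfterShutdown() throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    // The same two calls the delayed shutdown hook makes on the wrapped pool.
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.SECONDS);
    try {
      // Any late submission hits the pool's default AbortPolicy.
      pool.submit(() -> { });
      return false;
    } catch (RejectedExecutionException e) {
      return true;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("rejected after shutdown: " + isRejectedAfterShutdown());
  }
}
```

Any code that still needs the pool after the hook has run, such as a final flush during shutdown, ends up in the `catch` branch.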
> Hive cannot control this hook, so another workaround is needed. That won't 
> be easy, because Iceberg does everything through its Tasks utility, which 
> uses these pools.
> {code}
> hiveserver2 <11>1 2025-02-12T10:03:06.020Z hiveserver2-0 hiveserver2 1 d96973ce-8fc5-4cd3-8a60-a68352116ae1 [mdc@38374 class="server.HiveServer2" level="ERROR" thread="shutdown-hook-0"] Error stopping queryHistoryService
> java.util.concurrent.CompletionException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
>     at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1739)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
>     at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService$QueryHistoryThread.run(QueryHistoryService.java:107)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
>     at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:137)
>     at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService.lambda$doFlush$1(QueryHistoryService.java:205)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
>     ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
>     at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:841)
>     at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:134)
>     ... 6 more
> Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1b1377cb[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@27c48ae8[Wrapped task = org.apache.iceberg.util.Tasks$Builder$1@34a69dc9]] rejected from java.util.concurrent.ThreadPoolExecutor@7cf253ca[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
>     at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
>     at java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
>     at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:307)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:201)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
>     at org.apache.iceberg.ManifestMergeManager.mergeGroup(ManifestMergeManager.java:134)
>     at org.apache.iceberg.ManifestMergeManager.mergeManifests(ManifestMergeManager.java:83)
>     at org.apache.iceberg.MergingSnapshotProducer.apply(MergingSnapshotProducer.java:853)
>     at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:234)
>     at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:384)
>     at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
>     at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
>     at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:382)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitWrite(HiveIcebergOutputCommitter.java:560)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:494)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJobs$4(HiveIcebergOutputCommitter.java:292)
>     at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
>     at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
>     at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJobs(HiveIcebergOutputCommitter.java:286)
>     at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:828)
>     ... 7 more
> {code}
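The innermost cause in the trace is a submit call on a pool that is already in the Terminated state. One generic way to survive that, where the caller owns the submit call, is to fall back to the calling thread when the pool rejects the task. The sketch below only illustrates that pattern and is not the fix applied for this issue: `CallerRunsFallback` and `callWithFallback` are hypothetical names, and since the rejected submit in the trace happens inside Iceberg's Tasks utility, this may not be applicable there as-is.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical helper class; not part of Hive or Iceberg.
final class CallerRunsFallback {

  /** Tries the pool first and runs the task on the calling thread if the pool rejects it. */
  static <T> T callWithFallback(ExecutorService pool, Callable<T> task) throws Exception {
    Future<T> future;
    try {
      future = pool.submit(task);
    } catch (RejectedExecutionException e) {
      // The pool was already shut down: do the work on the calling thread.
      return task.call();
    }
    return future.get();
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    pool.shutdown(); // simulate the shutdown hook having run first
    String threadName = callWithFallback(pool, () -> Thread.currentThread().getName());
    System.out.println("ran on: " + threadName);
  }
}
```

The trade-off is that the fallback path loses parallelism, which matters little for a final flush of leftover records at shutdown.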



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
