[ 
https://issues.apache.org/jira/browse/HIVE-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-28759:
--------------------------------
    Description: 
On HS2 shutdown, the query history service attempts to write the leftover 
records, but by that time shutdown() has already been called on the Iceberg 
worker pools, so they no longer accept new tasks and the commit fails.

Iceberg worker pools (used by the 
[SnapshotProducer|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java])
 are wrapped as exiting executor services:
https://github.com/apache/iceberg/blob/8839c9bf1f1d8c9b718f9766302ff8a2018e515f/core/src/main/java/org/apache/iceberg/util/ThreadPools.java#L133

An exiting executor service registers a JVM shutdown hook like this:
{code}

    // Decompiled from Guava's MoreExecutors: registers a JVM shutdown hook for the pool.
    final void addDelayedShutdownHook(final ExecutorService service,
        final long terminationTimeout, final TimeUnit timeUnit) {
      Preconditions.checkNotNull(service);
      Preconditions.checkNotNull(timeUnit);
      this.addShutdownHook(MoreExecutors.newThread("DelayedShutdownHook-for-" + service,
          new Runnable() {
            public void run() {
              try {
                // From this point on the pool rejects every newly submitted task.
                service.shutdown();
                service.awaitTermination(terminationTimeout, timeUnit);
              } catch (InterruptedException ignored) {
                // The JVM is shutting down anyway, so the interrupt is ignored.
              }
            }
          }));
    }

{code}
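
To make the failure mode concrete, here is a minimal, JDK-only sketch of the sequence described above. It is not Hive or Iceberg code: ExitingPoolDemo, runDelayedShutdownHook and tryFlush are hypothetical names, and the hook body is invoked directly instead of being registered with the JVM, so the ordering is deterministic.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

class ExitingPoolDemo {

  // Mirrors the body of the delayed shutdown hook: stop accepting tasks, then wait.
  static void runDelayedShutdownHook(ExecutorService service, long timeout, TimeUnit unit) {
    try {
      service.shutdown();
      service.awaitTermination(timeout, unit);
    } catch (InterruptedException ignored) {
      Thread.currentThread().interrupt();
    }
  }

  // Hypothetical stand-in for the late flush: returns true if the pool accepted the task.
  static boolean tryFlush(ExecutorService workerPool) {
    try {
      workerPool.submit(() -> System.out.println("flushed leftover records"));
      return true;
    } catch (RejectedExecutionException e) {
      // Same exception type as the root cause in the HS2 stack trace.
      System.out.println("flush rejected: " + e.getClass().getSimpleName());
      return false;
    }
  }

  public static void main(String[] args) {
    ExecutorService workerPool = Executors.newFixedThreadPool(2);
    boolean before = tryFlush(workerPool);
    runDelayedShutdownHook(workerPool, 1, TimeUnit.SECONDS);
    boolean after = tryFlush(workerPool);
    System.out.println("accepted before hook: " + before + ", after hook: " + after);
  }
}
```

In a real JVM the shutdown hooks are started in no guaranteed order, so a flush that runs from another shutdown hook can easily arrive after the pool has been shut down.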

This is something Hive cannot control, so another workaround is needed. That 
won't be easy, because Iceberg does everything through its Tasks utility, which 
uses these pools. The failure in the HS2 log:
{code}
hiveserver2 <11>1 2025-02-12T10:03:06.020Z hiveserver2-0 hiveserver2 1 d96973ce-8fc5-4cd3-8a60-a68352116ae1 [mdc@38374 class="server.HiveServer2" level="ERROR" thread="shutdown-hook-0"] Error stopping queryHistoryService
java.util.concurrent.CompletionException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1739)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService$QueryHistoryThread.run(QueryHistoryService.java:107)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
    at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:137)
    at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService.lambda$doFlush$1(QueryHistoryService.java:205)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
    ... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
    at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:841)
    at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:134)
    ... 6 more
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1b1377cb[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@27c48ae8[Wrapped task = org.apache.iceberg.util.Tasks$Builder$1@34a69dc9]] rejected from java.util.concurrent.ThreadPoolExecutor@7cf253ca[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
    at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
    at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
    at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
    at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
    at java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
    at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:307)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:201)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.ManifestMergeManager.mergeGroup(ManifestMergeManager.java:134)
    at org.apache.iceberg.ManifestMergeManager.mergeManifests(ManifestMergeManager.java:83)
    at org.apache.iceberg.MergingSnapshotProducer.apply(MergingSnapshotProducer.java:853)
    at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:234)
    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:384)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:382)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitWrite(HiveIcebergOutputCommitter.java:560)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:494)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJobs$4(HiveIcebergOutputCommitter.java:292)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJobs(HiveIcebergOutputCommitter.java:286)
    at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:828)
    ... 7 more

{code}
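
As an illustration of the kind of workaround hinted at above, here is a minimal sketch of an ExecutorService wrapper that falls back to running a task in the submitting thread when the underlying pool rejects it. CallerRunsFallbackExecutor and ranInCallerThread are hypothetical names, and this is not Hive or Iceberg code. Whether such a wrapper can be injected into the Iceberg commit path shown in the stack trace is an open assumption.

```java
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsFallbackExecutor extends AbstractExecutorService {
  private final ExecutorService delegate;

  CallerRunsFallbackExecutor(ExecutorService delegate) {
    this.delegate = delegate;
  }

  @Override
  public void execute(Runnable command) {
    try {
      delegate.execute(command);
    } catch (RejectedExecutionException e) {
      // The delegate refused the task (for example, it is already shut down):
      // run it in the submitting thread instead of failing.
      command.run();
    }
  }

  // Lifecycle calls are passed straight through to the wrapped pool.
  @Override public void shutdown() { delegate.shutdown(); }
  @Override public List<Runnable> shutdownNow() { return delegate.shutdownNow(); }
  @Override public boolean isShutdown() { return delegate.isShutdown(); }
  @Override public boolean isTerminated() { return delegate.isTerminated(); }

  @Override
  public boolean awaitTermination(long timeout, TimeUnit unit) throws InterruptedException {
    return delegate.awaitTermination(timeout, unit);
  }

  // Demo: the delegate is shut down first, yet the task still runs, on the caller's thread.
  static boolean ranInCallerThread() {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    pool.shutdown();
    ExecutorService safe = new CallerRunsFallbackExecutor(pool);
    AtomicReference<Thread> ranOn = new AtomicReference<>();
    safe.execute(() -> ranOn.set(Thread.currentThread()));
    return ranOn.get() == Thread.currentThread();
  }

  public static void main(String[] args) {
    System.out.println("ran in caller thread: " + ranInCallerThread());
  }
}
```

The trade-off is that a rejected task loses parallelism and blocks the submitting thread. The fallback also triggers for any rejection by the delegate, not only one caused by shutdown.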

  was:
On HS2 shutdown, the query history service attempts to write the leftover 
records, but by that time shutdown() has already been called on the Iceberg 
worker pools, so they no longer accept new tasks and the commit fails.

Iceberg worker pools (used by the 
[SnapshotProducer|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java])
 are wrapped as exiting executor services:
https://github.com/apache/iceberg/blob/8839c9bf1f1d8c9b718f9766302ff8a2018e515f/core/src/main/java/org/apache/iceberg/util/ThreadPools.java#L133

An exiting executor service registers a JVM shutdown hook like this:
{code}

    // Decompiled from Guava's MoreExecutors: registers a JVM shutdown hook for the pool.
    final void addDelayedShutdownHook(final ExecutorService service,
        final long terminationTimeout, final TimeUnit timeUnit) {
      Preconditions.checkNotNull(service);
      Preconditions.checkNotNull(timeUnit);
      this.addShutdownHook(MoreExecutors.newThread("DelayedShutdownHook-for-" + service,
          new Runnable() {
            public void run() {
              try {
                // From this point on the pool rejects every newly submitted task.
                service.shutdown();
                service.awaitTermination(terminationTimeout, timeUnit);
              } catch (InterruptedException ignored) {
                // The JVM is shutting down anyway, so the interrupt is ignored.
              }
            }
          }));
    }

{code}

This is something Hive cannot control, so another workaround is needed.
{code}
hiveserver2 <11>1 2025-02-12T10:03:06.020Z hiveserver2-0 hiveserver2 1 d96973ce-8fc5-4cd3-8a60-a68352116ae1 [mdc@38374 class="server.HiveServer2" level="ERROR" thread="shutdown-hook-0"] Error stopping queryHistoryService
java.util.concurrent.CompletionException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1739)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService$QueryHistoryThread.run(QueryHistoryService.java:107)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
    at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:137)
    at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService.lambda$doFlush$1(QueryHistoryService.java:205)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
    ... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
    at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:841)
    at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:134)
    ... 6 more
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1b1377cb[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@27c48ae8[Wrapped task = org.apache.iceberg.util.Tasks$Builder$1@34a69dc9]] rejected from java.util.concurrent.ThreadPoolExecutor@7cf253ca[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
    at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
    at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
    at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
    at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
    at java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
    at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:307)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:201)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.ManifestMergeManager.mergeGroup(ManifestMergeManager.java:134)
    at org.apache.iceberg.ManifestMergeManager.mergeManifests(ManifestMergeManager.java:83)
    at org.apache.iceberg.MergingSnapshotProducer.apply(MergingSnapshotProducer.java:853)
    at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:234)
    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:384)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:382)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitWrite(HiveIcebergOutputCommitter.java:560)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:494)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJobs$4(HiveIcebergOutputCommitter.java:292)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJobs(HiveIcebergOutputCommitter.java:286)
    at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:828)
    ... 7 more

{code}


> Hive Query History - records are failed to be written due to iceberg worker 
> pools shut down
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-28759
>                 URL: https://issues.apache.org/jira/browse/HIVE-28759
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
