[
https://issues.apache.org/jira/browse/HIVE-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated HIVE-28759:
--------------------------------
Description:
On HS2 shutdown, the query history service attempts to write the leftover
records, but by that time shutdown() has already been called on the Iceberg
worker pools, so they no longer accept new tasks and the commit fails.
The Iceberg worker pools (used by the
[SnapshotProducer|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java])
are wrapped as exiting executor services:
https://github.com/apache/iceberg/blob/8839c9bf1f1d8c9b718f9766302ff8a2018e515f/core/src/main/java/org/apache/iceberg/util/ThreadPools.java#L133
An exiting executor service registers a JVM shutdown hook like this:
{code}
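// Decompiled listing; the anonymous Runnable takes no constructor argument in source form.
final void addDelayedShutdownHook(final ExecutorService service,
    final long terminationTimeout, final TimeUnit timeUnit) {
  Preconditions.checkNotNull(service);
  Preconditions.checkNotNull(timeUnit);
  // Registers a JVM shutdown hook thread for this executor service.
  this.addShutdownHook(MoreExecutors.newThread("DelayedShutdownHook-for-" + service,
      new Runnable() {
        public void run() {
          try {
            // After shutdown(), the pool rejects every newly submitted task.
            service.shutdown();
            service.awaitTermination(terminationTimeout, timeUnit);
          } catch (InterruptedException var2) {
            // The JVM is exiting anyway, so the interruption is ignored.
          }
        }
      }));
}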
{code}
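The effect of that hook can be reproduced with plain java.util.concurrent, without Hive or Iceberg. The following is only a minimal sketch: the class name ShutdownRejectionDemo and the method rejectsAfterShutdown are hypothetical, and a fixed thread pool stands in for the Iceberg worker pool. It makes the same two calls as the hook and then submits a task, the way the late flush does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; a fixed thread pool stands in for the Iceberg worker pool.
class ShutdownRejectionDemo {

    // Returns true if a pool that was already shut down rejects a new task.
    static boolean rejectsAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // The same two calls the delayed shutdown hook makes.
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            // Stands in for the commit work submitted during the late flush.
            pool.submit(() -> System.out.println("late flush"));
            return false;
        } catch (RejectedExecutionException e) {
            // The default AbortPolicy throws once the pool is shut down.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + rejectsAfterShutdown());
    }
}
```

In the failing scenario the submit call is made inside Iceberg's Tasks utility rather than in Hive code, which is why the rejection surfaces as a commit error.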
This is something Hive cannot control, so another workaround is needed. That
won't be easy, because Iceberg does everything through its Tasks utility, which
uses these pools.
{code}
hiveserver2 <11>1 2025-02-12T10:03:06.020Z hiveserver2-0 hiveserver2 1 d96973ce-8fc5-4cd3-8a60-a68352116ae1 [mdc@38374 class="server.HiveServer2" level="ERROR" thread="shutdown-hook-0"] Error stopping queryHistoryService
java.util.concurrent.CompletionException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1739)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
	at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService$QueryHistoryThread.run(QueryHistoryService.java:107)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
	at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:137)
	at org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService.lambda$doFlush$1(QueryHistoryService.java:205)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
	... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job: job_1739354584980_0000 for table: sys.query_history
	at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:841)
	at org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:134)
	... 6 more
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1b1377cb[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@27c48ae8[Wrapped task = org.apache.iceberg.util.Tasks$Builder$1@34a69dc9]] rejected from java.util.concurrent.ThreadPoolExecutor@7cf253ca[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
	at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
	at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
	at java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
	at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:307)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:201)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
	at org.apache.iceberg.ManifestMergeManager.mergeGroup(ManifestMergeManager.java:134)
	at org.apache.iceberg.ManifestMergeManager.mergeManifests(ManifestMergeManager.java:83)
	at org.apache.iceberg.MergingSnapshotProducer.apply(MergingSnapshotProducer.java:853)
	at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:234)
	at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:384)
	at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
	at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
	at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:382)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitWrite(HiveIcebergOutputCommitter.java:560)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:494)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJobs$4(HiveIcebergOutputCommitter.java:292)
	at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
	at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
	at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJobs(HiveIcebergOutputCommitter.java:286)
	at org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:828)
	... 7 more
{code}
was:
on HS2 shutdown, the service attempts to write the leftover records, but by
that time, shutdown() was called on the iceberg worker pools, so they don't
accept new tasks, hence commit fails
iceberg worker pools (used by the
[SnapshotProducer|https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotProducer.java])
are wrapped as exiting executor services
https://github.com/apache/iceberg/blob/8839c9bf1f1d8c9b718f9766302ff8a2018e515f/core/src/main/java/org/apache/iceberg/util/ThreadPools.java#L133
exiting executor service means:
{code}
final void addDelayedShutdownHook(final ExecutorService service, final long
terminationTimeout, final TimeUnit timeUnit) {
Preconditions.checkNotNull(service);
Preconditions.checkNotNull(timeUnit);
this.addShutdownHook(MoreExecutors.newThread("DelayedShutdownHook-for-" +
service, new Runnable(this) {
public void run() {
try {
service.shutdown();
service.awaitTermination(terminationTimeout, timeUnit);
} catch (InterruptedException var2) {
}
}
}));
}
{code}
this is something hive cannot control, so another workaround needed
{code}
hiveserver2 <11>1 2025-02-12T10:03:06.020Z hiveserver2-0 hiveserver2 1
d96973ce-8fc5-4cd3-8a60-a68352116ae1 [mdc@38374 class="server.HiveServer2"
level="ERROR" thread="shutdown-hook-0"] Error stopping queryHistoryService
java.util.concurrent.CompletionException: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job:
job_1739354584980_0000 for table: sys.query_history
at
java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
at
java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
at
java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1739)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
at
org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService$QueryHistoryThread.run(QueryHistoryService.java:107)
Caused by: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Error committing job:
job_1739354584980_0000 for table: sys.query_history
at
org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:137)
at
org.apache.hadoop.hive.ql.queryhistory.QueryHistoryService.lambda$doFlush$1(QueryHistoryService.java:205)
at
java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error committing
job: job_1739354584980_0000 for table: sys.query_history
at
org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:841)
at
org.apache.hadoop.hive.ql.queryhistory.repository.IcebergRepository.flush(IcebergRepository.java:134)
... 6 more
Caused by: java.util.concurrent.RejectedExecutionException: Task
java.util.concurrent.FutureTask@1b1377cb[Not completed, task =
java.util.concurrent.Executors$RunnableAdapter@27c48ae8[Wrapped task =
org.apache.iceberg.util.Tasks$Builder$1@34a69dc9]] rejected from
java.util.concurrent.ThreadPoolExecutor@7cf253ca[Terminated, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 0]
at
java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
at
java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
at
java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
at
java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
at
java.base/java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:714)
at org.apache.iceberg.util.Tasks$Builder.runParallel(Tasks.java:307)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:201)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
at
org.apache.iceberg.ManifestMergeManager.mergeGroup(ManifestMergeManager.java:134)
at
org.apache.iceberg.ManifestMergeManager.mergeManifests(ManifestMergeManager.java:83)
at
org.apache.iceberg.MergingSnapshotProducer.apply(MergingSnapshotProducer.java:853)
at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:234)
at
org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:384)
at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:382)
at
org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitWrite(HiveIcebergOutputCommitter.java:560)
at
org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:494)
at
org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJobs$4(HiveIcebergOutputCommitter.java:292)
at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
at
org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJobs(HiveIcebergOutputCommitter.java:286)
at
org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.storageHandlerCommit(HiveIcebergStorageHandler.java:828)
... 7 more
{code}
> Hive Query History - records are failed to be written due to iceberg worker
> pools shut down
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-28759
> URL: https://issues.apache.org/jira/browse/HIVE-28759
> Project: Hive
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
>