[ 
https://issues.apache.org/jira/browse/HIVE-23061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063690#comment-17063690
 ] 

Prasanth Jayachandran commented on HIVE-23061:
----------------------------------------------

{quote}Actually, all LlapDaemonUncaughtExceptionHandler does is shut down the 
LLAP daemon if it receives an exception .. is that incorrect logic?
{quote}
This is expected. We cannot recover, but we want to log the exception so we 
can see why it happened and how to avoid it. I am wondering whether we need a 
catch in the onSuccess and taskCleanup notification handlers, because the 
uncaught exception handler already logs it. We seem to catch and ignore it, 
but the exception itself should not have happened in the first place. Do we 
know why there was an exception?

In particular, for "java.lang.IllegalStateException: Cannot invoke unregister 
on an entity which has not been registered":

Is there a race condition or a double-notification issue here?
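
To make the double-notification hypothesis concrete, here is a minimal Java 
sketch. StrictTracker and GuardedTask are hypothetical names invented for this 
illustration; they are not the Hive QueryInfo or TaskExecutorService code, and 
the sketch only assumes that unregister enforces a "must be registered" 
precondition like the one in the stack trace.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a tracker with a strict unregister precondition.
// This is NOT the Hive FinishableStateTracker code, only an illustration.
class StrictTracker {
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    void register(String entity) {
        registered.add(entity);
    }

    void unregister(String entity) {
        // remove() returns false when the entity was never registered,
        // or was already unregistered by an earlier notification.
        if (!registered.remove(entity)) {
            throw new IllegalStateException(
                "Cannot invoke unregister on an entity which has not been registered");
        }
    }
}

// Hypothetical wrapper that makes unregistration idempotent, so a second
// completion notification becomes a no-op instead of an uncaught exception.
class GuardedTask {
    private final StrictTracker tracker;
    private final String id;
    private boolean registered; // guarded by "this"

    GuardedTask(StrictTracker tracker, String id) {
        this.tracker = tracker;
        this.id = id;
    }

    synchronized void registerOnce() {
        if (!registered) {
            tracker.register(id);
            registered = true;
        }
    }

    // Returns true only for the call that actually unregistered the task.
    synchronized boolean unregisterOnce() {
        if (!registered) {
            return false;
        }
        tracker.unregister(id);
        registered = false;
        return true;
    }
}
```

Calling StrictTracker.unregister twice for the same entity throws on the 
second call, while GuardedTask.unregisterOnce simply returns false. A guard 
like this would only mask a double notification; whether it is appropriate 
depends on the root cause, which is still the open question.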

 

> LLAP crash due to unhandled exception: Cannot invoke unregister on an entity 
> which has not been registered
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-23061
>                 URL: https://issues.apache.org/jira/browse/HIVE-23061
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>         Attachments: HIVE-23061.1.patch
>
>
> The following exception goes uncaught and causes the entire LLAP daemon to 
> shut down:
> {noformat}
> 2020-03-17T06:49:11,304 ERROR [ExecutionCompletionThread #0 ()] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread 
> Thread[ExecutionCompletionThread #0,5,main] threw an Exception. Shutting down 
> now...
> java.lang.IllegalStateException: Cannot invoke unregister on an entity which 
> has not been registered
>     at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:508) 
> ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.unregisterForUpdates(QueryInfo.java:256)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.QueryInfo.unregisterFinishableStateUpdate(QueryInfo.java:209)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.QueryFragmentInfo.unregisterForFinishableStateUpdates(QueryFragmentInfo.java:166)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$TaskWrapper.maybeUnregisterForFinishedStateNotifications(TaskExecutorService.java:1177)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:980)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:944)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1021)
>  ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1]
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_191]
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_191]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
