[ 
https://issues.apache.org/jira/browse/HIVE-23061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063764#comment-17063764
 ] 

Prasanth Jayachandran commented on HIVE-23061:
----------------------------------------------

{quote}I suspect that on that error, that fragment is cleaned up which may 
clear the info for the first fragment, and when the first fragment exits it may 
hit this. Still need more investigation on this one.
{quote}
This will always be a problem, right? The duplicated fragment cannot clean up 
the original/first task fragment.

I am OK with the fix, but if what is described above is in fact happening, then 
it has to be addressed, as it can lead to more issues that will be hard to debug.

LLAP ignoring duplicates is correct, but clients have to handle the rejection 
based on its reason. If the reason is a duplicate, then the client should not 
trigger taskCleanup.
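
A minimal sketch of that client-side handling, to make the suggestion concrete. 
Everything here is hypothetical: the RejectionReason enum, the 
onSubmissionRejected method, and the cleanedUp list (a stand-in for 
taskCleanup) are illustrative names, not the actual LLAP client API.

```java
import java.util.ArrayList;
import java.util.List;

public class DuplicateAwareClient {
    // Hypothetical rejection reasons; the real protocol may name these differently.
    enum RejectionReason { DUPLICATE, OTHER }

    // Records the fragments for which cleanup was triggered (stand-in for taskCleanup).
    final List<String> cleanedUp = new ArrayList<>();

    void onSubmissionRejected(String fragmentId, RejectionReason reason) {
        if (reason == RejectionReason.DUPLICATE) {
            // The original attempt is still registered on the daemon, so cleaning
            // up here could remove state that the original fragment still needs.
            return;
        }
        cleanedUp.add(fragmentId);
    }

    public static void main(String[] args) {
        DuplicateAwareClient client = new DuplicateAwareClient();
        client.onSubmissionRejected("fragment-1", RejectionReason.DUPLICATE);
        client.onSubmissionRejected("fragment-2", RejectionReason.OTHER);
        System.out.println(client.cleanedUp); // prints [fragment-2]
    }
}
```

The point is only that the decision to clean up keys off the rejection reason, 
so a duplicate submission never touches the first fragment's registration.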

> LLAP crash due to unhandled exception: Cannot invoke unregister on an entity 
> which has not been registered
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-23061
>                 URL: https://issues.apache.org/jira/browse/HIVE-23061
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>         Attachments: HIVE-23061.1.patch
>
>
> The following exception goes uncaught and causes the entire LLAP daemon to 
> shut down:
> {noformat}
> 2020-03-17T06:49:11,304 ERROR [ExecutionCompletionThread #0 ()] 
> org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread 
> Thread[ExecutionCompletionThread #0,5,main] threw an Exception. Shutting down 
> now...
> java.lang.IllegalStateException: Cannot invoke unregister on an entity which 
> has not been registered
>     at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:508) 
> ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.unregisterForUpdates(QueryInfo.java:256)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.QueryInfo.unregisterFinishableStateUpdate(QueryInfo.java:209)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.QueryFragmentInfo.unregisterForFinishableStateUpdates(QueryFragmentInfo.java:166)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$TaskWrapper.maybeUnregisterForFinishedStateNotifications(TaskExecutorService.java:1177)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:980)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:944)
>  ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3]
>     at 
> com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1021)
>  ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1]
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_191]
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_191]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
> {noformat}


