[ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879895#comment-15879895
 ] 

Siddharth Seth commented on HIVE-15958:
---------------------------------------

I suspect the following is why connections are not being reused for 
taskKilled.
For regular heartbeats, only one session will ever run for an AM, and this is 
controlled via the QueueCallable / HeartbeatCallable. Once taskKilled comes 
into play, a taskKilled call can get a handle on the umbilical and then have 
one of the queued threads close the umbilical right after that, resulting in 
an error.
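
To make the race concrete, here is a minimal sketch of one way to keep a 
handle valid while someone is still using it: reference counting. This is not 
what the attached patches do, and RefCountedUmbilical, tryAcquire and release 
are hypothetical names, not Hive classes or methods. The point is only that a 
caller such as taskKilled either gets a usable handle or is told up front that 
the connection is gone.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an umbilical connection; not Hive's actual class. */
final class RefCountedUmbilical {
    // Starts at 1: the reference held by the owner (e.g. the heartbeat path).
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean closed = false;

    /** Takes a reference; returns false if the connection has already been closed. */
    boolean tryAcquire() {
        while (true) {
            int current = refs.get();
            if (current == 0) {
                return false; // last reference already dropped, do not resurrect
            }
            if (refs.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Drops a reference; whoever drops the last one closes the connection. */
    void release() {
        while (true) {
            int current = refs.get();
            if (current == 0) {
                throw new IllegalStateException("released more often than acquired");
            }
            if (refs.compareAndSet(current, current - 1)) {
                if (current == 1) {
                    closed = true; // real code would stop the RPC proxy here
                }
                return;
            }
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```

With this shape, a queued thread calling release() while a taskKilled call 
still holds a reference only decrements the count; the actual close is 
deferred until the taskKilled call releases as well.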

We have that situation again, and it is more prominent now: queryComplete 
causes fragments to be killed (this should probably not be done, see 
HIVE-16021), which in turn results in a heartbeat. queryComplete closes the 
umbilical while taskKilled requests are getting scheduled.

Also, iterating over knownAppMasters is very avoidable. We can store 
information about the AM in the queryTracker and retrieve it on queryComplete. 
Alternatively, send the AM information on the queryComplete call.
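
A rough sketch of the first option, assuming the AM can be identified by a 
host and port and that queries are keyed by a string id. QueryAmRegistry, 
AmAddress and their methods are hypothetical names for illustration only; the 
real queryTracker has its own structure.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical value type holding where a query's AM can be reached. */
final class AmAddress {
    final String host;
    final int port;

    AmAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

/** Hypothetical per-query AM bookkeeping; not Hive's actual QueryTracker. */
final class QueryAmRegistry {
    private final ConcurrentMap<String, AmAddress> amByQuery = new ConcurrentHashMap<>();

    /** Records the AM the first time a fragment of the query is seen. */
    void registerFragment(String queryId, String amHost, int amPort) {
        amByQuery.putIfAbsent(queryId, new AmAddress(amHost, amPort));
    }

    /** Direct lookup on completion; returns null if the query is unknown. */
    AmAddress queryComplete(String queryId) {
        return amByQuery.remove(queryId);
    }
}
```

queryComplete then does a single map lookup for the affected AM instead of 
walking every known AM.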

> LLAP: IPC connections are not being reused for umbilical protocol
> -----------------------------------------------------------------
>
>                 Key: HIVE-15958
>                 URL: https://issues.apache.org/jira/browse/HIVE-15958
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.2.0
>            Reporter: Rajesh Balamohan
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch
>
>
> During concurrency testing, observed 1000s of ipc thread creations. Ideally, 
> the connections to same hosts should be reused.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)