[ 
https://issues.apache.org/jira/browse/DRILL-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285380#comment-15285380
 ] 

ASF GitHub Bot commented on DRILL-4676:
---------------------------------------

Github user adeneche commented on a diff in the pull request:

    https://github.com/apache/drill/pull/503#discussion_r63431381
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
    @@ -1221,6 +1223,12 @@ public void interrupted(final InterruptedException e) {
          *   to the user
          */
         public void moveToState(final QueryState newState, final Exception ex) {
    +      // if the current thread is the foreman thread, throw an exception
    +      // otherwise the foreman will be blocked forever on acceptExternalEvents
    +      if (myThreadRef == Thread.currentThread()) {
    --- End diff --
    
    We were assuming that the Foreman thread would never call
    Foreman.StateListener.moveToState(), and that it would always be called by
    another (rpc) thread instead.
    It turns out that when the foreman is submitting remote fragments,
    RpcBus.send() can cause the foreman thread to call moveToState() directly.
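
    The guard in the diff can be illustrated with a minimal, standalone sketch.
    This is not Drill's code: the class GuardedForemanSketch, its string states,
    and the choice of IllegalStateException are assumptions made for illustration.
    Only the names myThreadRef and acceptExternalEvents come from the diff.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the foreman; this is not Drill's actual class.
class GuardedForemanSketch {
    // Plays the role of myThreadRef in the diff: the thread that runs query setup.
    private final Thread ownerThread;
    // Named after the latch mentioned in the diff; opened once setup is done.
    private final CountDownLatch acceptExternalEvents = new CountDownLatch(1);
    private volatile String state = "STARTING";

    GuardedForemanSketch(Thread ownerThread) {
        this.ownerThread = ownerThread;
    }

    void moveToState(String newState) {
        // Fail fast instead of letting the owner thread wait on a latch
        // that only the owner thread itself would ever open.
        if (Thread.currentThread() == ownerThread) {
            throw new IllegalStateException(
                "moveToState(" + newState + ") called by the setup thread");
        }
        try {
            acceptExternalEvents.await(); // other threads wait until setup ends
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        state = newState;
    }

    void finishSetup() {
        acceptExternalEvents.countDown();
    }

    String state() {
        return state;
    }

    // Returns true if a call made from the owner thread is rejected.
    static boolean ownerCallIsRejected() {
        GuardedForemanSketch foreman =
            new GuardedForemanSketch(Thread.currentThread());
        try {
            foreman.moveToState("FAILED");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```

    In this sketch a call from any other thread still waits on the latch as
    before; only the owner thread is turned away, so the caller gets an
    exception it can handle instead of a hang.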


> Foreman.moveToState can block forever if called by the foreman thread while 
> the query is still being setup
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4676
>                 URL: https://issues.apache.org/jira/browse/DRILL-4676
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.6.0
>            Reporter: Deneche A. Hakim
>            Assignee: Sudheesh Katkam
>             Fix For: 1.7.0
>
>
> While the query is being set up, the foreman holds a special CountDownLatch
> that blocks rpc threads from delivering external events. This latch is
> released at the end of the query setup.
> In some cases, though, when the foreman is submitting remote fragments, a
> failure in RpcBus.send() causes an exception to be thrown that is reported to
> Foreman.FragmentSubmitListener, which then blocks on the CountDownLatch. This
> causes the foreman thread to block forever, and can cause rpc threads to be
> blocked too.
> This seems to happen more frequently under a high concurrency load, and it can
> also prevent clients from connecting to the Drillbits.
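
The blocking described above can be reproduced in isolation. The following is a
minimal sketch, not Drill's code: the class and method names are hypothetical,
and a timed await() is used only so that the demonstration terminates. With the
untimed await() the thread would never return.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a thread waiting on a latch that only it opens.
class LatchSelfBlockSketch {
    /**
     * Returns true if the setup thread got past its own latch within the
     * timeout. The latch is opened only after the wait returns, so the
     * wait can never succeed.
     */
    static boolean setupThreadPassesOwnLatch(long timeoutMillis) {
        CountDownLatch acceptExternalEvents = new CountDownLatch(1);
        boolean passed;
        try {
            // Simulates a failure callback that runs synchronously on the
            // setup thread and tries to deliver an external event.
            passed = acceptExternalEvents.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            passed = false;
        }
        // End of setup: the step that opens the latch. It is reached only
        // after the callback above has returned.
        acceptExternalEvents.countDown();
        return passed;
    }
}
```

Calling LatchSelfBlockSketch.setupThreadPassesOwnLatch(50) returns false: the
wait times out because the countDown() that would release it sits behind the
wait on the same thread.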



