[ 
https://issues.apache.org/jira/browse/DRILL-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285341#comment-15285341
 ] 

ASF GitHub Bot commented on DRILL-4676:
---------------------------------------

Github user hnfgns commented on a diff in the pull request:

    https://github.com/apache/drill/pull/503#discussion_r63428428
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
    @@ -1221,6 +1223,12 @@ public void interrupted(final InterruptedException e) {
          *   to the user
          */
         public void moveToState(final QueryState newState, final Exception ex) {
    +      // if the current thread is the foreman thread, throw an exception
    +      // otherwise the foreman will be blocked forever on acceptExternalEvents
    +      if (myThreadRef == Thread.currentThread()) {
    --- End diff --
    
    Well, Foreman does not call moveToState on the common path, but it can when 
it fails. My point is that, instead of hacking the method to throw an exception 
when the calling thread is the foreman, we should release the latch and handle 
the exception in Foreman#run. What I mean is:
    
    Foreman#run():
    ```
    try {
      // do something (query setup)
    } catch (ex) {
      // release the latch first so that handling the failure cannot block on it
      release latch
      handleException(ex)
    } finally {
      ...
    }
    ```
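
A minimal, self-contained Java sketch of the pattern suggested above, not Drill's actual Foreman code. The class `ForemanSketch`, its `failDuringSetup` flag, the reduced `QueryState` enum and the one-second timeout are all assumptions made for illustration; only the names `run`, `moveToState` and `acceptExternalEvents` come from the discussion.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for Foreman, reduced to the latch handling under discussion.
class ForemanSketch implements Runnable {
    enum QueryState { STARTING, RUNNING, FAILED }

    // Blocks external events until query setup has finished (or failed).
    private final CountDownLatch acceptExternalEvents = new CountDownLatch(1);
    private final boolean failDuringSetup; // illustrative switch to simulate a setup failure
    private volatile QueryState state = QueryState.STARTING;

    ForemanSketch(boolean failDuringSetup) {
        this.failDuringSetup = failDuringSetup;
    }

    @Override
    public void run() {
        try {
            if (failDuringSetup) {
                throw new IllegalStateException("simulated failure during query setup");
            }
            state = QueryState.RUNNING;
        } catch (RuntimeException ex) {
            // Release the latch first, so that handling the failure cannot block on it.
            acceptExternalEvents.countDown();
            moveToState(QueryState.FAILED, ex);
        } finally {
            // Counting down an already released latch is a no-op.
            acceptExternalEvents.countDown();
        }
    }

    // External-event path: waits until setup is over before changing state.
    // The timeout exists only so that this sketch can never hang.
    void moveToState(QueryState newState, Exception ex) {
        try {
            if (!acceptExternalEvents.await(1, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for acceptExternalEvents");
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return;
        }
        state = newState;
    }

    QueryState state() {
        return state;
    }
}
```

With this shape, `moveToState` needs no special case for the calling thread: by the time the failure is handled, the latch is already open.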
    
    



> Foreman.moveToState can block forever if called by the foreman thread while 
> the query is still being setup
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4676
>                 URL: https://issues.apache.org/jira/browse/DRILL-4676
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.6.0
>            Reporter: Deneche A. Hakim
>            Assignee: Sudheesh Katkam
>             Fix For: 1.7.0
>
>
> When the query is being set up, the foreman holds a special CountDownLatch 
> that blocks rpc threads from delivering external events; this latch is 
> released at the end of the query setup.
> In some cases though, when the foreman is submitting remote fragments, a 
> failure in RpcBus.send() causes an exception to be thrown that is reported to 
> Foreman.FragmentSubmitListener and blocks on the CountDownLatch. This causes 
> the foreman thread to block forever, and can cause rpc threads to be blocked 
> too. This seems to happen more frequently under a high concurrency load, and 
> can also prevent clients from connecting to the Drillbits.
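
The blocking described in the issue can be reproduced in isolation with a plain CountDownLatch. The sketch below is an illustration only: `SelfBlockDemo`, `setupThenReport` and the 200 ms timeout are hypothetical names and values, and the timeout replaces the indefinite wait so that the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SelfBlockDemo {
    // Returns true if the wait on the latch completed, false if it timed out.
    static boolean setupThenReport(boolean failBeforeRelease) {
        CountDownLatch latch = new CountDownLatch(1);
        if (!failBeforeRelease) {
            // Normal path: setup finished, so the latch is released.
            latch.countDown();
        }
        try {
            // The same thread now waits on the latch. On the failure path it is
            // the only thread that would ever have released it, so an untimed
            // await() here would never return.
            return latch.await(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Calling `setupThenReport(true)` times out because the thread is waiting for a release that only it could perform, which is the situation the foreman thread ends up in.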



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
