[ 
https://issues.apache.org/jira/browse/DRILL-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2582:
--------------------------------
    Fix Version/s:     (was: 1.1.0)
                   1.2.0

> QueryManager shouldn't be manipulating Foreman's state directly
> ---------------------------------------------------------------
>
>                 Key: DRILL-2582
>                 URL: https://issues.apache.org/jira/browse/DRILL-2582
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 0.8.0
>            Reporter: Chris Westin
>            Assignee: Deneche A. Hakim
>             Fix For: 1.2.0
>
>
> We don't always report cascading failures that result from a failure or 
> cancellation. This turns out to be because QueryManager indiscriminately 
> manipulates Foreman's state without paying any attention to Foreman's 
> current state.
> For example, suppose we request a cancellation of a query, and Foreman issues 
> queryManager.cancelExecutingFragments. However, in the meantime, suppose a 
> fragment failed. The fragment failure will be picked up by 
> QueryManager.statusUpdate(), which then uses stateListener to slam Foreman to 
> the FAILED state. However, Foreman was in CANCELLATION_REQUESTED, and is 
> waiting for the cancellation acknowledgements. The sudden move to FAILED 
> shuts it down. The Foreman will still send out a CANCELED terminal state, but 
> won't report the failure or any cascading failure from the cancellations.
> What should happen is that QueryManager instead reports fragment status 
> updates to Foreman, and Foreman decides what transition to make based on 
> the fragment status update and its own current state. In the above, 
> a fragment failure notification after we're already in CANCELLATION_REQUESTED 
> shouldn't result in any state transition at all, but should simply attach the 
> fragment failure to any current suppressed deferred exceptions. This means 
> QueryManager.statusUpdate() and QueryManager.fragmentDone() need to be 
> reworked, and Foreman needs to give QueryManager a listener for reporting 
> fragment status changes, rather than allowing it to directly manipulate the 
> Foreman's state.
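
The proposed arrangement could be sketched roughly as follows. This is only an
illustration of the idea in the description, not Drill's actual code: the
class names, the FragmentStatusListener interface, and every method signature
below are hypothetical, and cancellation is simplified to a single
acknowledgement. The point it shows is that QueryManager only reports events,
while Foreman alone decides whether an event causes a state transition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names or signatures are Drill's real API.
class ForemanSketch {
  enum QueryState { RUNNING, CANCELLATION_REQUESTED, CANCELED, FAILED }

  /** Listener that Foreman hands to QueryManager instead of exposing its state. */
  interface FragmentStatusListener {
    void fragmentFailed(Exception cause);
    void cancellationAcknowledged();
  }

  private QueryState state = QueryState.RUNNING;
  private final List<Exception> deferredFailures = new ArrayList<>();

  private final FragmentStatusListener listener = new FragmentStatusListener() {
    @Override public void fragmentFailed(Exception cause) { onFragmentFailed(cause); }
    @Override public void cancellationAcknowledged() { onCancellationAcknowledged(); }
  };

  FragmentStatusListener listener() { return listener; }

  synchronized void requestCancel() {
    if (state == QueryState.RUNNING) {
      state = QueryState.CANCELLATION_REQUESTED;
    }
  }

  // Foreman decides the transition from the event AND its own current state.
  private synchronized void onFragmentFailed(Exception cause) {
    switch (state) {
      case RUNNING:
        deferredFailures.add(cause);
        state = QueryState.FAILED;
        break;
      case CANCELLATION_REQUESTED:
        // No transition: remember the failure and keep waiting for acks.
        deferredFailures.add(cause);
        break;
      default:
        // Already terminal in this sketch; nothing to do.
        break;
    }
  }

  // Simplification: one acknowledgement completes the cancellation.
  private synchronized void onCancellationAcknowledged() {
    if (state == QueryState.CANCELLATION_REQUESTED) {
      state = QueryState.CANCELED;
    }
  }

  synchronized QueryState state() { return state; }

  synchronized List<Exception> deferredFailures() {
    return new ArrayList<>(deferredFailures);
  }
}

// Hypothetical QueryManager side: it reports events and never sets Foreman's state.
class QueryManagerSketch {
  private final ForemanSketch.FragmentStatusListener listener;

  QueryManagerSketch(ForemanSketch.FragmentStatusListener listener) {
    this.listener = listener;
  }

  void reportFragmentFailure(Exception cause) { listener.fragmentFailed(cause); }

  void reportCancelAck() { listener.cancellationAcknowledged(); }
}
```

With this shape, the scenario from the description plays out as intended: after
requestCancel(), a call to reportFragmentFailure(...) leaves the sketch in
CANCELLATION_REQUESTED with the failure recorded, and a later reportCancelAck()
moves it to CANCELED with the failure still available for reporting.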



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
