Hi All,

Recently I found an issue (Thanks to knguyen to help with test setup) related 
to Fragment Status reporting and would like some feedback on it.


When a client submits a query to Foreman, then it is planned by Foreman and 
later fragments are scheduled to root and non-root nodes. Foreman creates a 
DriilbitStatusListener and FragmentStatusListener to know about the health of 
Drillbit node and a fragment respectively. The way root and non-root fragments 
are setup by Foreman are different:


  *   Root fragments are setup without any communication over control channel 
(since it is executed locally on Foreman)
  *   Non-root fragments are setup by sending control message 
(REQ_INITIALIZE_FRAGMENTS_VALUE) over wire. If there is failure in sending any 
such control message (like due to network hiccup's) during query setup then the 
query is failed and client is notified.

Each fragment is executed on it's node with the help Fragment Executor which 
has an instance for FragmentStatusReporter. FragmentStatusReporter helps to 
update the status of a fragment to Foreman node over a control tunnel or 
connection using RPC message (REQ_FRAGMENT_STATUS) both for root and non-root 
fragments.

Based on above when root fragment is submitted for setup then it is done 
locally without any RPC communication whereas when status for that fragment is 
reported by fragment executor that happens over control connection by sending a 
RPC message. But for non-root fragment setup and status update both happens 
using RPC message over control connection.

Issue 1:
What was observed is if for a simple query which has only 1 root fragment 
running on Foreman node then setup will work fine. But as part of status update 
when the fragment tries to create a control connection and fails to establish 
that, then the query hangs. This is because the root fragment will complete 
execution but will fail to update Foreman about it and Foreman think that the 
query is running for ever.

Proposed Solution:
For root fragment the setup of fragment is happening locally without RPC 
message, so we can do the same for status update of root fragments. This will 
avoid RPC communication for status update of fragments running locally on the 
foreman and hence will resolve issue 1.

JIRA for this issue: https://issues.apache.org/jira/browse/DRILL-5721

Issue 2:
For complex query where we have root and non-root fragments, if the fragment 
setup was done fine and later during the query execution the control connection 
is lost. But as part of status update, fragments will try to create a new 
Control connection to foreman but let say they fails to do so and eventually 
got completed. Then in this case also Foreman will think that the fragment is 
still running and Query is not completed and can hang the query.

I don't see any timeout logic on the Foreman side for a query in execution 
which might be because we don't know how long a query will take for execution. 
Nor did I see any message from Foreman which keep's check on Fragment status in 
case if it didn't received any update from other fragments for a time interval. 
If not may be we should add the second option or look for other alternatives as 
well. It would be helpful to get some feedback on both the issues and proposed 
solutions. I have opened a JIRA with all the details for both the issues shared 
here:

JIRA for this issue: https://issues.apache.org/jira/browse/DRILL-5722


Thanks,
Sorabh

Reply via email to