Paul Guo created HAWQ-1334:
------------------------------
Summary: QD thread should set error code if failing so that the
main process for the query could exit soon
Key: HAWQ-1334
URL: https://issues.apache.org/jira/browse/HAWQ-1334
Project: Apache HAWQ
Issue Type: Bug
Components: Dispatcher
Reporter: Paul Guo
Assignee: Ed Espino
In the QD thread function dispmgt_thread_func_run(), if a failure occurs,
whether caused by a QE or by the QD itself, the thread cancels the query and
then cleans up. The main process for the query needs the error code in
meleeResults to be set so that it can promptly proceed to cancel the query;
otherwise it has to wait for a timeout. Normally dispmgt_thread_func_run() sets
the error code, but I found some cases that do not, e.g. when poll() fails with
ENOMEM. One symptom of this issue is that a query sometimes hangs when it is
canceled for some reason.
A potential solution is:
1) We expect each branch that jumps ("goto error_cleanup") to set a proper
error code itself.
2) We add a "guard" function in the error_cleanup code to set an error code if
none has been set.
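
The two measures above can be sketched in isolation as follows. This is only a
minimal illustration, not the actual HAWQ code: the SketchResults struct, the
SKETCH_ERR_* codes, and the sketch_* functions are hypothetical names invented
for this example, and the poll() outcome is passed in as parameters so that the
failure path can be exercised without real file descriptors.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared result object; not HAWQ's real type. */
typedef struct SketchResults {
    int errcode;    /* 0 means "no error recorded yet" */
    int cancelled;  /* set once cleanup has cancelled the query */
} SketchResults;

/* Hypothetical error codes, for illustration only. */
#define SKETCH_ERR_INTERNAL 1   /* generic fallback used by the guard */
#define SKETCH_ERR_POLL     2   /* specific code set by the failing branch */

/* Guard: record a fallback code only if no branch has recorded one. */
static void sketch_guard_errcode(SketchResults *r, int fallback)
{
    assert(r != NULL);
    if (r->errcode == 0)
        r->errcode = fallback;
}

/*
 * Simplified thread body. poll_rc and poll_errno simulate the return value
 * and errno of a poll() call. branch_sets_code selects whether the failing
 * branch records its own error code (measure 1) or forgets to, in which
 * case only the guard (measure 2) makes the failure visible.
 */
static int sketch_thread_run(SketchResults *r, int poll_rc, int poll_errno,
                             int branch_sets_code)
{
    if (poll_rc < 0 && poll_errno != EINTR) {
        if (branch_sets_code)
            r->errcode = SKETCH_ERR_POLL;
        goto error_cleanup;
    }
    return 0;   /* success path: no error code is touched */

error_cleanup:
    /* Never leave this path without an error code being set. */
    sketch_guard_errcode(r, SKETCH_ERR_INTERNAL);
    r->cancelled = 1;
    return -1;
}
```

With the guard in place, a branch that forgets to set a code (for example the
simulated poll() failure with ENOMEM) still leaves a non-zero errcode for the
main process to observe, while a branch that does set a specific code keeps it.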
In general, the cleanup code in the QD is rather obscure and inelegant. Maybe
we should file another JIRA to refactor its error handling logic.