[ https://issues.apache.org/jira/browse/HAWQ-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Guo reassigned HAWQ-1334:
------------------------------

    Assignee: Paul Guo  (was: Ed Espino)

> QD thread should set error code if failing so that the main process for the query could exit soon
> -------------------------------------------------------------------------------------------------
>
>                 Key: HAWQ-1334
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1334
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Dispatcher
>            Reporter: Paul Guo
>            Assignee: Paul Guo
>
> In the QD thread function dispmgt_thread_func_run(), if there is a failure,
> whether caused by a QE or by the QD itself, the thread cancels the query and
> then cleans up. The main process for the query needs the error code in
> meleeResults to be set so that it proceeds to cancel the query promptly;
> otherwise it has to wait for a timeout.
> Typically dispmgt_thread_func_run() should set the error code, but I found
> some cases that do not, e.g. when poll() fails with ENOMEM. One symptom of
> this issue is that we sometimes see a hang when a query is canceled for
> some reason.
> Potential solutions:
> 1) Require each branch that jumps ("goto error_cleanup") to set a proper
> error code itself. This is not an easy job.
> 2) Add a "guard" function in the error_cleanup code that sets an error code
> if none has been set, i.e. if 1) was not done properly.
> I'd like this JIRA to cover 2).
> In general, the cleanup code in the QD is obscure and inelegant. Maybe we
> should file another JIRA to refactor its error handling logic.
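
For illustration only, option 2) could look roughly like the C sketch below. It is not the HAWQ code: SketchResults, sketch_guard_errcode(), sketch_thread_body() and SKETCH_ERRCODE_INTERNAL are hypothetical names, and the real layout of meleeResults and the real error codes are not shown in this issue. The sketch only shows the idea that a single guard call on the error_cleanup path covers any branch that forgot to record an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared results object that the main
 * process checks; the real HAWQ structure is different. */
typedef struct SketchResults {
    int errcode;            /* 0 means "no error recorded yet" */
} SketchResults;

/* Hypothetical fallback value, used only when no branch set a code. */
#define SKETCH_ERRCODE_INTERNAL (-1)

/* Guard: record a fallback error code only if none has been set,
 * so a more specific code recorded earlier is never overwritten. */
void sketch_guard_errcode(SketchResults *results, int fallback)
{
    if (results != NULL && results->errcode == 0)
        results->errcode = fallback;
}

/* Simulated thread body. A nonzero poll_errno models a failure such
 * as poll() failing with ENOMEM; branch_sets_code == 0 models a
 * branch that jumps to cleanup without recording an error code. */
int sketch_thread_body(SketchResults *results, int poll_errno,
                       int branch_sets_code)
{
    if (poll_errno != 0) {
        if (branch_sets_code)
            results->errcode = poll_errno;
        goto error_cleanup;
    }
    return 0;

error_cleanup:
    /* One guard call here covers every "goto error_cleanup" branch. */
    sketch_guard_errcode(results, SKETCH_ERRCODE_INTERNAL);
    assert(results->errcode != 0);  /* the main process can now see it */
    return -1;
}
```

With this shape, a branch that already recorded a specific code keeps it, and a branch that recorded nothing still leaves a nonzero code for the main process to act on instead of waiting for the timeout.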