[ 
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711876#comment-16711876
 ] 

Tim Armstrong commented on IMPALA-7931:
---------------------------------------

I think the fix would be to check whether the query found via query_locations_ 
is actually executing on that backend before killing the query.
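
A rough sketch of that check, written as a small Python model rather than the actual C++ in impala-server.cc. Only the name query_locations_ comes from this issue. The Query class, its executing_backends field, the function name, and the assumption that query_locations_ maps a backend address to a collection of query ids are all hypothetical and only serve to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Query:
    """Hypothetical stand-in for a coordinator-side query record."""
    query_id: str
    # Backends this query still has work running on (assumed field).
    executing_backends: Set[str] = field(default_factory=set)
    cancelled_reason: Optional[str] = None


def cancel_queries_on_failed_backends(
    query_locations: Dict[str, Iterable[str]],
    queries: Dict[str, Query],
    failed_backends: Iterable[str],
) -> List[str]:
    """Cancel only queries that are really executing on a failed backend.

    query_locations models query_locations_ as backend -> query ids and may
    contain stale entries. Returns the ids of the queries that were cancelled.
    """
    cancelled = []
    for backend in failed_backends:
        for query_id in query_locations.get(backend, ()):
            query = queries.get(query_id)
            if query is None:
                continue  # Query already unregistered.
            if backend not in query.executing_backends:
                continue  # Stale entry: the query is not running there.
            if query.cancelled_reason is None:
                query.cancelled_reason = (
                    "Failed due to unreachable impalad(s): " + backend)
                cancelled.append(query_id)
    return cancelled
```

With a stale entry for q1 under the failed backend, only q2 would be cancelled:

```python
queries = {
    "q1": Query("q1", {"jenkins-worker:22000"}),
    "q2": Query("q2", {"jenkins-worker:22001"}),
}
locations = {"jenkins-worker:22001": ["q1", "q2"]}
cancel_queries_on_failed_backends(locations, queries, ["jenkins-worker:22001"])
```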

> test_shutdown_executor fails with timeout waiting for query target state
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7931
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7931
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.2.0
>            Reporter: Lars Volker
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: broken-build
>         Attachments: impala-7931-impalad-logs.tar.gz
>
>
> On a recent S3 test run, test_shutdown_executor hit a timeout waiting for a 
> query to reach state FINISHED. Instead, the query stayed at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __________________ TestShutdownCommand.test_shutdown_executor __________________
> 12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
> 12:51:11     assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends
> 12:51:11     self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11     target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors 
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125  5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s): jenkins-worker:22001
> {noformat}
> The query was {{select count(*) from functional_parquet.alltypes where sleep(1) = bool_col}}.
> It seems that the query took longer than expected and was still running when 
> the executor shut down.


