[ 
https://issues.apache.org/jira/browse/IMPALA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715661#comment-16715661
 ] 

Thomas Tauber-Marshall commented on IMPALA-7931:
------------------------------------------------

Yeah, that sounds reasonable. Definitely would be nice to turn 'cause' into a 
struct rather than plumbing through another arg.

Of course, it's a little unfortunate to put the query on the queue to be 
cancelled only to not actually cancel it, but we probably don't want to add 
more work to MembershipCallback(). That said, this can already happen (e.g. if 
the query to be cancelled isn't running yet or has already finished), so it 
should be fine as long as things are documented well.
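
To make the idea concrete, here is a rough sketch in Python (not Impala's actual C++ code) of a cancellation queue whose entries carry a small 'cause' struct, and a worker that may decide not to cancel. All names here (CancellationCause, Query, process_cancellations, the host names) are hypothetical. The check against failed backends is only an assumed example of a reason to skip; the "not running yet" and "already finished" cases are the ones mentioned above.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto
from typing import FrozenSet


class QueryState(Enum):
    PENDING = auto()
    RUNNING = auto()
    FINISHED = auto()


@dataclass(frozen=True)
class CancellationCause:
    """Hypothetical struct bundling the reason for a cancellation request."""
    message: str
    # Backends reported as failed; empty for causes unrelated to membership.
    failed_backends: FrozenSet[str] = frozenset()


@dataclass
class Query:
    state: QueryState
    backends: FrozenSet[str]  # backends the query is currently using
    cancelled_because: str = ""


def process_cancellations(queue, queries):
    """Drain the queue and return the ids of queries that were cancelled."""
    cancelled = []
    while queue:
        query_id, cause = queue.popleft()
        query = queries.get(query_id)
        # Existing no-op cases: unknown query, not running yet, or finished.
        if query is None or query.state is not QueryState.RUNNING:
            continue
        # Assumed extra no-op case: the cause names failed backends, but the
        # query does not use any of them, so there is nothing to cancel.
        if cause.failed_backends and not (cause.failed_backends & query.backends):
            continue
        query.cancelled_because = cause.message
        cancelled.append(query_id)
    return cancelled
```

The cheap step (enqueueing) stays in the callback, and the decision about whether cancellation is still warranted is made later by the worker, which is why a queued entry can end up being a no-op.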

> test_shutdown_executor fails with timeout waiting for query target state
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7931
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7931
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.2.0
>            Reporter: Lars Volker
>            Assignee: Tim Armstrong
>            Priority: Critical
>              Labels: broken-build
>         Attachments: impala-7931-impalad-logs.tar.gz
>
>
> On a recent S3 test run test_shutdown_executor hit a timeout waiting for a 
> query to reach state FINISHED. Instead the query stays at state 5 (EXCEPTION).
> {noformat}
> 12:51:11 __________________ TestShutdownCommand.test_shutdown_executor __________________
> 12:51:11 custom_cluster/test_restart_services.py:209: in test_shutdown_executor
> 12:51:11     assert self.__fetch_and_get_num_backends(QUERY, before_shutdown_handle) == 3
> 12:51:11 custom_cluster/test_restart_services.py:356: in __fetch_and_get_num_backends
> 12:51:11     self.client.QUERY_STATES['FINISHED'], timeout=20)
> 12:51:11 common/impala_service.py:267: in wait_for_query_state
> 12:51:11     target_state, query_state)
> 12:51:11 E   AssertionError: Did not reach query state in time target=4 actual=5
> {noformat}
> From the logs I can see that the query fails because one of the executors 
> becomes unreachable:
> {noformat}
> I1204 12:31:39.954125  5609 impala-server.cc:1792] Query a34c3a84775e5599:b2b25eb900000000: Failed due to unreachable impalad(s): jenkins-worker:22001
> {noformat}
> The query was {{select count\(*) from functional_parquet.alltypes where 
> sleep(1) = bool_col}}. 
> It seems that the query took longer than expected and was still running when 
> the executor shut down.


