[ 
https://issues.apache.org/jira/browse/IMPALA-10065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10065 started by Quanlong Huang.
-----------------------------------------------
> Hit DCHECK when retrying a query in FINISHED state
> --------------------------------------------------
>
>                 Key: IMPALA-10065
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10065
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> Queries go into the FINISHED state as soon as rows are available, regardless
> of whether the client has fetched any results. If the client hasn't called
> fetch on the query, the query should still be retryable. However, retrying
> such a query hits a DCHECK at
> https://github.com/apache/impala/blob/a0057788c5c2300f58b6615a27116b8331171e06/be/src/runtime/query-driver.cc#L131-L135
> This can be reproduced by modifying test_retries_from_cancellation_pool in
> tests/custom_cluster/test_query_retries.py:
> {code}
> diff --git a/tests/custom_cluster/test_query_retries.py b/tests/custom_cluster/test_query_retries.py
> index 54f2334..ae57068 100644
> --- a/tests/custom_cluster/test_query_retries.py
> +++ b/tests/custom_cluster/test_query_retries.py
> @@ -69,21 +69,23 @@ class TestQueryRetries(CustomClusterTestSuite):
>      # The following query executes slowly, and does minimal TransmitData RPCs, so it is
>      # likely that the statestore detects that the impalad has been killed before a
>      # TransmitData RPC has occurred.
> -    query = "select count(*) from functional.alltypes where bool_col = sleep(50)"
> +    query = "select count(*) from functional.alltypestiny union all select count(*) from functional.alltypes where bool_col = sleep(50)"
>  
>      # Launch the query, wait for it to start running, and then kill an impalad.
>      handle = self.execute_query_async(query,
>          query_options={'retry_failed_queries': 'true'})
> -    self.wait_for_state(handle, self.client.QUERY_STATES['RUNNING'], 60)
> +    self.wait_for_state(handle, self.client.QUERY_STATES['FINISHED'], 60)
>  
>      # Kill a random impalad (but not the one executing the actual query).
>      self.__kill_random_impalad()
> +    time.sleep(10)
>  
>      # Validate the query results.
>      results = self.client.fetch(query, handle)
>      assert results.success
> -    assert len(results.data) == 1
> -    assert int(results.data[0]) == 3650
> +    assert len(results.data) == 2
> +    assert int(results.data[0]) == 8
> +    assert int(results.data[1]) == 3650
>  
>      # Validate the live exec summary.
>      retried_query_id = self.__get_retried_query_id_from_summary(handle)
> {code}
> The change uses a different query that has two UNION operands. The query
> enters the FINISHED state once the first operand finishes, while the second
> operand is still running. When we then kill an impalad, the coordinator hits
> the DCHECK.
> We should support retrying a FINISHED (but actually still running) query that
> hasn't returned any results to the client. This is required by IMPALA-9225.
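
The rule requested above can be sketched as a small predicate. This is only an illustration in Python, chosen to match the test in the diff; the names QueryState, QueryInfo, and is_retryable are hypothetical and are not taken from Impala's C++ code in query-driver.cc. The sketch assumes that a retry is safe whenever no rows have been returned to the client, whether the query reports RUNNING or FINISHED.

```python
from dataclasses import dataclass
from enum import Enum, auto


class QueryState(Enum):
    """Simplified, hypothetical stand-ins for client-visible query states."""
    RUNNING = auto()
    FINISHED = auto()   # rows are available; fragments may still be executing
    EXCEPTION = auto()


@dataclass
class QueryInfo:
    state: QueryState
    rows_fetched: int = 0  # number of rows already returned to the client


def is_retryable(query: QueryInfo) -> bool:
    """Return True if a transparent retry could not be observed by the client.

    Assumption for this sketch: the only thing that makes a retry unsafe is
    that the client has already received rows from the original attempt.
    """
    if query.state not in (QueryState.RUNNING, QueryState.FINISHED):
        return False
    return query.rows_fetched == 0
```

Under this assumption, the query in the modified test (FINISHED after the first UNION operand, with no fetch issued yet) would count as retryable, while a FINISHED query with at least one fetched row would not.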



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

