[
https://issues.apache.org/jira/browse/IMPALA-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155624#comment-17155624
]
ASF subversion and git services commented on IMPALA-9834:
---------------------------------------------------------
Commit 70c2073d02675ffc64b09335e6c3a2744bc6d961 in impala's branch
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=70c2073 ]
IMPALA-9834: De-flake TestQueryRetries on EC builds
This patch skips all tests in TestQueryRetries on EC builds.
The tests in TestQueryRetries run queries that execute on three instances
during regular builds (HDFS, S3, etc.), but on only two instances on EC
builds. This causes some non-determinism during the tests because
killing an impalad in the mini-cluster won't necessarily cause a retry
to be triggered.
It bumps up the timeout used when waiting for a query to be retried.
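For illustration only, waiting with a timeout can be sketched as a polling loop. This is not the code from the patch: the helper name wait_for and its parameters are hypothetical, and the actual test may wait in a different way.

```python
import time


def wait_for(predicate, timeout_s, interval_s=0.1):
    """Poll predicate() until it returns True or timeout_s elapses.

    Hypothetical helper, not taken from test_query_retries.py.
    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A larger timeout_s only changes how long a slow retry is tolerated; a test whose retry happens quickly still returns as soon as the predicate holds.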
It improves the assertion in __get_query_id_from_profile so that it
dumps the full profile when the assertion fails. This should make any
test failure in this assertion easier to debug.
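A minimal sketch of such an assertion follows. It is not the code from the patch: the regular expression and the assumed profile header format "Query (id=...)" are illustrative assumptions, and the function is written as a plain module-level function rather than a test-class method.

```python
import re

# Assumption: the profile contains a header of the form "Query (id=<query id>)".
QUERY_ID_PATTERN = re.compile(r"Query \(id=(.*?)\)")


def get_query_id_from_profile(profile):
    """Return the query id found in a runtime profile string."""
    match = QUERY_ID_PATTERN.search(profile)
    # Append the whole profile to the failure message so that a failed run
    # shows exactly what was returned instead of only "has no query id".
    assert match, "Invalid query profile, has no query id:\n" + profile
    return match.group(1)
```

With the profile included in the message, a failure like the one quoted in the issue description would show whether the profile was empty, truncated, or belonged to a query that was never retried.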
Testing:
* Ran TestQueryRetries locally
Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71
Reviewed-on: http://gerrit.cloudera.org:8080/16149
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> test_query_retries.TestQueryRetries is flaky on erasure coding configurations
> -----------------------------------------------------------------------------
>
> Key: IMPALA-9834
> URL: https://issues.apache.org/jira/browse/IMPALA-9834
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Joe McDonnell
> Assignee: Sahil Takiar
> Priority: Blocker
> Labels: broken-build, flaky
>
> Multiple tests from test_query_retries.TestQueryRetries hit errors like this
> (test_retry_query_cancel):
> {noformat}
> custom_cluster/test_query_retries.py:321: in test_retry_query_cancel
> self.__validate_runtime_profiles_from_service(impalad_service, handle)
> custom_cluster/test_query_retries.py:435: in __validate_runtime_profiles_from_service
> self.__validate_runtime_profiles(retried_profile, handle.get_handle().id)
> custom_cluster/test_query_retries.py:503: in __validate_runtime_profiles
> retried_query_id = self.__get_query_id_from_profile(retried_runtime_profile)
> custom_cluster/test_query_retries.py:474: in __get_query_id_from_profile
> assert query_id_search, "Invalid query profile, has no query id"
> E AssertionError: Invalid query profile, has no query id
> E assert None
> {noformat}
> Or this (test_kill_impalad_expect_retries, test_kill_impalad_expect_retry,
> test_retry_query_hs2):
> {noformat}
> custom_cluster/test_query_retries.py:424: in test_retry_query_hs2
> self.hs2_client.get_query_id(handle))
> custom_cluster/test_query_retries.py:508: in __validate_runtime_profiles
> original_query_id)
> custom_cluster/test_query_retries.py:489: in __validate_original_id_in_profile
> assert original_id_search, \
> E AssertionError: Could not find original id pattern 'Original Query Id: (.*)' in profile:
> ...
> {noformat}
> I have only seen these errors on erasure coding so far, and it isn't
> deterministic.