[
https://issues.apache.org/jira/browse/IMPALA-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772493#comment-16772493
]
Bikramjeet Vig commented on IMPALA-8202:
----------------------------------------
Here is what I have found so far:
Every query is either unregistered or canceled. When a query is unregistered,
a log line is printed, and because unregistering also calls CancelInternal, a
Cancel() log line with the same query_id is printed as well. The test fails in
teardown, where it issues an explicit cancel, and the only way that query id
can be unknown at that point is if the query was already canceled or
unregistered. The only query_id with two Cancel() log lines and one
UnregisterQuery() log line was 244ae481c70793b2:b5400e5300000000. The impalad
logs show:
{noformat}
03:08:07.553206 23756 impala-server.cc:1142] UnregisterQuery(): query_id=244ae481c70793b2:b5400e5300000000
03:08:07.553220 23756 impala-server.cc:1249] Cancel(): query_id=244ae481c70793b2:b5400e5300000000
03:08:07.556561 29104 impala-server.cc:1249] Cancel(): query_id=244ae481c70793b2:b5400e5300000000
{noformat}
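The gap between the two Cancel() calls can be checked directly from the
timestamps in the log excerpt. The snippet below is only a small sketch of that
arithmetic; the variable names are made up for illustration, and the two
timestamps and the 50000 microsecond poll interval are the only values taken
from this comment.

```python
from datetime import datetime, timedelta

# Timestamps copied from the two Cancel() log lines in the excerpt.
FMT = "%H:%M:%S.%f"
first_cancel = datetime.strptime("03:08:07.553220", FMT)
second_cancel = datetime.strptime("03:08:07.556561", FMT)

# Dividing by a one-microsecond timedelta yields an integer microsecond count.
gap_us = (second_cancel - first_cancel) // timedelta(microseconds=1)

# Poll interval of the test's admission check, as stated in this comment.
POLL_SLEEP_US = 50000

print(gap_us)                  # 3341
print(gap_us < POLL_SLEEP_US)  # True
```

The gap is far shorter than a single poll sleep, which is what makes the
second Cancel() surprising.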
The second Cancel() therefore came only 3341 microseconds after the first. In
the test, the thread that runs this query periodically checks whether the
admission phase has passed, sleeping for 50000 microseconds between checks.
The test log also shows that the thread never got past
client.wait_for_admission_control(self.query_handle), which means it never had
a chance to wake up and finish handling the query, so it should still have been
holding the thread lock. Yet teardown() somehow acquired that thread's lock at
the same time and tried to cancel the query. I am not sure how this is possible
unless Python has a bug in its thread synchronization primitives.
> TestAdmissionControllerStress.test_mem_limit teardown() fails with "Invalid
> or unknown query handle"
> ----------------------------------------------------------------------------------------------------
>
> Key: IMPALA-8202
> URL: https://issues.apache.org/jira/browse/IMPALA-8202
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.2.0
> Reporter: Andrew Sherman
> Assignee: Bikramjeet Vig
> Priority: Critical
>
> teardown() attempts to close each submission thread that was used. But one of
> them times out.
> {quote}
> 06:05:22 ERROR at teardown of
> TestAdmissionControllerStress.test_mem_limit[num_queries: 50 | protocol:
> beeswax | table_format: text/none | exec_option: {'batch_size': 0,
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen':
> False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} |
> submission_delay_ms: 50 | round_robin_submission: True]
> 06:05:22 custom_cluster/test_admission_controller.py:1004: in teardown
> 06:05:22 client.cancel(thread.query_handle)
> 06:05:22 common/impala_connection.py:183: in cancel
> 06:05:22 return
> self.__beeswax_client.cancel_query(operation_handle.get_handle())
> 06:05:22 beeswax/impala_beeswax.py:364: in cancel_query
> 06:05:22 return self.__do_rpc(lambda: self.imp_service.Cancel(query_id))
> 06:05:22 beeswax/impala_beeswax.py:512: in __do_rpc
> 06:05:22 raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> 06:05:22 E ImpalaBeeswaxException: ImpalaBeeswaxException:
> 06:05:22 E INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> 06:05:22 E MESSAGE: Invalid or unknown query handle
> {quote}