[ https://issues.apache.org/jira/browse/IMPALA-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800987#comment-16800987 ]
Joe McDonnell commented on IMPALA-8322:
---------------------------------------
Tracing from s3 run with [https://gerrit.cloudera.org/#/c/12832/] :
{noformat}
I0325 11:50:19.594966 31829 krpc-data-stream-mgr.cc:362] Sender 127.0.0.1 timed
out waiting for receiver fragment instance: 474a38fc210026e0:d8d388d500000000,
dest node: 2
I0325 11:50:19.595325 31829 rpcz_store.cc:265] Call
impala.DataStreamService.TransmitData from 127.0.0.1:52770 (request call id
9447) took 125425ms. Request Metrics: {"find_receiver_us":4}
I0325 11:50:19.595376 31829 krpc-data-stream-mgr.cc:362] Sender 127.0.0.1 timed
out waiting for receiver fragment instance: 474a38fc210026e0:d8d388d500000000,
dest node: 2
I0325 11:50:19.595402 31829 rpcz_store.cc:265] Call
impala.DataStreamService.TransmitData from 127.0.0.1:52784 (request call id
25823) took 125424ms. Request Metrics: {"find_receiver_us":10}
I0325 11:50:19.595419 31829 krpc-data-stream-mgr.cc:362] Sender 127.0.0.1 timed
out waiting for receiver fragment instance: 054bc34a530449e3:66124dd800000000,
dest node: 2
I0325 11:50:19.595434 31829 rpcz_store.cc:265] Call
impala.DataStreamService.TransmitData from 127.0.0.1:52770 (request call id
9446) took 125425ms. Request Metrics: {"find_receiver_us":14}
I0325 11:50:19.595453 31829 krpc-data-stream-mgr.cc:362] Sender 127.0.0.1 timed
out waiting for receiver fragment instance: 054bc34a530449e3:66124dd800000000,
dest node: 2
I0325 11:50:19.595468 31829 rpcz_store.cc:265] Call
impala.DataStreamService.TransmitData from 127.0.0.1:52784 (request call id
25822) took 125425ms. Request Metrics: {"find_receiver_us":7}
...
W0325 11:50:37.711663 31807 rpcz_store.cc:251] Call
impala.ControlService.CancelQueryFInstances from 127.0.0.1:53656 (request call
id 183) took 18114ms (client timeout 10000).
...
W0325 11:50:37.712100 31807 rpcz_store.cc:255] Trace:
0325 11:50:19.596826 (+ 0us) impala-service-pool.cc:165] Inserting onto call
queue
0325 11:50:19.596841 (+ 15us) impala-service-pool.cc:245] Handling call
0325 11:50:19.596879 (+ 38us) query-exec-mgr.cc:99] Found query
054bc34a530449e3:66124dd800000000
0325 11:50:37.711617 (+18114738us) query-state.cc:657] Cancelling
FragmentInstanceStates...
0325 11:50:37.711623 (+ 6us) krpc-data-stream-mgr.cc:328] Cancelling stream
KrpcDataStreamMgr 054bc34a530449e3:66124dd800000000
0325 11:50:37.711640 (+ 17us) krpc-data-stream-recvr.cc:591] Cancelling stream
fragment_instance_id=054bc34a530449e3:66124dd800000000 node_id=2
0325 11:50:37.711652 (+ 12us) inbound_call.cc:162] Queueing success response
Metrics: {"KrpcDataStreamMgr::lock_us":0,"WaitForPrepare_us":18114730}{noformat}
The tests are still running, but I'm working on getting the logs.
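
The trace above can also be read mechanically: each event carries the elapsed time since the previous event, so the largest delta marks where the RPC handler stalled. Below is a minimal sketch that does this, assuming the trace text has been copied out of the log as shown (including the mail-style line wrapping). The names parse_trace and slowest_step are hypothetical helpers written for this illustration; they are not part of Impala or kudu-rpc.

```python
import re

# One trace event: "MMDD HH:MM:SS.uuuuuu (+ <delta>us) <file>:<line>] <message>"
TRACE_LINE = re.compile(
    r"^(?P<ts>\d{4} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\(\+\s*(?P<delta_us>\d+)us\) "
    r"(?P<src>\S+)\] (?P<msg>.*)$"
)


def parse_trace(text):
    """Parse rpcz trace text into a list of event dicts (hypothetical helper)."""
    events = []
    for raw in text.splitlines():
        line = raw.strip()
        m = TRACE_LINE.match(line)
        if m:
            events.append({
                "ts": m["ts"],
                "delta_us": int(m["delta_us"]),
                "src": m["src"],
                "msg": m["msg"],
            })
        elif events and line and not line.startswith("Metrics:"):
            # A wrapped continuation line belongs to the previous event.
            events[-1]["msg"] += " " + line
    return events


def slowest_step(events):
    """Return the event with the largest gap since the preceding event."""
    return max(events, key=lambda e: e["delta_us"])


# Trace copied from the CancelQueryFInstances call above.
SAMPLE = """\
0325 11:50:19.596826 (+ 0us) impala-service-pool.cc:165] Inserting onto call
queue
0325 11:50:19.596841 (+ 15us) impala-service-pool.cc:245] Handling call
0325 11:50:19.596879 (+ 38us) query-exec-mgr.cc:99] Found query
054bc34a530449e3:66124dd800000000
0325 11:50:37.711617 (+18114738us) query-state.cc:657] Cancelling
FragmentInstanceStates...
0325 11:50:37.711623 (+ 6us) krpc-data-stream-mgr.cc:328] Cancelling stream
KrpcDataStreamMgr 054bc34a530449e3:66124dd800000000
0325 11:50:37.711640 (+ 17us) krpc-data-stream-recvr.cc:591] Cancelling stream
fragment_instance_id=054bc34a530449e3:66124dd800000000 node_id=2
0325 11:50:37.711652 (+ 12us) inbound_call.cc:162] Queueing success response
"""

if __name__ == "__main__":
    step = slowest_step(parse_trace(SAMPLE))
    print(step["src"], step["delta_us"], step["msg"])
```

For this trace, the largest delta falls between "Found query" and "Cancelling FragmentInstanceStates...", which is consistent with the WaitForPrepare_us value in the call's metrics.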
> S3 tests encounter "timed out waiting for receiver fragment instance"
> ---------------------------------------------------------------------
>
> Key: IMPALA-8322
> URL: https://issues.apache.org/jira/browse/IMPALA-8322
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.3.0
> Reporter: Joe McDonnell
> Priority: Blocker
> Labels: broken-build
> Attachments: run_tests_swimlane.json.gz
>
>
> This has been seen multiple times when running s3 tests:
> {noformat}
> query_test/test_join_queries.py:57: in test_basic_joins
> self.run_test_case('QueryTest/joins', new_vector)
> common/impala_test_suite.py:472: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:699: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:174: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:183: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:360: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:381: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E Query aborted:Sender 127.0.0.1 timed out waiting for receiver fragment
> instance: 6c40d992bb87af2f:0ce96e5d00000007, dest node: 4{noformat}
> This is related to IMPALA-6818. On a bad run, there are various timeouts in
> the impalad logs:
> {noformat}
> I0316 10:47:16.359313 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1
> timed out waiting for receiver fragment instance:
> ef4a5dc32a6565bd:a8720b8500000007, dest node: 5
> I0316 10:47:16.359345 20175 rpcz_store.cc:265] Call
> impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id
> 14881) took 120182ms. Request Metrics: {}
> I0316 10:47:16.359380 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1
> timed out waiting for receiver fragment instance:
> d148d83e11a4603d:54dc35f700000004, dest node: 3
> I0316 10:47:16.359395 20175 rpcz_store.cc:265] Call
> impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id
> 14880) took 123097ms. Request Metrics: {}
> ... various messages ...
> I0316 10:47:56.364990 20154 kudu-util.h:108] Cancel() RPC failed: Timed out:
> CancelQueryFInstances RPC to 127.0.0.1:27000 timed out after 10.000s (SENT)
> ... various messages ...
> W0316 10:48:15.056421 20150 rpcz_store.cc:251] Call
> impala.ControlService.CancelQueryFInstances from 127.0.0.1:40912 (request
> call id 202) took 48695ms (client timeout 10000).
> W0316 10:48:15.056473 20150 rpcz_store.cc:255] Trace:
> 0316 10:47:26.361265 (+ 0us) impala-service-pool.cc:165] Inserting onto call
> queue
> 0316 10:47:26.361285 (+ 20us) impala-service-pool.cc:245] Handling call
> 0316 10:48:15.056398 (+48695113us) inbound_call.cc:162] Queueing success
> response
> Metrics: {}
> I0316 10:48:15.057087 20139 connection.cc:584] Got response to call id 202
> after client already timed out or cancelled{noformat}
> So far, this has only happened on s3. The system load at the time is not
> higher than normal. If anything, it is lower than normal.