Sahil Takiar created IMPALA-10139:
-------------------------------------
Summary: Slow RPC logs can be misleading
Key: IMPALA-10139
URL: https://issues.apache.org/jira/browse/IMPALA-10139
Project: IMPALA
Issue Type: Improvement
Reporter: Sahil Takiar
The slow RPC logs added in IMPALA-9128 are based on the total time taken to
successfully complete an RPC. The issue is that there are many reasons why an
RPC might take a long time to complete. An RPC is considered complete only when
the receiver has processed that RPC.
The problem is that, due to the client-driven back-pressure mechanism, it is
entirely possible that the receiver does not process an incoming RPC because
{{KrpcDataStreamRecvr::SenderQueue::GetBatch}} just hasn't been called yet
(it is called indirectly by {{ExchangeNode::GetNext}}).
This can lead to a flood of slow RPC logs, even though the RPCs might not
actually be slow themselves. What is worse is that, because of the
back-pressure mechanism, slowness from the client (e.g. Hue users) will
propagate across all nodes involved in the query.
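
To illustrate the distinction described above, here is a minimal Python sketch of a deliberately simplified model. It is not Impala's code. The class {{RpcTiming}}, its fields, the function {{classify}}, and the 100 ms threshold are all hypothetical names and values chosen for this example. It shows how an RPC whose total time exceeds a threshold can have spent almost all of that time waiting for the consumer to pull the batch.

```python
from dataclasses import dataclass

# Hypothetical threshold for this sketch; not taken from Impala's configuration.
SLOW_RPC_THRESHOLD_MS = 100


@dataclass
class RpcTiming:
    """Timestamps (ms) for one row-batch RPC. All names are hypothetical."""
    sent_at_ms: int       # sender issues the RPC
    enqueued_at_ms: int   # batch arrives and waits at the receiver
    dequeued_at_ms: int   # consumer pulls the batch (the GetBatch step)
    replied_at_ms: int    # receiver responds, completing the RPC

    @property
    def total_ms(self) -> int:
        # What a log based only on total completion time would report.
        return self.replied_at_ms - self.sent_at_ms

    @property
    def queue_wait_ms(self) -> int:
        # Time spent waiting for the consumer, i.e. back-pressure.
        return self.dequeued_at_ms - self.enqueued_at_ms

    @property
    def active_ms(self) -> int:
        # Everything except the wait: transfer plus processing.
        return self.total_ms - self.queue_wait_ms


def classify(t: RpcTiming, threshold_ms: int = SLOW_RPC_THRESHOLD_MS) -> str:
    """Label an RPC by where its time went rather than by total time alone."""
    if t.total_ms < threshold_ms:
        return "ok"
    if t.active_ms < threshold_ms:
        return "slow: waiting on consumer"
    return "slow: transfer or processing"


# The consumer did not pull the batch for 2 seconds; the RPC itself took 10 ms.
stalled_consumer = RpcTiming(0, 5, 2005, 2010)
# The batch took 500 ms to arrive and was pulled almost immediately.
slow_transfer = RpcTiming(0, 500, 505, 510)

print(classify(stalled_consumer))
print(classify(slow_transfer))
```

In this model both RPCs exceed the threshold on total time, but only the second one is slow for reasons attributable to the RPC itself.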