[
https://issues.apache.org/jira/browse/IMPALA-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190295#comment-17190295
]
Sahil Takiar commented on IMPALA-10139:
---------------------------------------
I think there is a similar issue with the TRACE logs. In the example TRACE
above, the RPC spent the majority of its time in the deferred state, i.e.
there were not enough resources to process the RPC. Again, this just means that
the back-pressure mechanism was kicking in, not necessarily that the network
was slow.
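
To illustrate the distinction, here is a minimal Python sketch (not Impala
code) of how a slow RPC could be attributed once the time spent deferred is
known. The names RpcTiming and classify_slow_rpc, and the 2000 ms threshold,
are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class RpcTiming:
    """Hypothetical timing breakdown for one RPC (not an Impala structure)."""
    total_ms: float     # time from send until the receiver finished processing
    deferred_ms: float  # portion of total_ms spent in the deferred state


def classify_slow_rpc(t: RpcTiming, slow_threshold_ms: float = 2000.0) -> str:
    """Attribute a slow RPC either to back-pressure or to the RPC path itself.

    The threshold is an assumed value, not an Impala default.
    """
    if t.total_ms < slow_threshold_ms:
        return "ok"
    # Time not spent deferred approximates network plus processing time.
    non_deferred_ms = t.total_ms - t.deferred_ms
    if non_deferred_ms < slow_threshold_ms:
        return "slow: back-pressure"
    return "slow: network or processing"
```

Under these assumptions, an RPC that took 5000 ms in total but was deferred
for 4800 ms would be attributed to back-pressure rather than to the network.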
> Slow RPC logs can be misleading
> -------------------------------
>
> Key: IMPALA-10139
> URL: https://issues.apache.org/jira/browse/IMPALA-10139
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Sahil Takiar
> Priority: Major
>
> The slow RPC logs added in IMPALA-9128 are based on the total time taken to
> successfully complete an RPC. The issue is that there are many reasons why an
> RPC might take a long time to complete. An RPC is considered complete only
> when the receiver has processed that RPC.
> The problem is that due to the client-driven back-pressure mechanism, it is
> entirely possible that the receiver does not process an RPC
> because {{KrpcDataStreamRecvr::SenderQueue::GetBatch}} just hasn't been
> called yet (it is called indirectly by {{ExchangeNode::GetNext}}).
> This can lead to a flood of slow RPC logs, even though the RPCs might not
> actually be slow themselves. What is worse is that, because of the
> back-pressure mechanism, slowness from the client (e.g. Hue users) will
> propagate across all nodes involved in the query.