[ 
https://issues.apache.org/jira/browse/IMPALA-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499838#comment-17499838
 ] 

ASF subversion and git services commented on IMPALA-10049:
----------------------------------------------------------

Commit 719d3688950e9c22811ca2bb619b616f89679593 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=719d368 ]

IMPALA-10049: Include RPC call_id in slow RPC logs

KRPC log slow RPC trace in the receiver side. The trace log has the
call_id info that matches with the sender. However, our slow RPC logging
in the sender side does not log this call_id. It is hard to associate
the slow RPC logs between sender and receiver.

With the recent KRPC rebase in IMPALA-10931, we can now log the call_id
on the sender side.

Testing:
I tested this with a low threshold and delays added (the same as we did
in IMPALA-9128):

  start-impala-cluster.py \
      --impalad_args=--impala_slow_rpc_threshold_ms=1 \
      --impalad_args=--debug_actions=END_DATA_STREAM_DELAY:JITTER@[email protected]

The following is how the logs look like on the sender and receiver sides:

impalad_node1.INFO (sender):
I0217 10:29:36.278754  6606 krpc-data-stream-sender.cc:394] Slow TransmitData 
RPC (request call id 414) to 127.0.0.1:27002 
(fragment_instance_id=d8453c2785c38df4:3473e28b00000041): took 343.279ms. 
Receiver time: 342.780ms Network time: 498.405us

impalad_node2.INFO (receiver):
I0217 10:29:36.278379  6775 rpcz_store.cc:269] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:39702 (request call id 
414) took 342ms. Trace:
I0217 10:29:36.278479  6775 rpcz_store.cc:270] 0217 10:29:35.935586 (+     0us) 
impala-service-pool.cc:179] Inserting onto call queue
0217 10:29:36.277730 (+342144us) impala-service-pool.cc:278] Handling call
0217 10:29:36.277859 (+   129us) krpc-data-stream-recvr.cc:397] Deserializing 
batch
0217 10:29:36.278330 (+   471us) krpc-data-stream-recvr.cc:424] Enqueuing 
deserialized batch
0217 10:29:36.278369 (+    39us) inbound_call.cc:171] Queueing success response
Metrics: {}

Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Reviewed-on: http://gerrit.cloudera.org:8080/18243
Reviewed-by: Wenzhe Zhou <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Include RPC call_id in slow RPC logs
> ------------------------------------
>
>                 Key: IMPALA-10049
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10049
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>            Reporter: Sahil Takiar
>            Assignee: Riza Suminto
>            Priority: Major
>              Labels: observability
>
> The current code for logging slow RPCs on the sender side looks something 
> like this:
> {code:java}
> template <typename ResponsePBType>
> void KrpcDataStreamSender::Channel::LogSlowRpc(
>   ¦ const char* rpc_name, int64_t total_time_ns, const ResponsePBType& resp) {
>   int64_t network_time_ns = total_time_ns - resp_.receiver_latency_ns();
>   LOG(INFO) << "Slow " << rpc_name << " RPC to " << address_
>   ¦ ¦ ¦ ¦ ¦ << " (fragment_instance_id=" << PrintId(fragment_instance_id_) << 
> "): "
>   ¦ ¦ ¦ ¦ ¦ << "took " << PrettyPrinter::Print(total_time_ns, TUnit::TIME_NS) 
> << ". "
>   ¦ ¦ ¦ ¦ ¦ << "Receiver time: "
>   ¦ ¦ ¦ ¦ ¦ << PrettyPrinter::Print(resp_.receiver_latency_ns(), 
> TUnit::TIME_NS)
>   ¦ ¦ ¦ ¦ ¦ << " Network time: " << PrettyPrinter::Print(network_time_ns, 
> TUnit::TIME_NS);
> }
> void KrpcDataStreamSender::Channel::LogSlowFailedRpc(
>   ¦ const char* rpc_name, int64_t total_time_ns, const kudu::Status& err) {
>   LOG(INFO) << "Slow " << rpc_name << " RPC to " << address_
>   ¦ ¦ ¦ ¦ ¦ << " (fragment_instance_id=" << PrintId(fragment_instance_id_) << 
> "): "
>   ¦ ¦ ¦ ¦ ¦ << "took " << PrettyPrinter::Print(total_time_ns, TUnit::TIME_NS) 
> << ". "
>   ¦ ¦ ¦ ¦ ¦ << "Error: " << err.ToString();
> } {code}
> It would be nice to include the call_id in the logs as well so that RPCs can 
> more easily be traced. The RPC call_id is dumped in RPC traces on the 
> receiver side, as well as in the /rpcz output on the debug ui.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to