[
https://issues.apache.org/jira/browse/HDFS-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739801#comment-17739801
]
ASF GitHub Bot commented on HDFS-17042:
---------------------------------------
xinglin opened a new pull request, #5804:
URL: https://github.com/apache/hadoop/pull/5804
…
### Description of PR
Backport of HDFS-17042 from trunk to branch-3.3. The cherry-pick was almost
clean, with one small conflict in RpcMetrics.java: the _rpcRequeueCalls_ metric
has not been backported to branch-3.3 yet, so it is removed from this backport.
### How was this patch tested?
```
mvn test -Dtest=TestRPC,TestMutableMetrics,TestProtoBufRpc
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.metrics2.lib.TestMutableMetrics
[INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 31.221 s - in org.apache.hadoop.metrics2.lib.TestMutableMetrics
[INFO] Running org.apache.hadoop.ipc.TestRPC
[INFO] Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.383 s - in org.apache.hadoop.ipc.TestRPC
[INFO] Running org.apache.hadoop.ipc.TestProtoBufRpc
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 8, Time elapsed: 7.261 s - in org.apache.hadoop.ipc.TestProtoBufRpc
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 63, Failures: 0, Errors: 0, Skipped: 8
```
> Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode
> ----------------------------------------------------------------------------
>
> Key: HDFS-17042
> URL: https://issues.apache.org/jira/browse/HDFS-17042
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.4.0, 3.3.9
> Reporter: Xing Lin
> Assignee: Xing Lin
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
>
> We'd like to add two new types of metrics to the existing NN
> RpcMetrics/RpcDetailedMetrics. These two metrics can then be used as part of
> SLA/SLO for the HDFS service.
> * _RpcCallSuccesses_: measures the number of RPC requests that are
> successfully processed by a NN (e.g., with a response whose RpcStatus is
> _RpcStatusProto.SUCCESS_). Together with _RpcQueueNumOps_ (which refers to
> the total number of RPC requests), we can derive the RpcErrorRate for our
> NN as (RpcQueueNumOps - RpcCallSuccesses) / RpcQueueNumOps.
> * OverallRpcProcessingTime for each RPC method: this metric measures the
> overall RPC processing time for each RPC method at the NN. It covers the time
> from when a request arrives at the NN to when a response is sent back. We are
> already emitting processingTime for each RPC method today in
> RpcDetailedMetrics. We want to extend it to emit overallRpcProcessingTime for
> each RPC method, which includes enqueueTime, queueTime, processingTime,
> responseTime, and handlerTime.
>
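The two derived figures described in the issue can be sketched in a few lines of Java. This is only an illustration: `RpcSloSketch` and its method names are hypothetical and not part of Hadoop, the inputs are assumed to be metric values already read from RpcMetrics/RpcDetailedMetrics, and treating the overall time as a plain sum of the five listed components is an assumption rather than the patch's confirmed implementation.

```java
// Hypothetical helper, not part of Hadoop. It works on plain numbers that a
// monitoring job is assumed to have read from the NN metrics already.
final class RpcSloSketch {
  private RpcSloSketch() {}

  /**
   * RpcErrorRate = (RpcQueueNumOps - RpcCallSuccesses) / RpcQueueNumOps.
   * Returns 0.0 when no requests were observed, to avoid dividing by zero.
   */
  static double rpcErrorRate(long rpcQueueNumOps, long rpcCallSuccesses) {
    if (rpcQueueNumOps <= 0) {
      return 0.0;
    }
    return (double) (rpcQueueNumOps - rpcCallSuccesses) / rpcQueueNumOps;
  }

  /**
   * Assumed composition: a simple sum of the five components named in the
   * issue. All arguments must use the same time unit.
   */
  static long overallRpcProcessingTime(long enqueueTime, long queueTime,
      long processingTime, long responseTime, long handlerTime) {
    return enqueueTime + queueTime + processingTime + responseTime + handlerTime;
  }

  public static void main(String[] args) {
    // Made-up sample values, for illustration only.
    System.out.println(rpcErrorRate(1000, 990));
    System.out.println(overallRpcProcessingTime(1, 2, 3, 4, 5));
  }
}
```

An SLO check could then compare the returned error rate against a target threshold over a fixed reporting window.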
--
This message was sent by Atlassian Jira
(v8.20.10#820010)