[ https://issues.apache.org/jira/browse/HDFS-17042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDFS-17042: ---------------------------------- Labels: pull-request-available (was: ) > Add rpcCallSuccesses and OverallRpcProcessingTime to RpcMetrics for Namenode > ---------------------------------------------------------------------------- > > Key: HDFS-17042 > URL: https://issues.apache.org/jira/browse/HDFS-17042 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs > Affects Versions: 3.4.0, 3.3.9 > Reporter: Xing Lin > Assignee: Xing Lin > Priority: Major > Labels: pull-request-available > > We'd like to add two new types of metrics to the existing NN > RpcMetrics/RpcDetailedMetrics. These two metrics can then be used as part of > SLA/SLO for the HDFS service. > * {_}RpcCallSuccesses{_}: it measures the number of RPC requests where they > are successfully processed by a NN (e.g., with a response with an RpcStatus > {_}RpcStatusProto.SUCCESS){_}{_}.{_} Then, together with {_}RpcQueueNumOps > ({_}which refers the total number of RPC requests{_}){_}, we can derive the > RpcErrorRate for our NN, as (RpcQueueNumOps - RpcCallSuccesses) / > RpcQueueNumOps. > * OverallRpcProcessingTime for each RPC method: this metric measures the > overall RPC processing time for each RPC method at the NN. It covers the time > from when a request arrives at the NN to when a response is sent back. We are > already emitting processingTime for each RPC method today in > RpcDetailedMetrics. We want to extend it to emit overallRpcProcessingTime for > each RPC method, which includes enqueueTime, queueTime, processingTime, > responseTime, and handlerTime. > -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org