[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient

Pranay Singh (JIRA) Mon, 03 Dec 2018 16:23:40 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708010#comment-16708010
 ]


Pranay Singh commented on HDFS-14084:
-------------------------------------

[[email protected]] I went through the code and found that StorageStatistics is 
already implemented in DistributedFileSystem.java. The problem with 
StorageStatistics is that it records only the frequency of a particular 
operation and not its latency. Although the stats are collected, they appear 
to be used only by test programs such as TestDistributedFileSystem.java and 
StorageStatisticsTracker.java, perhaps for verification purposes. The other 
problem is how to display the stats: I tried printing them to the log through 
the Iterator, but that is too verbose.
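
To illustrate the gap described above, here is a minimal, self-contained 
sketch of a per-operation statistic that keeps the call count together with 
the cumulative latency and prints one compact line per operation. It is not 
Hadoop code: the class OpLatencyStats and all of its methods are hypothetical 
names chosen for this example, and the operation names are only sample keys.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper (not part of Hadoop): call count plus total latency per operation. */
class OpLatencyStats {
  /** Counters for a single operation name. */
  private static final class Entry {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
  }

  private final Map<String, Entry> entries = new ConcurrentHashMap<>();

  /** Record one completed call of the given operation. */
  void record(String op, long elapsedNanos) {
    Entry e = entries.computeIfAbsent(op, k -> new Entry());
    e.count.increment();
    e.totalNanos.add(elapsedNanos);
  }

  /** Number of recorded calls, 0 if the operation was never seen. */
  long count(String op) {
    Entry e = entries.get(op);
    return e == null ? 0 : e.count.sum();
  }

  /** Mean latency in milliseconds, 0.0 if the operation was never seen. */
  double meanMillis(String op) {
    Entry e = entries.get(op);
    long n = e == null ? 0 : e.count.sum();
    return n == 0 ? 0.0 : e.totalNanos.sum() / 1e6 / n;
  }

  /** One line per operation, sorted by name, to keep log output short. */
  String summary() {
    StringBuilder sb = new StringBuilder();
    for (String op : new TreeMap<>(entries).keySet()) {
      sb.append(String.format(Locale.ROOT, "%s count=%d mean=%.3fms%n",
          op, count(op), meanMillis(op)));
    }
    return sb.toString();
  }
}
```

A caller would time an operation with System.nanoTime() and pass the elapsed 
value to record(); summary() could then be logged periodically rather than 
dumping every counter.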

[~elgoiri] I saw the usage of rpcMetrics (RpcMetrics) in Server.java, which 
has a function updateMetrics() that records the RPC latency via 
addRpcProcessingTime(), which is called in ProtobufRpcEngine.java at the end 
of the RPC call (server side). Perhaps a similar implementation can be done on 
the client side, around the calls to the protobuf RPC methods. Please let me 
know your thoughts on that. Such an implementation would use the generic 
Metric interface for publishing the results and hence would be more usable.
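
As a rough sketch of what client-side timing around RPC calls might look 
like, the example below wraps an interface in a java.lang.reflect.Proxy and 
records the count and elapsed time of every call. It deliberately does not 
use Hadoop's RpcMetrics or the metrics classes: NamenodeLike, 
TimingInvocationHandler and the in-memory maps are hypothetical stand-ins, 
and a real change would publish through the existing metrics interfaces 
instead.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a client-side protocol interface. */
interface NamenodeLike {
  boolean mkdirs(String path);
}

/** Hypothetical handler that times every call made through the proxy. */
class TimingInvocationHandler implements InvocationHandler {
  private final Object target;
  // Per-method call counts and cumulative elapsed nanoseconds.
  final Map<String, LongAdder> calls = new ConcurrentHashMap<>();
  final Map<String, LongAdder> nanos = new ConcurrentHashMap<>();

  TimingInvocationHandler(Object target) {
    this.target = target;
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    long start = System.nanoTime();
    try {
      return method.invoke(target, args);
    } catch (InvocationTargetException e) {
      // Rethrow the exception thrown by the wrapped call itself.
      throw e.getCause();
    } finally {
      // Failed calls are counted and timed as well.
      long elapsed = System.nanoTime() - start;
      calls.computeIfAbsent(method.getName(), k -> new LongAdder()).increment();
      nanos.computeIfAbsent(method.getName(), k -> new LongAdder()).add(elapsed);
    }
  }

  /** Create a proxy of the given interface whose calls go through the handler. */
  static <T> T wrap(Class<T> iface, TimingInvocationHandler handler) {
    return iface.cast(Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] {iface}, handler));
  }
}
```

The timing sits in a finally block so that slow failures are measured too, 
which matters when the goal is to find calls that hang or time out.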

 

> Need for more stats in DFSClient
> --------------------------------
>
>                 Key: HDFS-14084
>                 URL: https://issues.apache.org/jira/browse/HDFS-14084
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Pranay Singh
>            Assignee: Pranay Singh
>            Priority: Minor
>         Attachments: HDFS-14084.001.patch
>
>
> The usage of HDFS has changed: it used to serve as a map-reduce filesystem, 
> and now it is becoming more of a general-purpose filesystem. In most cases 
> the issues are with the Namenode, so we have metrics to know the workload or 
> stress on the Namenode.
> However, there is a need to collect more statistics for different 
> operations/RPCs in DFSClient, to know which RPC operations are taking longer 
> or how frequently an operation is called. These statistics can 
> be exposed to the users of DFS Client, who can periodically log them or do 
> some sort of flow control if the response is slow. This will also help to 
> isolate HDFS issues in a mixed environment where, say, Spark, HBase and 
> Impala run together on a node. We can check the throughput of different 
> operations across clients and isolate problems caused by a noisy 
> neighbor, network congestion or a shared JVM.
> We have dealt with several problems from the field for which there is no 
> conclusive evidence as to what caused the problem. If we had metrics or stats 
> in DFSClient we would be better equipped to solve such complex problems.
> List of jiras for reference:
> -------------------------
>  HADOOP-15538 HADOOP-15530 ( client side deadlock)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
