[
https://issues.apache.org/jira/browse/HDFS-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772780#comment-16772780
]
Tao Jie commented on HDFS-14297:
--------------------------------
Thank you [~xkrogen]. In our environment, {{getContentSummary}} is invoked
from several peripheral systems, and not only for monitoring quotas. We can
replace {{getContentSummary}} with {{getQuotaUsage}} in some places, but I
still think we should make some improvement on the server side. If a new user
calls {{getContentSummary}} very frequently, it will put a heavy load on the
NameNode RPC server.
> Add cache for getContentSummary() result
> ----------------------------------------
>
> Key: HDFS-14297
> URL: https://issues.apache.org/jira/browse/HDFS-14297
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Tao Jie
> Priority: Major
>
> In a large HDFS cluster, calling {{getContentSummary}} on a directory with a
> large number of files is very expensive. In one cluster with more than
> 100 million files, a {{getContentSummary}} call may take more than 10s, and
> it holds the fsnamesystem lock for that whole time.
> In our cluster, several peripheral systems call {{getContentSummary}}
> periodically to monitor the status of directories. In most cases we do not
> need a very accurate result. We could keep a cache of contentSummary results
> in the NameNode, which would avoid repeated heavy requests within a given
> time span. We should also add restrictions to this cache:
> 1. its size should be limited and it should be LRU; 2. only the results of
> heavy requests should be added to it, e.g., those with an RPC time over 1000ms.
> We could create a new RPC method or add a flag to the current method, so that
> we do not change the current behavior and callers can choose between an
> accurate but expensive method and a fast but inaccurate one.
> Any thoughts?
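
To make the two restrictions in the description concrete, here is a rough sketch of such a cache in plain Java. It is not HDFS code and not a proposed patch: the class name HeavyResultCache, its methods, and all parameters are hypothetical. The time-to-live is an assumption about how "within a span" might be bounded, and the caller is assumed to measure the cost of each request itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in HDFS. */
final class HeavyResultCache<V> {
    /** A cached value plus the time it was stored, for the assumed TTL check. */
    private static final class Entry<V> {
        final V value;
        final long storedAtMs;

        Entry(V value, long storedAtMs) {
            this.value = value;
            this.storedAtMs = storedAtMs;
        }
    }

    private final long minCostMs; // only results costlier than this are admitted
    private final long ttlMs;     // assumed bound on how stale a result may be
    private final LinkedHashMap<String, Entry<V>> map;

    HeavyResultCache(final int maxEntries, long minCostMs, long ttlMs) {
        this.minCostMs = minCostMs;
        this.ttlMs = ttlMs;
        // accessOrder=true keeps entries ordered from least to most recently
        // used, so removeEldestEntry gives LRU eviction with a size limit.
        this.map = new LinkedHashMap<String, Entry<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Entry<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value for path, or null if absent or older than the TTL. */
    synchronized V get(String path, long nowMs) {
        Entry<V> e = map.get(path);
        if (e == null) {
            return null;
        }
        if (nowMs - e.storedAtMs > ttlMs) {
            map.remove(path);
            return null;
        }
        return e.value;
    }

    /** Stores the result only if computing it took longer than the threshold. */
    synchronized void offer(String path, V value, long costMs, long nowMs) {
        if (costMs > minCostMs) {
            map.put(path, new Entry<>(value, nowMs));
        }
    }

    synchronized int size() {
        return map.size();
    }
}
```

With this shape, a caller that accepts an inaccurate result would try get(path, now) first and fall back to the real computation on a miss, then call offer(...) with the measured cost. A caller that needs an accurate result would bypass the cache entirely, which matches the idea of a separate RPC method or flag.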