[ 
https://issues.apache.org/jira/browse/HDFS-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154877#comment-16154877
 ] 

Kai Zheng commented on HDFS-1068:
---------------------------------

Hi [~zhz],

Optimizing {{getFileInfo}} sounds great to me, since it can be a frequent 
call. Reusing the Java objects is a good idea, as you said. Could we go 
further and cache some results, returning them directly on a cache hit? I 
thought of this because {{getFileInfo}} is a read-only call and most files 
remain unchanged most of the time.
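
To make the caching idea concrete, here is a rough sketch only. The names {{CachedStatus}} and {{FileStatusCache}} and the loader function are hypothetical and not part of the patch or of HDFS. It assumes every operation that changes a file's metadata calls {{invalidate}}, and it ignores eviction and memory limits.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical immutable status snapshot; NOT the real HdfsFileStatus. */
final class CachedStatus {
    final String path;
    final long length;
    final long modificationTime;

    CachedStatus(String path, long length, long modificationTime) {
        this.path = path;
        this.length = length;
        this.modificationTime = modificationTime;
    }
}

/** Hypothetical read-through cache keyed by path (unbounded, for illustration). */
final class FileStatusCache {
    private final ConcurrentHashMap<String, CachedStatus> cache = new ConcurrentHashMap<>();
    // Assumed hook that builds a status from the namespace on a cache miss.
    private final Function<String, CachedStatus> loader;

    FileStatusCache(Function<String, CachedStatus> loader) {
        this.loader = loader;
    }

    /** Returns the cached status on a hit; otherwise loads it and caches it. */
    CachedStatus getFileInfo(String path) {
        return cache.computeIfAbsent(path, loader);
    }

    /** Must be called by any operation that changes the file's metadata. */
    void invalidate(String path) {
        cache.remove(path);
    }
}
```

The trade-off is that cached entries live long and can be promoted to the old generation, so whether this helps GC overall would need to be measured.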

> Reduce NameNode GC by reusing HdfsFileStatus objects in RPC handlers
> --------------------------------------------------------------------
>
>                 Key: HDFS-1068
>                 URL: https://issues.apache.org/jira/browse/HDFS-1068
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Hairong Kuang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-1068.00.patch, Screen Shot 2017-08-31 at 3.58.15 
> PM.png
>
>
> In our production clusters, getFileInfo is the most frequent operation that 
> hits the NameNode, and its frequency is highly correlated with GC behavior. 
> HDFS-946 has already reduced the amount of heap/CPU and the number of 
> temporary objects for each getFileInfo call. Yet another improvement is to 
> avoid creating an HdfsFileStatus object for each getFileInfo call. Instead, 
> each RPC handler can have a thread-local HdfsFileStatus object. Each 
> getFileInfo call simply sets values for all fields of the thread-local 
> HdfsFileStatus object. 
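
The thread-local reuse described in the issue could look roughly like the sketch below. This is an illustration under assumptions, not the attached patch: {{ReusableFileStatus}} and {{StatusHandler}} are hypothetical names, and the three fields are a small stand-in for the real HdfsFileStatus fields.

```java
/** Hypothetical mutable stand-in for HdfsFileStatus; the real class has more fields. */
final class ReusableFileStatus {
    long length;
    boolean isDir;
    long modificationTime;

    /** Overwrites every field so no value from a previous call leaks through. */
    ReusableFileStatus set(long length, boolean isDir, long modificationTime) {
        this.length = length;
        this.isDir = isDir;
        this.modificationTime = modificationTime;
        return this;
    }
}

/** Hypothetical handler-side helper: one reusable status object per thread. */
final class StatusHandler {
    private static final ThreadLocal<ReusableFileStatus> STATUS =
            ThreadLocal.withInitial(ReusableFileStatus::new);

    /**
     * Fills and returns the calling thread's status object instead of allocating.
     * The result is only valid until the next call on the same thread.
     */
    static ReusableFileStatus getFileInfo(long length, boolean isDir, long modificationTime) {
        return STATUS.get().set(length, isDir, modificationTime);
    }
}
```

The sketch only works if the handler finishes serializing the response before it serves the next call on the same thread, because the next call overwrites the same object.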



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

