[ 
https://issues.apache.org/jira/browse/HDFS-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149741#comment-16149741
 ] 

Zhe Zhang commented on HDFS-1068:
---------------------------------

Reviving this ticket because GC pressure from RPCs is a major contributor to 
NameNode slowness; see the comparison of GC workload between the active and 
standby NameNodes on the same cluster in the attached figure.

Also renaming it to reflect that the optimization should apply to both 
{{getFileInfo}} and {{listStatus}}.

Sorry to bug existing watchers of the ticket. Also pinging [~andrew.wang] 
[~drankye] [~daryn] on the idea of reusing {{HdfsFileStatus}} objects in RPC 
handler threads.

> Reduce NameNode GC by reusing HdfsFileStatus objects in RPC handlers
> --------------------------------------------------------------------
>
>                 Key: HDFS-1068
>                 URL: https://issues.apache.org/jira/browse/HDFS-1068
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Hairong Kuang
>            Assignee: Zhe Zhang
>         Attachments: Screen Shot 2017-08-31 at 3.58.15 PM.png
>
>
> In our production clusters, getFileInfo is the most frequent operation that 
> hits the NameNode, and its frequency is highly correlated with GC behavior. 
> HDFS-946 has already reduced the heap and CPU usage and the number of 
> temporary objects for each getFileInfo call. A further improvement is to 
> avoid creating an HdfsFileStatus object for each getFileInfo call. Instead, 
> each RPC handler can hold a thread-local HdfsFileStatus object, and each 
> getFileInfo call simply sets the values of all fields of that thread-local 
> object.
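
The thread-local reuse described above could look roughly like the sketch below. It is only an illustration of the idea, not the actual patch: MutableFileStatus is a hypothetical mutable stand-in for HdfsFileStatus, and the getFileInfo signature, the field set, and the ALLOCATIONS counter are assumptions made so the example compiles without Hadoop on the classpath.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadLocalStatusSketch {

    /** Hypothetical mutable stand-in for HdfsFileStatus; not the real Hadoop class. */
    static final class MutableFileStatus {
        String path;
        long length;
        boolean isDir;
        long modificationTime;

        /** Overwrites every field so no state leaks from the previous call. */
        void set(String path, long length, boolean isDir, long modificationTime) {
            this.path = path;
            this.length = length;
            this.isDir = isDir;
            this.modificationTime = modificationTime;
        }
    }

    /** Counts how many status objects were actually allocated (illustration only). */
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    /** One status object per handler thread, created lazily on first use. */
    private static final ThreadLocal<MutableFileStatus> STATUS =
        ThreadLocal.withInitial(() -> {
            ALLOCATIONS.incrementAndGet();
            return new MutableFileStatus();
        });

    /** Assumed shape of a handler-side call: refill the thread's object instead of allocating. */
    static MutableFileStatus getFileInfo(String path, long length, boolean isDir, long mtime) {
        MutableFileStatus status = STATUS.get();
        status.set(path, length, isDir, mtime);
        return status;
    }

    public static void main(String[] args) {
        MutableFileStatus a = getFileInfo("/a", 10L, false, 1L);
        MutableFileStatus b = getFileInfo("/b", 20L, true, 2L);
        System.out.println(a == b);            // prints true: same object reused
        System.out.println(ALLOCATIONS.get()); // prints 1: one allocation on this thread
    }
}
```

One consequence worth noting: the returned object is only valid until the next call on the same thread, so in this sketch a handler would need to finish serializing the response before it serves another request.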



