[ 
https://issues.apache.org/jira/browse/HDFS-14276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768627#comment-16768627
 ] 

Erik Krogen commented on HDFS-14276:
------------------------------------

Hey [~jojochuang], great find, thanks for reporting. Do you have new profiler 
results after your patch?

More generally, regarding high CPU usage at idle, I have one idea for reducing 
it. When a cluster is heavily loaded, it makes sense to set the edit tail 
period as low as possible to keep the observer in sync with the active. But 
when a cluster is idle or lightly loaded, there is no need for this. Perhaps we 
can add some logic to decrease the edit tail frequency when the 
Observer/Standby gets empty responses from the JNs.
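
To make the idea a bit more concrete, here is a rough sketch of what such 
logic might look like. The class name, its parameters, and the doubling policy 
are all hypothetical; nothing here is taken from the attached patch or from 
the existing tailer code. It only assumes the tailer can tell how many edits 
the last attempt returned.

```java
/**
 * Hypothetical helper (not part of HDFS): chooses how long the edit log
 * tailer should sleep before its next attempt, based on whether the last
 * attempt returned any edits.
 */
final class AdaptiveTailPeriod {
  private final long minPeriodMs;      // assumed: the configured tail period, may be 0
  private final long initialBackoffMs; // assumed: first step up when minPeriodMs is 0
  private final long maxPeriodMs;      // assumed: upper bound on the sleep time
  private long currentPeriodMs;

  AdaptiveTailPeriod(long minPeriodMs, long initialBackoffMs, long maxPeriodMs) {
    if (minPeriodMs < 0 || initialBackoffMs <= 0 || maxPeriodMs < minPeriodMs) {
      throw new IllegalArgumentException("invalid tail period bounds");
    }
    this.minPeriodMs = minPeriodMs;
    this.initialBackoffMs = initialBackoffMs;
    this.maxPeriodMs = maxPeriodMs;
    this.currentPeriodMs = minPeriodMs;
  }

  /**
   * @param editsLoaded number of edits returned by the last tail attempt
   * @return milliseconds to sleep before the next attempt
   */
  long next(long editsLoaded) {
    if (editsLoaded > 0) {
      // The cluster is active: go back to tailing as fast as configured.
      currentPeriodMs = minPeriodMs;
    } else if (currentPeriodMs == 0) {
      // First empty response with a zero period: doubling 0 would stay at 0,
      // so start from a small non-zero step instead.
      currentPeriodMs = Math.min(initialBackoffMs, maxPeriodMs);
    } else if (currentPeriodMs > maxPeriodMs - currentPeriodMs) {
      // Doubling would exceed the cap (checked this way to avoid overflow).
      currentPeriodMs = maxPeriodMs;
    } else {
      // Another empty response: tail half as often.
      currentPeriodMs *= 2;
    }
    return currentPeriodMs;
  }
}
```

With bounds of (0, 1, 8), consecutive empty responses would yield sleeps of 
1, 2, 4, 8, 8 ms, and the first non-empty response would reset the sleep to 0. 
The trade-off is that the first edits after an idle stretch can be seen up to 
the maximum period late, so the cap would need to be chosen with the 
observer's staleness tolerance in mind.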

> [SBN read] Reduce tailing overhead
> ----------------------------------
>
>                 Key: HDFS-14276
>                 URL: https://issues.apache.org/jira/browse/HDFS-14276
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ha, namenode
>    Affects Versions: 3.3.0
>         Environment: Hardware: 4-node cluster, each node has 4 core, Xeon 
> 2.5Ghz, 25GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption.
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: HDFS-14276.000.patch, Screen Shot 2019-02-12 at 10.51.41 
> PM.png
>
>
> When Observer sets {{dfs.ha.tail-edits.period}} = {{0ms}}, it tails edit log 
> continuously in order to fetch the latest edits, but there is a lot of 
> overhead in doing so.
> Critically, edit log tailer should _not_ update NameDirSize metric every 
> time. It has nothing to do with fetching edits, and it involves lots of 
> directory space calculation.
> Profiler suggests a non-trivial chunk of time is spent for nothing.
> Other than this, the biggest overhead is in the communication to 
> serialize/deserialize messages to/from JNs. I am looking for ways to reduce 
> the cost because it's burning 30% of my CPU time even when the cluster is 
> idle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

