[ 
https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768629#comment-16768629
 ] 

Erik Krogen edited comment on HDFS-14277 at 2/14/19 7:18 PM:
-------------------------------------------------------------

Thanks for reporting this [~jojochuang]! Do you have a screenshot of the 
profiler showing issues with {{isCoordinatedCall()}}? This is an interesting 
hotspot...

I suspect that the results will be quite different under a workload that 
mixes write and read requests. In your read-only benchmark, the Active is 
doing no write work, but the Observer is essentially trying to do a bunch of 
writes (via edit tailing) for no reason. Under a real mixed workload, both 
Active and Observer should be doing write work.


was (Author: xkrogen):
Thanks for reporting this [~jojochuang]! Do you have a screenshot of the 
profiler showing issues with {{isCoordinatedCall()}}? This is an interesting 
hotspot...

> [SBN read] Observer benchmark results
> -------------------------------------
>
>                 Key: HDFS-14277
>                 URL: https://issues.apache.org/jira/browse/HDFS-14277
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: ha, namenode
>    Affects Versions: 3.3.0
>         Environment: Hardware: 4-node cluster; each node has 4 cores, Xeon 
> 2.5 GHz, 25 GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption, Cloudera Navigator.
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: observer RPC queue processing time.png
>
>
> Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled 
> cluster and would like to share the results with the community. The cluster 
> has 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
> <namenode>  -op fileStatus -threads 100 -files 1000000 -useExisting 
> -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observations:
>  * Due to the edit tailing overhead, the Observer node consumes 30% CPU 
> even when the cluster is idle.
>  * While the Active NN has less than 1 ms RPC processing time, the Observer 
> node has more than 5 ms. I am still looking for the source of the longer 
> processing time, which may be the cause of the performance degradation 
> relative to the Active NN. Note that the cluster has Cloudera Navigator 
> installed, which adds overhead to RPC processing time.
>  * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top 
> hotspots in the profiler. 
>  
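
For reference, the relative throughput gap implied by the two fileStatus tables in the issue description can be worked out with a few lines of arithmetic. This is only a minimal sketch: the ops/sec figures are copied from the tables, while the `RESULTS` dictionary and the `relative_drop` helper are hypothetical names introduced here for illustration and are not part of the benchmark tooling.

```python
# Throughput figures (ops/sec) copied from the tables in the issue description.
# The dictionary layout and function name are illustrative assumptions.
RESULTS = {
    "Kerberos, SSL, RPC encryption, Data Transfer Encryption": {
        "active": 4865,
        "observer": 3996,
    },
    "Kerberos, SSL": {
        "active": 7078,
        "observer": 6459,
    },
}


def relative_drop(active_ops, observer_ops):
    """Return the fraction by which Observer throughput falls short of Active."""
    return (active_ops - observer_ops) / active_ops


for config, ops in RESULTS.items():
    drop = relative_drop(ops["active"], ops["observer"])
    print(f"{config}: Observer is {drop:.1%} below Active")
```

With the reported numbers, the Observer comes out roughly 18% below the Active NameNode with all encryption enabled and roughly 9% below it with Kerberos and SSL only.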



