[
https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768629#comment-16768629
]
Erik Krogen edited comment on HDFS-14277 at 2/14/19 7:18 PM:
-------------------------------------------------------------
Thanks for reporting this [~jojochuang]! Do you have a screenshot of the
profiler showing issues with {{isCoordinatedCall()}}? This is an interesting
hotspot...
I suspect the results will be quite different under a workload that mixes
write and read requests. In your read-only benchmark, the Active is doing no
write work, while the Observer is still spending effort on write-side work
(via edit tailing) for no reason. Under a real mixed workload, both Active and
Observer should be doing write work.
> [SBN read] Observer benchmark results
> -------------------------------------
>
> Key: HDFS-14277
> URL: https://issues.apache.org/jira/browse/HDFS-14277
> Project: Hadoop HDFS
> Issue Type: Task
> Components: ha, namenode
> Affects Versions: 3.3.0
> Environment: Hardware: 4-node cluster, each node has 4 cores, Xeon
> 2.5 GHz, 25 GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL,
> RPC encryption + Data Transfer Encryption, Cloudera Navigator.
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: observer RPC queue processing time.png
>
>
> Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled
> cluster and would like to share the results with the community. The cluster
> has 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs <namenode> -op fileStatus -threads 100 -files 1000000 -useExisting -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observations:
> * Due to the edit tailing overhead, the Observer node consumes 30% CPU
> even when the cluster is idle.
> * While the Active NN has less than 1 ms RPC processing time, the Observer
> node has more than 5 ms. I am still looking for the source of the longer
> processing time, which may be the cause of the performance degradation
> relative to the Active NN. Note that the cluster has Cloudera Navigator
> installed, which adds overhead to RPC processing time.
> * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top
> hotspots in the profiler.
>
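As a rough reading of the two tables above, the relative gap between Active and Observer can be computed directly from the reported ops/sec. This is only a sketch in Python: the function name `relative_drop` and the dictionary layout are illustrative assumptions, not part of the benchmark tooling. Only the four throughput figures are taken from the report.

```python
# Throughput figures (ops/sec) copied from the benchmark tables in the report.
# The dictionary keys and structure are an illustrative choice.
RESULTS = {
    "Kerberos, SSL, RPC encryption, Data Transfer Encryption": {
        "active": 4865,
        "observer": 3996,
    },
    "Kerberos, SSL": {
        "active": 7078,
        "observer": 6459,
    },
}


def relative_drop(baseline: float, measured: float) -> float:
    """Return how far `measured` falls below `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - measured) / baseline * 100.0


if __name__ == "__main__":
    for config, ops in RESULTS.items():
        gap = relative_drop(ops["active"], ops["observer"])
        print(f"{config}: Observer is {gap:.1f}% below Active")
```

With the reported numbers, the Observer comes out roughly 18% below the Active with RPC and data transfer encryption enabled, and roughly 9% below with Kerberos and SSL only.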