[
https://issues.apache.org/jira/browse/HDFS-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297483#comment-16297483
]
Erik Krogen commented on HDFS-12943:
------------------------------------
We have been running some performance experiments (using
[Dynamometer|https://lists.apache.org/thread.html/7223d22fbc26e055369695f83395e9a7767043f7245af25df385b535@%3Chdfs-dev.hadoop.apache.org%3E])
to estimate how large the potential benefits of this feature are. Using the
tool, we replayed a few hours of traces from a production cluster against a
simulated NameNode, filtering out different percentages of read requests to
mimic the active NameNode's (ANN's) view of the workload when those reads go to
the standby instead. We filtered out 0%, 20%, 50%, and 100% of read requests,
and also replayed the write-only workload at 2x and 4x speed to estimate
throughput under ideal conditions (all reads offloaded).
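As a rough illustration of the read filtering described above, here is a minimal sketch. It is not Dynamometer's actual mechanism: the record layout, the `READ_OPS` set, and the `filter_reads` helper are all assumptions made for the example.

```python
import random

# Illustrative set of read-only operation names. This list is an assumption
# for the sketch, not the set used in the experiment.
READ_OPS = {"open", "getfileinfo", "listStatus", "contentSummary"}


def filter_reads(trace, skip_fraction, seed=0):
    """Drop roughly `skip_fraction` of read operations; keep every write.

    `trace` is a list of dicts with a "cmd" key (hypothetical record layout).
    A seeded RNG keeps the selection reproducible across runs.
    """
    rng = random.Random(seed)
    kept = []
    for op in trace:
        is_read = op["cmd"] in READ_OPS
        if is_read and rng.random() < skip_fraction:
            continue  # this read is treated as served by the standby
        kept.append(op)
    return kept


# Tiny made-up trace, only to exercise the function.
trace = [
    {"cmd": "open", "src": "/a"},
    {"cmd": "create", "src": "/b"},
    {"cmd": "listStatus", "src": "/"},
    {"cmd": "rename", "src": "/b"},
    {"cmd": "getfileinfo", "src": "/a"},
]

writes_only = filter_reads(trace, 1.0)  # corresponds to the "100% Skip" case
unfiltered = filter_reads(trace, 0.0)   # corresponds to the "0% Skip" case
```

With `skip_fraction=1.0` only the write operations remain, which is the input the 2x and 4x replays would then speed up.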
|| ||0% Skip||20% Skip||50% Skip||100% Skip||100% Skip (2x)||100% Skip (4x)||
|Average Write Latency (ms)|52.8|28.5|18.0|14.0|27.0|73.2|
|Average Read Latency (ms)|34.3|20.0|11.5|N/A|N/A|N/A|
|RPC Queue AvgTime (ms)|23.0|11.9|7.4|1.7|4.3|20.7|
|RPC Queue 50th Percentile (ms)|2.81|0.52|0.47|0.05|0.05|0.04|
|RPC Queue 90th Percentile (ms)|24.42|12.51|9.98|0.12|1.49|12.96|
|RPC Queue NumOps (k)|31.0|25.2|16.3|1.5|3.0|6.0|
|LockQueueLength Average|45.3|24.9|18.9|7.0|12.5|30.6|
|GC Time (ms)|9.62|7.94|6.13|1.94|3.03|5.49|
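The results above indicate that, if we were able to offload all read requests,
we could expect up to a 4x throughput improvement for the write workload: at 4x
replay speed with all reads skipped, the RPC queue average time (20.7 ms) and
average lock queue length (30.6) are still below the 0% skip baseline (23.0 ms
and 45.3), although average write latency is higher (73.2 ms vs. 52.8 ms).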
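To make the comparison against the baseline easier to check, the following sketch divides each value by its 0% skip value. The numbers are copied from the table above; the helper name `relative_to_baseline` and the choice of which rows to compare are only illustrative.

```python
# Column order matches the table header.
columns = ["0%", "20%", "50%", "100%", "100% (2x)", "100% (4x)"]

# Rows copied from the table above.
write_latency_ms = [52.8, 28.5, 18.0, 14.0, 27.0, 73.2]
rpc_queue_avg_ms = [23.0, 11.9, 7.4, 1.7, 4.3, 20.7]
lock_queue_len = [45.3, 24.9, 18.9, 7.0, 12.5, 30.6]


def relative_to_baseline(values):
    """Return each value divided by the first (0% skip) value, rounded to 2 places."""
    baseline = values[0]
    return [round(v / baseline, 2) for v in values]


rows = {
    "Average Write Latency": write_latency_ms,
    "RPC Queue AvgTime": rpc_queue_avg_ms,
    "LockQueueLength Average": lock_queue_len,
}

for name, values in rows.items():
    ratios = relative_to_baseline(values)
    # A ratio below 1.0 means the metric is lower than in the 0% skip run.
    print(name, dict(zip(columns, ratios)))
```

In the 100% Skip (4x) column, the queue-related ratios stay below 1.0 while the write latency ratio is above 1.0, so the "up to 4x" figure is best read as an upper estimate based on queueing behavior.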
> Consistent Reads from Standby Node
> ----------------------------------
>
> Key: HDFS-12943
> URL: https://issues.apache.org/jira/browse/HDFS-12943
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: hdfs
> Reporter: Konstantin Shvachko
> Attachments: ConsistentReadsFromStandbyNode.pdf
>
>
> StandbyNode in HDFS is a replica of the active NameNode. The states of the
> NameNodes are coordinated via the journal. It is natural to consider
> StandbyNode as a read-only replica. As with any replicated distributed system
> the problem of stale reads should be resolved. Our main goal is to provide
> reads from standby in a consistent way in order to enable a wide range of
> existing applications running on top of HDFS.