Hey folks,

We at LinkedIn have been working for a while on a scale testing and performance 
evaluation tool for HDFS and particularly the NameNode, which we call 
Dynamometer. It is now open source: you can view it on our GitHub page [1], and 
read about our motivations and design in our blog post [2]. The Dynamometer 
framework sets up a NameNode and DataNodes inside YARN containers to create 
a full-scale HDFS cluster, just without any actual data, and then starts a 
MapReduce job that replays audit log traces to generate realistic load. 
We’ve been using this internally for quite a while and have found it to 
be very useful for verifying changes before they go live on our production 
clusters, quantifying the performance of releases, and investigating the 
performance implications of potential patches. We hope that you will all find 
it useful as well and invite your contributions and feedback.

Thanks,
Erik Krogen
HDFS @ LinkedIn

[1]: https://github.com/linkedin/dynamometer
[2]: https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum

