I just read the article "Solving Big Data Challenges for Enterprise Application Performance Management", published this month in Volume 5, No. 12 of the Proceedings of the VLDB Endowment. The authors measured six different databases (Project Voldemort, Redis, HBase, Cassandra, MySQL Cluster and VoltDB) with YCSB on two different kinds of clusters, memory-bound and disk-bound. I have doubts about the HBase results, since:
* HBase version was 0.90.4
* Master nodes were deployed together with data nodes
* They didn't report tuning parameters

There's also a paragraph where they report that HBase failed frequently in non-deterministic ways while running YCSB.

My intention with this e-mail is to ask for opinions from those of you who are more experienced with HBase on how this experiment's setup could be changed to improve read operations, since in this setup HBase did not perform as well as Cassandra and Project Voldemort.

Here's the article: http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf
Volume 5 home: http://vldb.org/pvldb/vol5.html
