Hello,
I set up a cluster of 4 nodes running HDFS + HBase: one node hosts the NameNode, HBase Master and ZooKeeper, and the remaining three each run a DataNode + RegionServer. I was trying to measure the maximum write throughput with Yahoo's YCSB. I got at most 25,000 ops/s while inserting 1 KB of data per operation (one row with one column).

What is really strange to me is that I ran the same test with the same configuration, once with RF=1 and then with RF=3 on HDFS, and the results were about the same. Why? Replication should add overhead, so how can they be the same?

Then I doubled the cluster by adding 3 more DataNodes and RegionServers. I ran the same test and got the same ops/s as with 3 nodes. Why? Isn't the point of a distributed DB to scale, so that doubling the nodes gives roughly double the write speed?

To rule out a client bottleneck, I wrote a simple inserter and tried inserting from 3 different nodes into the cluster. That didn't help: the results were the same as with one client. (I pre-split the regions and tried to get an even load; while monitoring with jconsole, the load was about 25:50:25.)

I found that HBase ships with a performance utility, PerformanceEvaluation.java, but I can't find any information on how to compile it.

Thanks for any ideas.
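
For scale, here is a rough back-of-the-envelope sketch of what the numbers above mean in bytes per second. It is only arithmetic on the figures in this post. The assumptions are mine: 1 KB is taken as 1024 bytes, row key and column metadata are ignored, the 25:50:25 load is assumed to be the split across the three RegionServers, and the RF multiplier counts raw payload only (no WAL, flush or compaction traffic). The function names are just illustrative.

```python
# Figures taken from the post above.
OPS_PER_SEC = 25_000        # best observed write throughput
VALUE_BYTES = 1024          # assumption: "1KB" means 1024 bytes of payload
LOAD_SPLIT = (25, 50, 25)   # assumption: load share per RegionServer


def payload_mib_per_sec(ops_per_sec, value_bytes):
    """Aggregate client payload in MiB/s (keys and metadata ignored)."""
    return ops_per_sec * value_bytes / (1024 * 1024)


def per_server_ops(ops_per_sec, split):
    """Distribute the total ops/s across servers according to the split."""
    total = sum(split)
    return [ops_per_sec * share / total for share in split]


def replicated_mib_per_sec(payload_mib, replication_factor):
    """Raw payload multiplied by the HDFS replication factor.

    Simplification: counts each payload byte once per replica and
    nothing else, so real disk and network traffic will be higher.
    """
    return payload_mib * replication_factor


if __name__ == "__main__":
    payload = payload_mib_per_sec(OPS_PER_SEC, VALUE_BYTES)
    print(f"client payload: {payload:.1f} MiB/s")
    print("ops/s per RegionServer:", per_server_ops(OPS_PER_SEC, LOAD_SPLIT))
    for rf in (1, 3):
        print(f"RF={rf}: {replicated_mib_per_sec(payload, rf):.1f} MiB/s of replicated payload")
```

Under these assumptions the whole cluster is absorbing only about 24.4 MiB/s of payload, and the busiest RegionServer handles half of all operations.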
