I am attempting to configure HBase to maximize throughput, and have noticed
some bottlenecks. In particular, with my configuration, write performance is
well below theoretical throughput. I have a test program that inserts many rows
into a test table. Network I/O is less than 20% of max, and disk I/O is even
lower, around 5% of max on every box in the cluster. CPU is well below 50% of
max on all boxes. I do not see any I/O waits or anything else that raises
concerns. I am using iostat and iftop to measure throughput. To determine the
theoretical max, I used dd and iperf. I have spent quite a bit of time
optimizing the HBase config parameters, optimizing GC, etc., and am familiar
with the HBase book online and such.
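
To be explicit about how the percentages above are derived: each one is the
throughput observed during the insert test (from iostat or iftop) divided by
the baseline measured with dd or iperf. A minimal sketch of that arithmetic
follows. The function name and the sample numbers are placeholders for
illustration only, not actual measurements from the cluster.

```python
def utilization(observed, baseline):
    """Return observed throughput as a percentage of the baseline maximum."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * observed / baseline


# Placeholder values in MB/s (hypothetical, not real measurements):
# (observed during the insert test, baseline from iperf or dd)
samples = {
    "network": (20.0, 110.0),
    "disk": (6.0, 120.0),
}

for name, (observed, baseline) in samples.items():
    print(f"{name}: {utilization(observed, baseline):.1f}% of max")
```

Both values in a pair need to be in the same unit (for example MB/s) for the
ratio to be meaningful.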