You might also be interested in this benchmark I ran 3 months ago: https://docs.google.com/spreadsheet/pub?key=0Ao87IrzZJSaydENaem5USWg4TlRKcHl0dEtTS2NBOUE&output=html
J-D

On Sat, Jun 29, 2013 at 12:13 PM, Varun Sharma <va...@pinterest.com> wrote:
> Hi,
>
> I was doing some tests on how good HBase random reads are. The setup
> consists of a 1-node cluster with dfs replication set to 1. Short-circuit
> local reads and HBase checksums are enabled. The data set is small enough
> to be largely cached in the filesystem cache - 10G on a 60G machine.
>
> The client sends out multi-get operations in batches of 10 and I try to
> measure throughput.
>
> Test #1
>
> All data was cached in the block cache.
>
> Test Time = 120 seconds
> Num Read Ops = 12M
> Throughput = 100K per second
>
> Test #2
>
> I disable the block cache; now all the data is served from the file
> system cache. I verify this by making sure that IOPS on the disk drive
> are 0 during the test. I run the same test with batched ops.
>
> Test Time = 120 seconds
> Num Read Ops = 0.6M
> Throughput = 5K per second
>
> Test #3
>
> I saw that all the threads were stuck in idLock.lockEntry(), so I now run
> with the lock disabled and the block cache disabled.
>
> Test Time = 120 seconds
> Num Read Ops = 1.2M
> Throughput = 10K per second
>
> Test #4
>
> I re-enable the block cache and this time hack HBase to only cache index
> and bloom blocks, while data blocks come from the file system cache.
>
> Test Time = 120 seconds
> Num Read Ops = 1.6M
> Throughput = 13K per second
>
> So I wonder why there is such a massive drop in throughput. I know that
> the HDFS code adds tremendous overhead, but this seems pretty high to me.
> I use HBase 0.94.7 and CDH 4.2.0.
>
> Thanks
> Varun
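
For anyone who wants to reproduce something like Test #2, here is a rough sketch of the kind of client loop described above, written against the 0.94 client API. The table name, row-key scheme, and key count are placeholders, and the cache bypass here uses Get.setCacheBlocks(false) per request, which may differ from however Varun disabled the block cache on the server side:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class MultiGetBench {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test_table");        // placeholder table name
    Random rnd = new Random();

    long deadline = System.currentTimeMillis() + 120 * 1000L;  // 120-second test window
    long ops = 0;

    while (System.currentTimeMillis() < deadline) {
      // Build one multi-get of 10 random rows, mirroring the batch size in the tests.
      List<Get> batch = new ArrayList<Get>(10);
      for (int i = 0; i < 10; i++) {
        Get get = new Get(Bytes.toBytes("row-" + rnd.nextInt(10000000)));  // placeholder key space
        get.setCacheBlocks(false);  // skip the block cache so reads come from the FS cache (Test #2)
        batch.add(get);
      }
      Result[] results = table.get(batch);  // one RPC carrying all 10 gets
      ops += results.length;
    }

    table.close();
    System.out.println("Read ops: " + ops + ", throughput: " + (ops / 120) + " per second");
  }
}

Dividing the op count by the 120-second window gives per-second throughput, which is how the figures quoted in the tests are derived.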