Hi,

I'm running some performance tests on a cluster with 5 member servers (not counting the various masters), each node running a data node, a region server and a Thrift server. Each server has two quad-core CPUs and 16GB of RAM. The data set consists of 50 sets of consecutive keys, with 100 columns in each row, all under a single CF; each column holds 1KB of random data. The entire data set is around 40GB, so it should fit in RAM for caching purposes. I've enabled LZO compression for the table. I'm running the tests through Thrift because that's the way it will be used in our production environment.
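For reference, the table setup and one row of the load look roughly like this. This is just a sketch in Python, using client bindings generated from HBase's Hbase.thrift; 'perftest' and the column names are placeholders, and depending on your HBase version some of these calls may take an extra trailing "attributes" argument:

import os

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase                    # package generated by: thrift --gen py Hbase.thrift
from hbase.ttypes import ColumnDescriptor, Mutation

# 9090 is the default Thrift server port.
transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()

# Single column family with LZO compression, as in the test setup.
client.createTable('perftest', [ColumnDescriptor(name='cf:', compression='LZO')])

# One row of the data set: 100 columns of 1KB random data under one CF.
# Newer HBase versions add a trailing 'attributes' dict to mutateRow.
row_key = 'set00-row0000000'
mutations = [Mutation(column='cf:col%03d' % i, value=os.urandom(1024))
             for i in range(100)]
client.mutateRow('perftest', row_key, mutations)

transport.close()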
I started with a basic insert test, inserting rows with a single column of 1KB of data each. Initially, when the table was empty, I was getting around 300 inserts/sec with 50 writing threads. Then, when the region split and a second server was added, the rate suddenly jumped to 3,000 inserts/sec per server, so ~6,000 for the two servers. Over time, as more servers were added, the per-server rate actually went down and stabilized at around 2,000 inserts/sec. I'm not sure what the reason is for the jump in inserts/sec per server after the region was split. Maybe local splits on the same RS allowed more open files?

I also ran a random-column read test, reading a varying number of columns from randomly selected rows. First I read only one specific column (the first in each row); that started at around 60 reads/sec per server and gradually increased (I assume as more data was loaded into the cache) to ~800 reads/sec per server. When reading 5 random columns from each row, the rate dropped to around 400 rows/sec, and when fetching full rows (each with 100 columns) the rate remained about the same, at 400 rows/sec per server.

I'm not sure exactly what I should expect, but I was hoping for much higher numbers. I read somewhere that for small values it is reasonable to expect 10K inserts per core per second. I know 1KB isn't small, but these are 8-core machines and they are doing only about 2K inserts/sec each. The read rate also seems very low considering all the data should fit in RAM.

The interesting thing is that there doesn't seem to be any resource bottleneck: IO utilization on the servers is negligible, CPU is at around 40-50% utilization, the client generating the load is not loaded either (around 5% CPU), and the client's network was at only 30% utilization when reading full rows. So the only explanation I can see for the flat-lining is some sort of lock contention. Does this make sense? Is there any way to improve it, especially the read performance?

-eran
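P.S. In case it helps, the read tests do roughly the following, continuing with the same client and generated bindings as in the load sketch above. Again just a sketch; method names are from Hbase.thrift, and newer HBase versions add a trailing "attributes" argument to each call:

import random

# Read one specific column (the first in each row):
cells = client.get('perftest', row_key, 'cf:col000')

# Read 5 random columns from a row:
cols = ['cf:col%03d' % i for i in random.sample(range(100), 5)]
result = client.getRowWithColumns('perftest', row_key, cols)

# Fetch the full row (all 100 columns):
result = client.getRow('perftest', row_key)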
