Hey Eran,

Glad you could go back to debugging performance :)

I haven't seen the scalability issues you describe before; it sounds
like the client isn't pushing the cluster hard enough. It reminded me of
when we switched to using the native Thrift PHP extension instead of the
"normal" one and saw huge speedups. My limited knowledge of Thrift may
be blinding me, but I looked around for C# Thrift performance issues and
found threads like this one:
http://www.mail-archive.com/[email protected]/msg00320.html

As you didn't really debug the speed of Thrift itself in your setup,
this is one more variable in the problem.

Also, you don't really provide metrics about your system apart from
requests/second. Would it be possible for you to set them up using this
guide? http://hbase.apache.org/metrics.html

J-D

On Thu, Apr 21, 2011 at 5:13 AM, Eran Kutner <eran@> wrote:
> Hi J-D,
> After stabilizing the configuration, with your great help, I was able
> to go back to the load tests. I tried using IRC, as you suggested,
> to continue this discussion but because of the time difference (I'm
> GMT+3) it is quite difficult to find a time when people are present
> and I am available to run long tests, so I'll give the mailing list
> one more try.
>
> I tested again on a clean table using 100 insert threads, each using a
> separate keyspace within the test table. Every row had just one column
> with 128 bytes of data.
> With one server and one region I got about 2300 inserts per second.
> After manually splitting the region I got about 3600 inserts per
> second (still on one machine). After a while the regions were balanced
> and one was moved to another server, which brought writes to around
> 4500 per second. Additional splits and moves to more servers didn't
> improve this number and the write performance stabilized at ~4000
> writes/sec per server. This seems pretty low, especially considering
> other numbers I've seen around here.
>
> Read performance is at around 1500 rows per second per server, which
> seems extremely low to me, especially considering that all the working
> set I was querying could fit in the servers' memory. To make the test
> interesting I limited my client to fetch only 1 row (always the same
> one) from each keyspace, which yielded 10K reads per sec per server, so
> I tried increasing the range again and read the same 10 rows; now the
> performance dropped to 8500 reads/sec per server. Increasing the range
> to 100 rows drops the performance to around 3500 reads per second
> per server.
> Do you have any idea what could explain this behavior and how do I get
> a decent number of reads from those servers?
>
> -eran
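
A rough way to check whether the client side could be the limit is to turn the quoted insert rates into an implied per-request latency. The sketch below is only an illustration, not something taken from the thread: it assumes each of the 100 insert threads issues one synchronous request at a time (a closed loop), so Little's law gives mean latency = threads / throughput. The function name and that closed-loop assumption are mine; the thread count and rates are the ones reported above.

```python
# Back-of-the-envelope sketch: implied mean latency per insert.
# ASSUMPTION (not stated in the thread): every client thread issues one
# blocking request at a time, so Little's law applies:
#   concurrency = throughput * latency  =>  latency = concurrency / throughput

def mean_latency_ms(concurrent_clients: int, throughput_per_sec: float) -> float:
    """Mean time per request, in milliseconds, for a closed-loop client."""
    return concurrent_clients / throughput_per_sec * 1000.0

THREADS = 100  # insert threads reported in the message

# Total insert rates reported in the message (inserts per second).
observed = {
    "one server, one region": 2300,
    "one server, two regions": 3600,
    "two servers": 4500,
}

for label, rate in observed.items():
    print(f"{label}: ~{mean_latency_ms(THREADS, rate):.1f} ms per insert")
```

Under that assumption, each insert of a 128-byte value takes tens of milliseconds end to end. If the region server metrics show much lower server-side latencies, that would point toward the client or the Thrift layer rather than HBase itself.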
