On Mon, Jan 10, 2011 at 11:12 AM, Weishung Chung <[email protected]> wrote:
> Multiple batches of 10k *new/updated* rows at any time to different tables
> by different clients simultaneously. I want these multiple batches of
> insertions to be done super fast. At the same time, I would like to be able
> to scale up to 100k rows at a time (the goal).  Now, I am building a cluster
> of size 6 to 7 nodes.

If you're writing a multi-threaded client and you're going to have
many clients like this writing to HBase continuously, I recommend
writing your application with asynchbase
(http://github.com/stumbleupon/asynchbase) instead of HTable.  It's an
alternate HBase client library I wrote, and in my application it
significantly increased write throughput.  It can easily push 150k
updates per second to a 20-node cluster, and at that point it's the
local machine that's CPU-bound, not the HBase cluster (the local
machine is a very slow VM, so it doesn't have a lot of horsepower).
This client is especially good for throughput-oriented workloads and
was written to be thread-safe from the ground up (unlike HTable).
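
To give a rough idea of the pattern, here is a minimal sketch of a
single writer shared by many threads that buffers updates and sends
them in batches.  This is not asynchbase's actual API: BatchingWriter,
put, flush and the sink callback (which stands in for the RPC to the
cluster) are hypothetical names, and it uses only the JDK so it runs
without a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch, NOT asynchbase's API: one writer shared by all threads.
final class BatchingWriter {
    // A buffered row plus the future completed once its batch has been sent.
    private static final class Pending {
        final String row;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(String row) { this.row = row; }
    }

    private final Object lock = new Object();
    private final int batchSize;
    private final Consumer<List<String>> sink; // assumed stand-in for the cluster RPC
    private List<Pending> buffer = new ArrayList<>();

    BatchingWriter(int batchSize, Consumer<List<String>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Buffers one row. The caller that fills a batch sends it, outside the
    // lock, so other threads can keep buffering in the meantime.
    CompletableFuture<Void> put(String row) {
        Pending p = new Pending(row);
        List<Pending> full = null;
        synchronized (lock) {
            buffer.add(p);
            if (buffer.size() >= batchSize) {
                full = buffer;
                buffer = new ArrayList<>();
            }
        }
        if (full != null) send(full);
        return p.done;
    }

    // Sends whatever is still buffered (a partial batch).
    void flush() {
        List<Pending> rest;
        synchronized (lock) {
            rest = buffer;
            buffer = new ArrayList<>();
        }
        if (!rest.isEmpty()) send(rest);
    }

    // Hands the rows to the sink, then completes each row's future.
    private void send(List<Pending> batch) {
        List<String> rows = new ArrayList<>(batch.size());
        for (Pending p : batch) rows.add(p.row);
        sink.accept(rows);
        for (Pending p : batch) p.done.complete(null);
    }
}

final class BatchingWriterDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicLong sent = new AtomicLong(); // the sink must itself be thread-safe
        BatchingWriter writer =
            new BatchingWriter(1000, rows -> sent.addAndGet(rows.size()));
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            final int id = t;
            Thread th = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) writer.put("client" + id + "-row" + i);
            });
            threads.add(th);
            th.start();
        }
        for (Thread th : threads) th.join();
        writer.flush(); // push out any partial batch left in the buffer
        System.out.println("rows sent: " + sent.get());
    }
}
```

The point of the design is that threads only hold the lock long enough
to append to the buffer, and each put returns a future the caller can
use to find out when its row has actually gone out.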

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
