Sounds like a very interesting experiment! We'd be happy to help you tweak and optimize your HBase installation to achieve peak performance.

But first off, 0.19.0 is an outdated, unsupported version. At the very least, please upgrade to 0.19.3. More importantly, the upcoming 0.20.0 release includes a large number of performance-focused improvements. An RC3 will be released this week; it is known to be quite stable, so I would strongly recommend running your tests on RC3. A final release is expected shortly thereafter.

Thanks, and keep us updated!

JG

Adam Silberstein wrote:
Hi,

For those interested, I want to tell the HBase community more about the
experimental results Raghu Ramakrishnan presented at VLDB last week, and
where we in Yahoo! Research are going from here.

First, our results from all systems were very preliminary, and Raghu
emphasized that.  We deployed each system with essentially its
out-of-the-box configuration and ran a series of simple experiments.
I'll briefly outline the experiments, and please let me know if you want
more details.  We deployed HBase 0.19.0 with 1 master server and 6
region servers, with 6 GB of heap space on each machine.  We loaded
120,000,000 1 KB records, where each record is essentially one column of
just random bytes.  We then ran a series of experiments where we set a
target read/update ratio, and then measured actual throughput and per-op
latency.  In our setting, an update is actually a read+write, and the
update overwrites the entire record.  We used two ratios: 50% read/50%
update, and 95% read/5% update.  The target throughputs (across 6 region
servers) were 600, 1200, 2400, 3600, 4800, and 6000.  Generally, we used 50
parallel clients to reach the target throughput, but also went up to 100
or 200 clients if necessary.  We targeted each experiment to run for 30
minutes, but allowed it to run longer if the system could not meet the
target throughput.
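(For readers who want a concrete picture of the workload above: this is
not our actual harness, just a minimal sketch of the closed-loop driver
it describes -- parallel clients paced toward a combined target
throughput, a configurable read fraction, and updates implemented as
read + full-record overwrite.  The in-memory dict standing in for the
store, and all names here, are illustrative assumptions.)

```python
import random
import threading
import time

def run_workload(store, num_records, target_ops_per_sec, read_fraction,
                 duration_sec, num_clients=50):
    """Closed-loop workload driver sketch.

    Each of num_clients threads issues operations, sleeping between them
    so the combined rate approaches target_ops_per_sec.  Returns
    (achieved_ops_per_sec, avg_latency_sec)."""
    # If every op were instantaneous, each client pausing this long
    # between ops would yield exactly the target aggregate rate.
    per_client_interval = num_clients / target_ops_per_sec
    latencies = []
    lock = threading.Lock()
    stop_at = time.time() + duration_sec

    def client():
        while time.time() < stop_at:
            key = random.randrange(num_records)
            start = time.time()
            if random.random() < read_fraction:
                _ = store.get(key)             # read
            else:
                _ = store.get(key)             # update = read + write,
                store[key] = b"x" * 1024       # overwriting the whole record
            elapsed = time.time() - start
            with lock:
                latencies.append(elapsed)
            # Sleep off whatever remains of this client's pacing interval;
            # if ops are slower than the interval, the target is missed.
            time.sleep(max(0.0, per_client_interval - elapsed))

    threads = [threading.Thread(target=client) for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    achieved = len(latencies) / duration_sec
    avg_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return achieved, avg_latency
```

A run at, say, a 95% read fraction and a target of 200 ops/sec over 10
clients would report how far the achieved throughput and per-op latency
fall from the target once the store's own latency is in the loop.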
We are now taking on the effort of expanding our benchmark beyond these
2 simple workloads to capture the key use cases for key-value stores.
One of our goals is to release the benchmark and the harness for running
it.  Another is to run the expanded benchmark against the systems we
have already tested, and perhaps more.  While we plan eventually to
publish our results in a conference or journal, I want to emphasize
that we will first circulate a draft of our findings to give the
various communities a chance to comment, make suggestions, tell us if
our results look way off, etc.
Thanks for your interest!

-Adam

