Hi folks,

We have been conducting a performance study comparing Cassandra and HBase (and 
Yahoo! PNUTS and MySQL) on identical hardware under identical workloads. Our 
focus has been on serving workloads (e.g., reading and writing individual 
records, rather than scanning a whole table for MapReduce). This is part of a 
larger effort to develop a benchmark for these kinds of systems, which we are 
calling YCSB, the Yahoo! Cloud Serving Benchmark.

I thought this list might be interested in the first set of results we have. We 
submitted a paper on these results, and the benchmark as a whole, and we are 
continuing to benchmark other scenarios and systems. But we have produced a 
snapshot of the results if you are interested:

High-level summary: http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf
Detailed paper: http://www.brianfrankcooper.net/pubs/ycsb.pdf

In general, Cassandra performs quite well, with good throughput and latency 
compared to PNUTS (which we call Sherpa internally) and better throughput than 
HBase.

I'd be happy to answer any questions about the results or discuss possible ways 
to tune Cassandra. We already received extensive tuning help from this list 
last year (thanks!), but more suggestions are always helpful.

The benchmark tool will be open sourced real soon now (we are just waiting for 
final approval from Yahoo! legal), and we hope it will be a useful tool for 
apples-to-apples comparisons of different systems.

Brian

--
Brian Cooper
Principal Research Scientist
Yahoo! Research
