Hi folks,

We have been conducting a performance study comparing Cassandra and HBase (and Yahoo! PNUTS and MySQL) on identical hardware under identical workloads. Our focus has been on serving workloads (e.g. reading and writing individual records, rather than scanning a whole table for MapReduce). This is part of a larger effort to develop a benchmark for these kinds of systems, which we are calling YCSB, or the Yahoo Cloud Serving Benchmark.

I thought this list might be interested in the first set of results we have. We submitted a paper on these results, and the benchmark as a whole, and we are continuing to benchmark other scenarios and systems. But we have produced a snapshot of the results if you are interested:

High level summary: http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf
Detailed paper: http://www.brianfrankcooper.net/pubs/ycsb.pdf

In general, Cassandra performs quite well, with good throughput and latency compared to PNUTS (which we call Sherpa internally) and better throughput than HBase.

I'd be happy to answer any questions about the results or discuss possible ways to tune Cassandra. We had already received extensive tuning help from this list last year (thanks!) but more suggestions are always helpful.

The benchmark tool will be open sourced real soon now (we are just waiting for final approval from Yahoo legal), and our hope is that it is a useful tool for apples-to-apples comparison of different systems.

Brian

--
Brian Cooper
Principal Research Scientist
Yahoo! Research