Thanks. That is interesting and what I was looking for.
I knew V0.20 was closing the gap. Probably good to compare with V0.5B1
on the Cassandra side. I'd think that fast multi-get and batch insert/
update would be interesting to compare and benchmark. I know we are
taxing Cassandra now and working on some auxiliary means (outside of
Thrift) to see what the per node limits really are...
On Dec 6, 2009, at 12:35 AM, "Matt Revelle" <[email protected]> wrote:
Cassandra performance likely still beats HBase, but according to the
"Powered By" page on the HBase wiki it is being used to handle
realtime requests by StumbleUpon, Meetup, and Streamy
(http://wiki.apache.org/hadoop/Hbase/PoweredBy).
These two documents contain some performance numbers:
http://static.last.fm/johan/nosql-20090611/hbase_nosql.pdf (skip to
page 22)
http://www.slideshare.net/schubertzhang/hbase-0200-performance-evaluation
Both Cassandra and HBase are useful tech; I just wanted to point out
that HBase performance has improved over the past year and it can
handle realtime requests.
On Dec 5, 2009, at 11:08 PM, Tim Estes wrote:
Can you link/reference those? I haven't seen random read or write
performance numbers published around V0.20 HBase that are within 5x
of Cassandra. I'm very curious about this...
On Dec 5, 2009, at 11:05 PM, "Matt Revelle" <[email protected]>
wrote:
On Dec 5, 2009, at 21:45, Joe Stump <[email protected]> wrote:
On Dec 5, 2009, at 7:41 PM, Bill Hastings wrote:
[Is] HBase used for real-timish applications, and if so, any ideas
what the largest deployment is?
I don't know of anyone off the top of my head who's using
anything built on top of Hadoop for a real-time environment.
Hadoop just wasn't built for that. It was built, like MapReduce,
for crunching absurd amounts of data across hundreds of nodes in
a "reasonable" amount of time.
Just my $0.02.
--Joe
While Hadoop MapReduce isn't meant for realtime use, HBase can
handle it.
Over last summer there were some benchmarks included in HBase/
Hadoop presentations that showed, IIRC, performance comparable to
Cassandra.