This is slightly off-topic, but there is a recent project called
Hadoop Online (HOP) on Google Code that promises an online/continuous
query ability on top of Hadoop, which should allow for near-real-time
activities instead of the batch processing that MapReduce does.
---
Ian Holsman - 703 879-3128
On 06/12/2009, at 3:12 PM, Joseph Bowman <[email protected]>
wrote:
When I wrote my Why Cassandra article, I didn't get into why I
didn't choose platform X because I didn't want to start a flame war
by doing comparisons. For HBase, the primary reason I didn't choose
it is that while there were benchmarks of what it could
theoretically do, there weren't any real-world deployments
proving it. My experience as a systems administrator is that it's
best to go with a product that's been proven over time in real-world
scenarios.
I'll add to this, though, that nothing NoSQL, even Cassandra, has
reached the point where I feel it's a no-brainer to choose it over
anything else, including SQL-based solutions like MySQL and Oracle. It
really comes down to your requirements.
On Sat, Dec 5, 2009 at 11:04 PM, Matt Revelle <[email protected]>
wrote:
On Dec 5, 2009, at 21:45, Joe Stump <[email protected]> wrote:
On Dec 5, 2009, at 7:41 PM, Bill Hastings wrote:
[Is] HBase used for real-timish applications, and if so, any ideas
what the largest deployment is?
I don't know of anyone off the top of my head who's using anything
built on top of Hadoop for a real-time environment. Hadoop just
wasn't built for that. It was built, like MapReduce, for crunching
absurd amounts of data across hundreds of nodes in a "reasonable"
amount of time.
Just my $0.02.
--Joe
While Hadoop MapReduce isn't meant for realtime use, HBase can
handle it.
Over the last summer there were some benchmarks included in HBase/Hadoop
presentations that showed, IIRC, performance comparable to Cassandra.