Hey Brian,

We use HBase to complement MySQL in serving activity-stream type data here at Meetup. It's handling real-time requests involved in 20-25% of our page views, but our latency requirements aren't as strict as yours. For what it's worth, I did a presentation on our setup which will hopefully fill in some details:
http://www.slideshare.net/ghelmling/hbase-at-meetup

There are also some great presentations by Ryan Rawson and Jonathan Gray on how they've used HBase for realtime serving on their sites. See the presentations wiki page:
http://wiki.apache.org/hadoop/HBase/HBasePresentations

Like Barney, I suspect where you'll hit some issues will be in your latency requirements. Depending on how you lay out your data and configure your column families, your average latency may be good, but you will hit some pauses, as I believe reads block at times during region splits, compactions, and memstore flushes (unless you have a fairly static data set). Others here should be able to fill in more details.

With a relatively small dataset, you may want to look at the "in memory" configuration option for your column families.

What's your expected workload -- writes vs. reads? What types of reads will you be doing: random access vs. sequential?

There are a lot of knowledgeable folks here to offer advice if you can give us some more insight into what you're trying to build.

--gh

On Tue, Mar 9, 2010 at 11:21 AM, jaxzin <brian.r.jack...@espn3.com> wrote:
>
> This is exactly the kind of feedback I'm looking for, thanks, Barney.
>
> So it sounds like you cache the data you get from HBase in a session-based
> memory? Are you using a Java EE HttpSession? (I'm less familiar with the
> django/rails equivalent but I'm assuming they exist) Or are you using a
> memory cache provider like ehcache or memcache(d)?
>
> Can you tell me more about your experience with latency and why you say
> that?
>
>
> Barney Frank wrote:
> >
> > I am using Hbase to store visitor level clickstream-like data. At the
> > beginning of the visitor session I retrieve all the previous session data
> > from hbase and use it within my app server and massage it a little and
> > serve to the consumer via web services. Where I think you will run into
> > the most problems is your latency requirement.
> >
> > Just my 2 cents from a user.
> >
> > On Tue, Mar 9, 2010 at 9:45 AM, jaxzin <brian.r.jack...@espn3.com> wrote:
> >
> >> Hi all, I've got a question about how everyone is using HBase. Is anyone
> >> using it as an online data store to directly back a web service?
> >>
> >> The text-book example of a weblink HBase table suggests there would be an
> >> associated web front-end to display the information in that HBase table
> >> (ex. search results page), but I'm having trouble finding evidence that
> >> anyone is servicing web traffic backed directly by an HBase instance in
> >> practice.
> >>
> >> I'm evaluating if HBase would be the right tool to provide a few things
> >> for a large-scale web service we want to develop at ESPN and I'd really
> >> like to get opinions and experience from people who have already been
> >> down this path. No need to reinvent the wheel, right?
> >>
> >> I can tell you a little about the project goals if it helps give you an
> >> idea of what I'm trying to design for:
> >>
> >> 1) Highly available (It would be a central service and an outage would
> >> take down everything)
> >> 2) Low latency (1-2 ms, less is better, more isn't acceptable)
> >> 3) High throughput (5-10k req/sec at worst-case peak)
> >> 4) Unstable traffic (ex. Sunday afternoons during football season)
> >> 5) Small data...for now (< 10 GB of total data currently, but HBase could
> >> allow us to design differently and store more online)
> >>
> >> The reason I'm looking at HBase is that we've solved many of our scaling
> >> issues with the same basic concepts of HBase (sharding, flattening data
> >> to fit in one row, throw away ACID, etc) but with home-grown software.
> >> I'd like to adopt an active open-source project if it makes sense.
> >>
> >> Alternatives I'm also looking at: RDBMS fronted with Websphere eXtreme
> >> Scale, RDBMS fronted with Hibernate/ehcache, or (the option I understand
> >> the least right now) memcached.
> >>
> >> Thanks,
> >> Brian
> >> --
> >> View this message in context:
> >> http://old.nabble.com/Use-cases-of-HBase-tp27837470p27837470.html
> >> Sent from the HBase User mailing list archive at Nabble.com.
> >
> --
> View this message in context:
> http://old.nabble.com/Use-cases-of-HBase-tp27837470p27838006.html
> Sent from the HBase User mailing list archive at Nabble.com.