1) Eventual consistency isn't a problem here. HBase is a strictly consistent system. Maybe you have us confused with the Dynamo-based open source projects?

2) MySQL and other traditional RDBMS systems are definitely a lot more solid, well-tested, and subtly tuned than HBase. The vast majority (if not all) of database systems developed in the past decade have this problem. HBase has 2 main advantages over a traditional RDBMS for OLTP workloads:

A. Large-scale workloads: Facebook Messages has a constantly growing data set that is 1PB+, and we're growing at 250MB/month. This is hard to manage with a traditional RDBMS. Logical database sharding is extremely useful.

B. Write-dominated workloads: Examples like time-series databases, user analytics, etc. are very write-heavy. An LSMT (log-structured merge tree) approach is architecturally better for these than a B-tree approach. Having done system testing internally, we already see an IOPS advantage with HBase over MySQL in writes.

3) A big question is what you need out of a database system. Most web companies are worried about the 'large-scale workloads' problem if their site becomes popular, so a working familiarity with a distributed database system for less mission-critical applications is worthwhile even if the performance and reliability aren't there yet.

4) If you have any mission-critical data, you really should think about a disaster recovery plan outside of HBase, which is not as critical with a traditional RDBMS. Facebook Messages ends up using Scribe as a backup mechanism. We are currently working on HBase Snapshots to allow disaster recovery with HBase alone, but you shouldn't bet on it being completed within your timeframe.
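To make the write-path argument in 2B a bit more concrete, here is a minimal toy sketch of the log-structured merge idea in Python. It is an illustration under stated assumptions, not HBase's actual implementation: the class name `ToyLSM`, the `memtable_limit` parameter, and the in-memory "segments" standing in for on-disk files are all hypothetical.

```python
import bisect


class ToyLSM:
    """Toy log-structured merge store. Illustrative only, not HBase's design."""

    def __init__(self, memtable_limit=3):
        self.memtable = {}      # in-memory buffer holding the most recent writes
        self.segments = []      # immutable sorted runs, oldest first (stand-in for files)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches the in-memory buffer; nothing is updated in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Emit the whole buffer as one sorted, immutable run (one sequential write).
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads look at the newest data first: the buffer, then runs newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.segments):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


store = ToyLSM(memtable_limit=2)
store.put("a", 1)
store.put("b", 2)   # reaches the limit, so the buffer is flushed as one sorted run
store.put("a", 3)   # the newer value shadows the older one in the flushed run
```

The point of the sketch is that every write is an append to a buffer followed by a bulk sequential flush, whereas a B-tree typically has to locate and modify a page in place. The cost moves to the read side, which may need to consult several runs.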
On 1/9/12 2:31 PM, "Michael Segel" <[email protected]> wrote:

>
>All,
>
>Just my $0.02 worth of 'expertise'...
>
>1) Just because you can do something doesn't mean you should.
>2) One should always try to use the right tool for the job regardless of
>your 'fashion sense'.
>3) Just because someone says "Facebook or Yahoo! does X", doesn't mean
>its a good idea, or its the right choice for you and your team.
>
>Having said that...
>
>Yes, you can use HBase to handle OLTP queries. However you do not have
>transactional capabilities built in such that you will have to manage
>them within your application.
>Not really an easy task when you think about it. It really depends on
>what you want to do with your OLTP system. Hotel reservation systems not
>really a good idea....
>
>There are some inherent problems with HBase in an OLTP environment.
>
>1) Eventual consistency. You can google the CAP theorem and you'll see
>why this is an issue.
>2) Lack of transaction support. Note: Row Level Locking that is in HBase
>has nothing to do with Row Level Locking with respect to transactional
>support.
>3) HBase size and scale vs RDBMS. For OLTP, RDBMS is the best tool for
>the job. So why do you want to use HBase over what one could call the
>'defacto' standard?
>The point here on #3 is that the normal tool of choice is an RDBMS. So
>you really, really need to justify why you're not going with this. I mean
>there could be a valid reason, but in most cases no.
>
>Where dhruba indicates that HBase is a pure transaction system, and does
>support OLTP workloads... absolutely Not!
>
>So what I suggest is that if you want to do OLTP in HBase, the first
>thing you have to do is to prove that you can't solve the problem in an
>RDBMs.
>
>Having said all that... I'm going to shut now... ;-)
>
>-Mike
>
>> Date: Mon, 9 Jan 2012 10:55:45 -0800
>> Subject: Re: Question about HBase for OLTP
>> From: [email protected]
>> To: [email protected]
>> CC: [email protected]
>>
>> > I know HBase is designed for OLAP, query intensive type of applications.
>>
>> That is not entirely true. HBase is a pure transaction system and does OLTP
>> workloads for us. We probably more than 2 millions ops/sec for one of our
>> application, details here:
>> https://www.facebook.com/note.php?note_id=454991608919
>>
>> -dhruba
>>
>> On Mon, Jan 9, 2012 at 9:25 AM, fullysane <[email protected]> wrote:
>>
>> > Hi
>> >
>> > I know HBase is designed for OLAP, query intensive type of applications. But
>> > I like the flexibility feature of its column-base architecture which allows
>> > me having no need to predefine every column of a table and I can dynamically
>> > add new column with value in my OLTP application code and capture its meta
>> > data information.
>> >
>> > My question is basically about if we can use HBase for OLTP application
>> > database. I know Hbase works well with Inserting column data of a row key
>> > and set new version for the new piece of the data, and not so well for
>> > updating and deleting existing piece of data. However, if I turn OLTP update
>> > and delete operations into all insertion of new version of colum data as I
>> > described below:
>> > For OLTP data update, if I set my table column family's versioning to 1 and
>> > always do insert (put) when there is need to update an existing data row
>> > columns, and let Hbase to handle the delete of the old versions through DB
>> > garbage collection.
>> > For OLTP data delete, I can use inserting new version on a flag field to
>> > "deleted", which is a logical delete, and have some batch job to clean up
>> > all logically deleted rows later.
>> >
>> > Will the above scenario work for using HBase for an OLTP application? Any
>> > flaws on doing it?
>> >
>> > Can some one share the experiences of using HBase for OLTP applications?
>> >
>> > Thanks,
>> >
>> > --
>> > View this message in context:
>> > http://old.nabble.com/Question-about-HBase-for-OLTP-tp33107782p33107782.html
>> > Sent from the HBase User mailing list archive at Nabble.com.
>>
>> --
>> Subscribe to my posts at http://www.facebook.com/dhruba
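
The scheme described in the original question (every update written as a new version with versioning set to 1, deletes recorded as a flag column, and a batch job purging flagged rows later) can be pictured with a small in-memory model. This is only a sketch of the semantics being proposed, assuming a simple logical clock; it does not use the HBase client API, and `ToyVersionedTable`, `logical_delete`, `purge_deleted`, and the `deleted` column name are hypothetical names chosen for illustration.

```python
import itertools


class ToyVersionedTable:
    """In-memory model of 'update = put a new version, delete = set a flag'."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.rows = {}                      # row key -> {column -> [(ts, value), ...]}
        self._clock = itertools.count(1)    # logical timestamps, assumed monotonic

    def put(self, row, column, value):
        # Every write is an insert of a new version, newest first.
        versions = self.rows.setdefault(row, {}).setdefault(column, [])
        versions.insert(0, (next(self._clock), value))
        # Versions beyond the limit are dropped (stand-in for background cleanup).
        del versions[self.max_versions:]

    def get(self, row, column):
        # Return the newest value for the cell, or None if it does not exist.
        versions = self.rows.get(row, {}).get(column)
        return versions[0][1] if versions else None

    def logical_delete(self, row):
        # A delete is just another put, this time on a flag column.
        self.put(row, "deleted", True)

    def purge_deleted(self):
        # Batch job: physically remove every row whose flag is set.
        doomed = [r for r in self.rows if self.get(r, "deleted")]
        for r in doomed:
            del self.rows[r]
        return len(doomed)


table = ToyVersionedTable(max_versions=1)
table.put("user42", "email", "old@example.com")
table.put("user42", "email", "new@example.com")   # the "update" is a second put
table.logical_delete("user42")
removed = table.purge_deleted()
```

Note that between `logical_delete` and `purge_deleted` the row is still readable, so every reader in this scheme has to check the flag itself. Nothing here provides multi-row transactions, which is the gap raised in the thread.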
