Hey Tom,

Well, I was being a bit short, and for that I apologize. To elaborate: Cassandra was conceived as a solution to a very different problem than data warehousing, and certain early design decisions were made in light of the needs of OLTP data management. To the best of my knowledge, its primary users and contributors have kept that focus. The integration with Hadoop MapReduce is primarily useful for bulk import and export, and for data hygiene, since it makes bulk transformations possible (e.g. recoding a column or enforcing a consistency constraint asynchronously).
More generally, OLTP ("application data management") and data warehousing ("analytical data management") are two very different beasts, and expecting a single storage system to be optimal for both kinds of workloads is one place where I feel things went a bit wrong in the RDBMS world. I'm hopeful that we can avoid some of that confusion with these next generation storage systems, though the temptation of making both workloads happen in a single system is likely too large to be avoided.

Something like https://issues.apache.org/jira/browse/HBASE-2357 may be helpful here if you insist on making both workloads happen in a single system. In any case, using Hive against an RCFile in HDFS is probably the best way to go in the short term for the data warehouse, as both HBase and Cassandra support in Hive are experimental.

Regards,
Jeff

On Wed, Jun 16, 2010 at 9:14 PM, tom kersnick <hiveu...@gmail.com> wrote:

> You are not being rude Jeff. This is a request from the client due to
> ease of use of Cassandra compared to HBase. I'm with you on this. They
> are looking for apples to apples consistency. Easy migration of data
> from OLTP (Cassandra) to their Data Warehouse (Cassandra?). Apparently
> not. Is it possible to migrate from Cassandra to HBase? Any
> documentation on this type of push to HBase from Cassandra would be
> helpful.
>
> Thanks in advance.
>
> /tom
>
> On Wed, Jun 16, 2010 at 5:44 PM, Jeff Hammerbacher <ham...@cloudera.com> wrote:
>
> > Hey Tom,
> >
> > I don't want to be rude, but if you're using Cassandra for your data
> > warehouse environment, you're doing it wrong. HBase is the primary
> > focus for integration with Hive (see
> > http://www.cloudera.com/blog/2010/06/integrating-hive-and-hbase/,
> > for example). Cassandra is a great choice for an OLTP application,
> > but certainly not for a data warehouse.
> >
> > Later,
> > Jeff
> >
> > On Wed, Jun 16, 2010 at 3:22 PM, tom kersnick <hiveu...@gmail.com> wrote:
> >
> > > Quick question for all of you. It seems that there is more
> > > movement using Hive with HBase rather than Cassandra. Do you see
> > > this changing in the near future? I have a client who is
> > > interested in using Cassandra due to the ease of maintenance. They
> > > are planning on using Cassandra for both their data warehouse and
> > > OLTP environments. Thoughts?
> > >
> > > I saw this ticket and I wanted to ask.
> > >
> > > Thanks in advance.
> > >
> > > /tom
> > >
> > > On Mon, May 3, 2010 at 12:42 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >
> > > > On Thu, Apr 8, 2010 at 1:17 PM, shirish <shirishredd...@gmail.com> wrote:
> > > >
> > > > > > All,
> > > > > >
> > > > > > http://code.google.com/soc/.
> > > > > >
> > > > > > It is an interesting thing that Google offers stipends to
> > > > > > get open source code written. However, last year I was
> > > > > > interested in a project that did NOT get accepted into GSOC.
> > > > > > It was quite deflating to be not accepted/rejected.
> > > > > >
> > > > > > Money does make the world go around, and if we all had
> > > > > > plenty of money we would all have more time to write open
> > > > > > source code :) But on the chance your application does get
> > > > > > rejected consider doing it anyway!
> > > > > >
> > > > > > Edward
> > > > >
> > > > > Definitely Edward, Thanks for the suggestion :)
> > > > >
> > > > > shirish
> > > >
> > > > I did not see any cassandra or hive SOC projects....
> > > > http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010.
> > > > :(
> > > >
> > > > So if no one is going to pick this cassandra interface up I will
> > > > pick it up after I close some pending things ....that is two
> > > > strikes for me and GSOC.