Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-30 Thread Jeremy Hanna
FWIW, we are using Pig (and Hadoop) with Cassandra and are looking to potentially move to Brisk because of the simplicity of operations there. Not sure what you mean about the true power of Hadoop. In my mind the true power of Hadoop is the ability to parallelize jobs and send each task to

Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-30 Thread Tharindu Mathew
Thanks Jeremy for your response. That gives me some encouragement that I might be on the right track. I think I need to try out more stuff before coming to a conclusion on Brisk. For Pig operations over Cassandra, I could only find http://svn.apache.org/repos/asf/cassandra/trunk/contrib/pig.

Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-30 Thread Jeremy Hanna
I've tried to help out with some UDFs and references that help with our use case: https://github.com/jeromatron/pygmalion/ There are some Brisk docs on Pig as well that might be helpful: http://www.datastax.com/docs/0.8/brisk/about_pig On Aug 30, 2011, at 1:30 PM, Tharindu Mathew wrote:

Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-30 Thread Tharindu Mathew
Thanks Jeremy. These will be really useful. On Wed, Aug 31, 2011 at 12:12 AM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: I've tried to help out with some UDFs and references that help with our use case: https://github.com/jeromatron/pygmalion/ There are some Brisk docs on Pig as well that

Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-29 Thread Tharindu Mathew
Hi, I have an already running system where I define a simple data flow (using a simple custom data flow language) and configure jobs to run against stored data. I use Quartz to schedule and run these jobs, and the data exists on various data stores (mainly Cassandra but some data exists in RDBMS