Hi Taran,

Do you have a blog post or something similar that explains how you process joins in CloudBase?

For example, how are indexes used, and how do you work through a join using the data files and index files? Do you look at all possible indexes, determine the cardinality of each, and pick a join order from that, or do you start at both ends of the query and meet in the middle (if you know what I mean...)?

This really influences schema design for large datasets in MySQL (e.g. you need to store your own cardinality, as MySQL can't determine the best join order inherently), so I am wondering about porting my reporting application. I think this kind of info would be great for the CloudBase docs.

Cheers,
Tim

2009/3/3 Tarandeep Singh <[email protected]>:
> Tim is right. CloudBase is not equivalent to HBase.
>
> HBase is a column-oriented database based on Google's BigTable. CloudBase is a
> database/data warehouse layer on top of Hadoop and by means of its SQL
> interface makes it easier to mine logs. So instead of writing Map-Reduce
> jobs for analyzing data, one can use SQL to do the same, and the SQL to
> Map-Reduce job translation is handled by CloudBase.
>
> -Taran
>
> 2009/3/3 tim robertson <[email protected]>
>
>> Hi Praveen,
>>
>> I think it is more equivalent to Hive than HBase - both offer joins
>> and structured querying where HBase is more a column oriented data
>> store with many to ones embedded in a single row and (currently) only
>> indexes on the primary key, but secondary keys are coming. I
>> anticipate using HBase as a back end to harvest into, but might make
>> use of Hive or Cloudbase for ad hoc reporting when needed.
>>
>> Has anyone done any testing of Hive vs. Cloudbase for performance and
>> comparison of features?
>>
>> Cheers,
>>
>> Tim
>>
>>
>> 2009/3/3 Guttikonda, Praveen <[email protected]>:
>> > Hi,
>> > Will this be competing in a sense with HBASE then?
>> >
>> > Cheers,
>> > Praveen
>> >
>> > -----Original Message-----
>> > From: Tarandeep Singh [mailto:[email protected]]
>> > Sent: Tuesday, March 03, 2009 10:12 PM
>> > To: [email protected]
>> > Subject: Re: Announcing CloudBase-1.2.1 release
>> >
>> > Hi Lukas,
>> >
>> > Yes, you are right. As of now, CloudBase does not support unique keys and
>> > foreign keys on tables. CloudBase is designed as a database abstraction
>> > layer on top of Hadoop, thus making it easier to query/mine logs/huge data
>> > easily.
>> >
>> > -Taran
>> >
>> >
>> > On Tue, Mar 3, 2009 at 1:15 AM, Lukáš Vlček <[email protected]>
>> > wrote:
>> >
>> >> Hi Taran,
>> >> This looks impressive. I quickly looked at the documentation, am I
>> >> right that it does not support unique keys and foreign keys for tables?
>> >>
>> >> Regards,
>> >> Lukas
>> >>
>> >> On Mon, Mar 2, 2009 at 8:33 PM, Tarandeep Singh <[email protected]>
>> >> wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > We have just released 1.2.1 version of CloudBase on sourceforge-
>> >> > http://cloudbase.sourceforge.net
>> >> >
>> >> > [ CloudBase is a data warehouse system built on top of Hadoop's
>> >> > Map-Reduce architecture. It uses ANSI SQL as its query language and
>> >> > comes with a JDBC driver. It is developed by Business.com and is
>> >> > released to open source community under GNU GPL license]
>> >> >
>> >> > This release fixes one issue with the 1.2 release- Table Indexing
>> >> > feature was not enabled in the 1.2 release. This release fixes this
>> >> > issue.
>> >> >
>> >> > Also we have updated the svn repository on the sourceforge site and
>> >> > we invite contributors to work with us to improve CloudBase. The svn
>> >> > repository url is-
>> >> > https://cloudbase.svn.sourceforge.net/svnroot/cloudbase/trunk
>> >> >
>> >> > We will be uploading Developer's guide/documentation on the
>> >> > CloudBase website very soon.
>> >> > Meanwhile, if someone wants to try
>> >> > compiling the code and play around with it, please contact me, I can
>> >> > help you get started.
>> >> >
>> >> > Thanks,
>> >> > Taran
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> http://blog.lukas-vlcek.com/
>> >>
>> >
>> >
