I don't wish to be rude, but you are presenting odd claims as fact, attributed only to something "mentioned in a couple of posts". That makes it difficult to have a serious conversation. I encourage you to test your hypotheses and let us know if in fact there is a JVM "heap barrier" (and where it may be).
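For anyone who wants to run that experiment, the usual approach is simply to raise the RegionServer heap and watch GC pause behavior. A minimal sketch of the relevant knobs, as hbase-env.sh settings (the 12g figure is the value under discussion, not a recommendation; paths and thresholds are illustrative):

```shell
# hbase-env.sh -- illustrative heap/GC settings for a RegionServer pause-time experiment
# Sets a 12 GB heap with CMS and turns on GC logging so pauses are observable.
export HBASE_REGIONSERVER_OPTS="-Xms12g -Xmx12g \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xloggc:/var/log/hbase/regionserver-gc.log"
```

If pause times (rather than throughput) are the limiting factor at a given heap size, the GC log above makes the "barrier", or its absence, directly measurable.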
On Monday, April 29, 2013, Asaf Mesika wrote:

> I think for Phoenix truly to succeed, it needs HBase to break the JVM
> heap barrier of 12G that I saw mentioned in a couple of posts. Since lots
> of analytics queries utilize memory, and that memory is shared with
> HBase, there's only so much you can do on a 12GB heap. On the other hand,
> if Phoenix were implemented outside HBase on the same machine (as Drill
> and Impala are doing), you could have 60GB for that process, running many
> OLAP queries in parallel, utilizing the same data set.
>
>
> On Mon, Apr 29, 2013 at 9:08 PM, Andrew Purtell <[email protected]> wrote:
>
> > > HBase is not really intended for heavy data crunching
> >
> > Yes it is. This is why we have first-class MapReduce integration and
> > optimized scanners.
> >
> > Recent versions, like 0.94, also do pretty well with the 'O' part of
> > OLAP.
> >
> > Urban Airship's Datacube is an example of a successful OLAP project
> > implemented on HBase: http://github.com/urbanairship/datacube
> >
> > "Urban Airship uses the datacube project to support its analytics stack
> > for mobile apps. We handle about ~10K events per second per node."
> >
> > Also there is Adobe's SaasBase:
> > http://www.slideshare.net/clehene/hbase-and-hadoop-at-adobe
> >
> > Etc.
> >
> > Where an HBase OLAP application will differ tremendously from a
> > traditional data warehouse is of course in the interface to the
> > datastore. You have to design and speak in the language of the HBase
> > API, though Phoenix (https://github.com/forcedotcom/phoenix) is
> > changing that.
> >
> >
> > On Sun, Apr 28, 2013 at 10:21 PM, anil gupta <[email protected]> wrote:
> >
> > > Hi Kiran,
> > >
> > > In HBase the data is denormalized, but at its core HBase is a
> > > KeyValue-based database meant for lookups or queries that expect a
> > > response in milliseconds. OLAP, i.e. data warehousing, usually
> > > involves heavy data crunching. HBase is not really intended for
> > > heavy data crunching. If you just want to store denormalized data
> > > and do simple queries, then HBase is good. For OLAP kinds of
> > > workloads you can make HBase work, but IMO you will be better off
> > > using Hive for data warehousing.
> > >
> > > HTH,
> > > Anil Gupta
> > >
> > >
> > > On Sun, Apr 28, 2013 at 8:39 PM, Kiran <[email protected]> wrote:
> > >
> > > > But in HBase the data can be said to be in a denormalized state,
> > > > as the methodology used for storage is a (column family:column)
> > > > based flexible schema. Also, from Google's Bigtable paper it is
> > > > evident that HBase is capable of doing OLAP. So where does the
> > > > difference lie?
> > > >
> > > > --
> > > > View this message in context:
> > > > http://apache-hbase.679495.n3.nabble.com/HBase-and-Datawarehouse-tp4043172p4043216.html
> > > > Sent from the HBase User mailing list archive at Nabble.com.
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet
> > Hein (via Tom White)


--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
