Re: Hive on Hbase

Gunnar Tapper Thu, 17 Nov 2016 07:05:51 -0800

Apache Trafodion provides SQL on top of HBase.

On Thu, Nov 17, 2016 at 7:40 AM, Mich Talebzadeh <[email protected]>
wrote:


> Thanks, John.
>
> How about using Phoenix or Spark RDDs on top of HBase?
>
> Do many people think Phoenix is not a good choice?
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 17 November 2016 at 14:24, John Leach <[email protected]> wrote:
>
> > Mich,
> >
> > I have not found many happy users of Hive on top of HBase in my
> > experience. For every query in Hive, you have to read the data from
> > the filesystem into HBase and then serialize the data via an HBase
> > scanner into Hive. The throughput through this mechanism is pretty poor,
> > and when you read 1 million records you actually read 1 million records
> > in HBase and 1 million records in Hive. There are significant resource
> > management issues with this approach as well.
> >
> > At Splice Machine (open source), we have written an implementation to
> > read the store files directly from the filesystem (via embedded Spark),
> > and then we do incremental deltas with HBase to maintain consistency.
> > When we read 1 million records, Spark reads most of them directly from
> > the filesystem. Spark provides resource management and fair scheduling
> > of those queries as well.
> >
> > We released some of our performance results at HBaseCon East in NYC.
> > Here is the video: https://www.youtube.com/watch?v=cgIz-cjehJ0
> >
> > Regards,
> > John Leach
> >
> > > On Nov 17, 2016, at 6:09 AM, Mich Talebzadeh <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > My approach to having a SQL engine on top of HBase (excluding Spark
> > > and Phoenix for now) has been to create the HBase table as is, then
> > > create an EXTERNAL Hive table on HBase using
> > > org.apache.hadoop.hive.hbase.HBaseStorageHandler to interface with the
> > > HBase table.
> > >
> > > My reasoning for creating an external Hive table is to avoid
> > > accidentally dropping the HBase table, etc. Is this a reasonable
> > > approach?
> > >
> > > That Hive table can then be used by a variety of tools like Spark,
> > > Tableau, and Zeppelin.
> > >
> > > Is this a viable solution, given that Hive seems to be preferred over
> > > Phoenix etc. on top of HBase?
> > >
> > > Thanks
> > >
> > > Dr Mich Talebzadeh
> >
> >
>



-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*
