Hi,

We have done this kind of thing with HBase 0.92.1 + Pig, but we eventually had to limit the size of the tables and move the biggest data to HDFS. Loading data directly from HBase is much slower than loading it from HDFS, and doing it with M/R overloads the HBase region servers, since several map tasks scan table regions at the same time. So the bigger your tables are, the higher the load (Pig usually creates one map per region; I don't know about Hive). A sketch of what such a scan-and-dump job can look like follows below.
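For illustration, here is a minimal map-only sketch of that kind of job, dumping an HBase table to tab-separated files on HDFS (the table name "mytable", column family "cf" and qualifier "qual" are placeholders, not from any real setup). The scan tuning matters a lot when every region gets its own map:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class TableToHdfs {

      // Emits one tab-separated line per row; with no reducers these
      // lines go straight to HDFS, one output file per region/map.
      static class DumpMapper extends TableMapper<Text, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result result, Context ctx)
            throws IOException, InterruptedException {
          byte[] v = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("qual"));
          ctx.write(new Text(Bytes.toString(row.get()) + "\t" + Bytes.toString(v)),
              NullWritable.get());
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "mytable-to-hdfs");
        job.setJarByClass(TableToHdfs.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // bigger batches, fewer RPCs per map
        scan.setCacheBlocks(false);  // keep full scans out of the block cache

        // One map per region: N regions means N scanners hitting the
        // region servers at once, which is where the load comes from.
        TableMapReduceUtil.initTableMapperJob("mytable", scan, DumpMapper.class,
            Text.class, NullWritable.class, job);

        job.setNumReduceTasks(0);  // map-only: dump rows straight to HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[0]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }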

This may not be an issue if your HBase cluster is dedicated to this kind of job, but if you also have to maintain good random-read latency at the same time, forget it.
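Once the data (or the aggregates) sit on HDFS as delimited files, the HDFS-to-RDBMS leg JC asks about below is just a file load. A hedged sketch with plain JDBC, assuming the tab-separated output of the job above (the JDBC URL, credentials, and the "daily_counts" table with its columns are all invented; a bulk tool like Sqoop's "sqoop export" would do the same thing at scale):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsToRdbms {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical warehouse connection; driver, URL and table
        // are placeholders for whatever RDBMS the BI tools sit on.
        Class.forName("org.postgresql.Driver");
        Connection db = DriverManager.getConnection(
            "jdbc:postgresql://dbhost/bi", "etl", "secret");
        db.setAutoCommit(false);
        PreparedStatement stmt = db.prepareStatement(
            "INSERT INTO daily_counts (dim_key, cnt) VALUES (?, ?)");

        // Walk the part-* files the M/R job wrote under args[0].
        for (FileStatus f : fs.listStatus(new Path(args[0]))) {
          if (!f.getPath().getName().startsWith("part-")) continue;
          BufferedReader in = new BufferedReader(
              new InputStreamReader(fs.open(f.getPath())));
          String line;
          while ((line = in.readLine()) != null) {
            String[] cols = line.split("\t");
            stmt.setString(1, cols[0]);
            stmt.setLong(2, Long.parseLong(cols[1]));
            stmt.addBatch();  // batch inserts, one executeBatch per file
          }
          in.close();
          stmt.executeBatch();
        }
        db.commit();
        stmt.close();
        db.close();
        fs.close();
      }
    }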

Regards,

On 11/11/2013 13:10, JC wrote:
We are looking to use HBase as a transformation engine: take data already loaded into HBase, run some large calculations/aggregations on that data, and then load the results back into an RDBMS for our BI analytics tools to use. I was curious what the community's experience with this is and whether there are some best practices. One idea we are kicking around is using MapReduce 2 and YARN, writing files to HDFS to be loaded into the RDBMS. I'm not sure what pieces are needed for the complete application, though.

Thanks in advance for your help,
JC


