Shengjie,

This is a typical data integration problem. You need to create a 
centralized repository of the data coming from the different data sources. 
This centralized repository (warehouse) is refreshed incrementally, which 
keeps it up to date with all of the sources. Once the repository is built, 
you can write aggregates over its data. Sqoop can play a role here, but most 
of the work will be ETL operations, and you can get by with any ETL tool or 
Pig. Is there a specific reason for using HBase here? 
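
To make the incremental refresh idea concrete, here is a minimal sketch in 
Python. It is only an illustration under assumptions: sqlite3 stands in for 
both the customer RDBMS and the central repository, and the table names 
(customers, customers_all), the updated_at column, and the function names 
are all hypothetical. A real flow would more likely use Sqoop or an ETL tool.

```python
import sqlite3


def make_source(rows):
    """Create a stand-in source database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def make_warehouse():
    """Create the stand-in central repository, keyed by (source, id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers_all ("
        "source TEXT, id INTEGER, name TEXT, updated_at INTEGER, "
        "PRIMARY KEY (source, id))"
    )
    return conn


def incremental_refresh(source, warehouse, source_name, watermark):
    """Copy rows changed since `watermark`; return the new watermark."""
    # Only pull rows modified after the last refresh. The strict '>' assumes
    # updated_at values are never reused for later changes.
    rows = source.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    for row_id, name, updated_at in rows:
        # Upsert: replace the existing copy of this source row, if any.
        warehouse.execute(
            "INSERT OR REPLACE INTO customers_all "
            "(source, id, name, updated_at) VALUES (?, ?, ?, ?)",
            (source_name, row_id, name, updated_at),
        )
        watermark = max(watermark, updated_at)
    warehouse.commit()
    return watermark
```

Each run passes in the watermark returned by the previous run, so only 
changed rows are read from the source, and the source database itself is 
never modified.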

----- Reply message -----
From: "shengjie min" <[email protected]>
To: <[email protected]>
Subject: ETL like merge databases to HBase
Date: Mon, Aug 5, 2013 6:24 AM


> Actually, it might be easier to go with a pure RDBMS solution here since 
> nowadays the master/slave architectures in PostgreSQL and MySQL are mature 
> enough to handle this sort of thing even for hundreds of thousands of rows.

Let's assume the RDBMSs belong to the customer's applications. I don't have 
much control over them, and I don't want to interfere with their environments 
much either.

Shengjie

On 2 Aug 2013, at 10:17, Jay Vyas <[email protected]> wrote:

> HBase doesn't have dynamic views on data outside of itself. But you can 
> easily rerun your Sqoop flow to dump information into HBase.
> 
> Actually, it might be easier to go with a pure RDBMS solution here since 
> nowadays the master/slave architectures in PostgreSQL and MySQL are mature 
> enough to handle this sort of thing even for hundreds of thousands of rows.
