Hi All,

I asked this question on the HBase mailing list, and people suggested I would 
be better off asking it here :) so here I am. I am new to Sqoop and have a use 
case where a few applications run in house independently, let's say 
applications A, B and C. Each has its own database. I want to create an 
aggregated view over all the databases so that I don't have to jump between 
them to find the info I need. A simple example: all three applications have a 
table called "users", and the tables are very similar, so I want to union 
them.

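To make that example concrete, here is a rough Python sketch of the union I 
have in mind. The column names, the mappings and the helper names 
(to_unified, union_users) are all made up for illustration; the real tables 
differ.

```python
# Hypothetical per-application column mappings: source column -> unified column.
# None of these column names come from the real databases.
MAPPINGS = {
    "A": {"id": "user_id", "name": "full_name", "mail": "email"},
    "B": {"uid": "user_id", "full_name": "full_name", "email": "email"},
    "C": {"user_id": "user_id", "display_name": "full_name", "email_addr": "email"},
}


def to_unified(source, row):
    """Rename one source row's columns to the unified schema and tag its origin."""
    mapping = MAPPINGS[source]
    unified = {target: row[column] for column, target in mapping.items() if column in row}
    unified["source"] = source
    # Prefix the key with the source name so ids from different apps never collide.
    unified["row_key"] = f"{source}:{unified['user_id']}"
    return unified


def union_users(rows_by_source):
    """Merge rows from every application into one dict keyed by row_key."""
    merged = {}
    for source, rows in rows_by_source.items():
        for row in rows:
            record = to_unified(source, row)
            merged[record["row_key"]] = record
    return merged
```

The source prefix on the row key is only one possible choice; it keeps user 1 
from A and user 1 from B as separate rows in the combined table.
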
I've had a look at Sqoop, and it looks like it lets me move data from 
databases A, B and C to a single, centralised place, e.g. HBase?

Ideally, the solution I am looking for needs to do the following:

1. The centralised storage is updated reasonably quickly after the original 
databases (A, B, C) are updated. To be clear, I am not looking for a one-time 
bulk import; I want incremental updates after the initial import.
2. As long as I provide a schema mapping, A, B and C can be imported into a 
single place, e.g. a single HBase table.
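
Roughly, the incremental behaviour in point 1 could look like the following 
Python sketch. It assumes each source table has a last-modified column (called 
updated_at here, which is my assumption) and uses an in-memory SQLite database 
as a stand-in for A, B or C. The function name fetch_changes is hypothetical.

```python
import sqlite3


def fetch_changes(conn, last_seen):
    """Return rows modified after last_seen, plus the new high-water mark."""
    cursor = conn.execute(
        "SELECT id, name, updated_at FROM users "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    rows = cursor.fetchall()
    # Keep the old mark when nothing changed, otherwise advance to the newest row.
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark


# Stand-in source database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ann", 100), (2, "Bob", 200)],
)

# Initial import: everything newer than 0.
initial_rows, mark = fetch_changes(conn, 0)

# A later update in the source is picked up by the next incremental run only.
conn.execute("UPDATE users SET name = 'Bobby', updated_at = 300 WHERE id = 2")
changed_rows, mark = fetch_changes(conn, mark)
```

The idea is that each run only pulls rows changed since the previous run, so 
the central copy can be refreshed often without re-importing everything.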

Now, my question is:

Is Sqoop a suitable tool for this? I was originally considering using MongoDB 
and writing the periodic/parallel import piece myself, but for now I am 
leaning more towards Sqoop since we already have Hadoop running in house. Any 
advice is highly appreciated!

Thanks,
Shengjie 
