Hi guys,

To simplify my question, let's say I have a MySQL table called 'student' that 
looks like this:

+----+---------+-----+
| id | name    | sex |
+----+---------+-----+
|  1 | Alice   |   0 |
|  2 | Bob     |   1 |
|  3 | Charles |   1 |
+----+---------+-----+

I want to import this table into HBase periodically, so I will be running the 
same Sqoop job on a schedule. There are two goals:

A. Every time a new record is inserted into the MySQL table, e.g. (4, David, 
1), I want the next Sqoop import to pick it up and put it in HBase.
B. If any updates have been made to MySQL rows 1, 2, 3, I want those updates 
reflected in HBase too after the next Sqoop import.

I checked the two types of incremental import Sqoop offers: append mode seems 
to satisfy only goal A, while last-modified mode requires my MySQL table to 
have a timestamp column on each row (which I don't have in real life). I know 
that if I don't use the incremental options at all, I can get away with running 
a fresh import every time, but if my MySQL table is really huge, a fresh import 
might be a performance killer.
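
For reference, this is roughly how I understand the two incremental modes would 
be invoked. It is only a sketch: the host "dbhost", database "school", HBase 
table "student", column family "info", the "last_modified" column (which my 
real table does not have) and the timestamp value are all placeholders I made 
up, and the SQOOP variable is just a dry-run switch so the commands are printed 
instead of executed.

```shell
# Dry run by default: prints each command. Set SQOOP=sqoop to really run it.
SQOOP="${SQOOP:-echo sqoop}"
CONNECT="jdbc:mysql://dbhost/school"   # placeholder host and database

# Append mode: imports only rows whose id is greater than the last imported id.
# Covers goal A (new rows) but not goal B (updates to existing rows).
import_append() {
  $SQOOP import --connect "$CONNECT" --table student \
    --hbase-table student --column-family info --hbase-row-key id \
    --incremental append --check-column id --last-value "$1"
}

# Last-modified mode: needs a timestamp column that changes on every update.
# "last_modified" is hypothetical; my real table has no such column.
import_lastmodified() {
  $SQOOP import --connect "$CONNECT" --table student \
    --hbase-table student --column-family info --hbase-row-key id \
    --incremental lastmodified --check-column last_modified --last-value "$1"
}

import_append 3                              # rows with id > 3
import_lastmodified "2013-01-01 00:00:00"    # placeholder timestamp
```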

Is there any way I can do incremental updates that pick up NEW RECORDS + 
UPDATES ON OLD ROWS, instead of having to re-run the whole import?


Shengjie
