I think that you can use the bulk loads feature described here:
http://hbase.apache.org/bulk-loads.html
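The basic workflow there is to have a MapReduce job write HFiles
(HFileOutputFormat.configureIncrementalLoad sets up the reducer and
partitioner for that) and then hand the output directory to
completebulkload, which just moves the finished HFiles into the regions.
Because each pass is independent, you could prepare and load the export
in several chunks, and a failed run would only cost that chunk. A rough
sketch of the two steps; the paths, table name, and chunk layout below
are made up, so adapt them to your setup and HBase version:

  # 1) A MapReduce job writes HFiles instead of issuing Puts, e.g. a job
  #    configured with HFileOutputFormat.configureIncrementalLoad(job, table),
  #    writing under /user/rita/hfiles-chunk1 (example path)

  # 2) Move the finished HFiles into the live table; this step is cheap
  #    and can be repeated once per chunk
  hadoop jar hbase-VERSION.jar completebulkload /user/rita/hfiles-chunk1 mytable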

Best wishes

On 03/29/2012 09:57 PM, Rita wrote:
Hello,

I am importing a 40+ billion row table which I exported several months ago.
The data size is close to 18 TB on HDFS (3x replication).

My problem is that when I try to import it with MapReduce it takes a few days --
which is ok -- but when the job fails for whatever reason, I have to
restart everything. Is it possible to import the table in chunks, like
importing 1/3, then 2/3, and finally 3/3 of the table?

Btw, the job creates close to 150k map tasks; that's a problem waiting to
happen :-)

--
Marcos Luis Ortíz Valmaseda (@marcosluis2186)
 Data Engineer at UCI
 http://marcosluis2186.posterous.com
