On Thu, Mar 29, 2012 at 7:57 PM, Rita <[email protected]> wrote:
> Hello,
>
> I am importing a 40+ billion row table which I exported several months ago.
> The data size is close to 18TB on hdfs (3x replication).
>
Does the table from back then still exist? Or do you remember what the key spread was like? Could you precreate the old table?

> My problem is when I try to import it with mapreduce it takes a few days --
> which is ok -- however when the job fails for whatever reason, I have to
> restart everything. Is it possible to import the table in chunks like,
> import 1/3, 2/3, and then finally 3/3 of the table?
>

Yeah. Funny how the plug gets pulled on the rack when the three-day job is 95% done.

> Btw, the job creates close to 150k mapper jobs, that's a problem waiting to
> happen :-)
>

Are you running 0.92? If not, you should upgrade and go for bigger regions. 10G?

St.Ack
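
P.S. For a rough feel of what bigger regions would mean here, the numbers in the thread can be run through a back-of-the-envelope calculation. This is only a sketch under loud assumptions: that the 18TB figure already includes the 3x replication, that the exported size is a fair stand-in for the table's on-disk size, and that the job really runs about 150k map tasks. The function and variable names are made up for illustration and are not part of HBase.

```python
# Back-of-the-envelope sizing. Every input below is an assumption read
# off the thread, not a measured value.
TB = 1024 ** 4
GB = 1024 ** 3
MB = 1024 ** 2


def estimate(total_on_hdfs_bytes, replication, map_tasks, region_size_bytes):
    """Return (logical bytes, bytes per map task, region count)."""
    # Assumption: the HDFS figure counts every replica, so divide it out.
    logical = total_on_hdfs_bytes // replication
    # Average input handled by one map task.
    per_task = logical // map_tasks
    # Ceiling division: regions needed if each holds region_size_bytes.
    regions = -(-logical // region_size_bytes)
    return logical, per_task, regions


logical, per_task, regions = estimate(18 * TB, 3, 150_000, 10 * GB)
print(f"logical data:      {logical / TB:.1f} TB")
print(f"per map task:      {per_task / MB:.1f} MB")
print(f"regions at 10 GB:  {regions}")
```

If the 18TB is actually the pre-replication size, pass `replication=1` and the estimates scale up by three. Either way it only gives an order of magnitude for how many regions a precreated table would need.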
