Re: importing a large table

Stack Thu, 29 Mar 2012 21:08:37 -0700

On Thu, Mar 29, 2012 at 7:57 PM, Rita <[email protected]> wrote:
> Hello,
>
> I am importing a 40+ billion row table which I exported several months ago.
> The data size is close to 18TB on hdfs (3x replication).
>


Does the table from back then still exist?  Or do you remember what
the key spread was like?  Could you precreate the old table?
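
If the old key spread is known, or a sample of row keys can be pulled from the export, the precreated table can be given split points up front so the import lands on many regions from the start. Below is a minimal sketch of picking split keys from a sorted sample. The function name `pick_split_keys` and the sample keys are hypothetical. Creating the table with those splits is left to whatever admin tooling you use.

```python
from typing import List, Sequence


def pick_split_keys(sample_keys: Sequence[bytes], n_regions: int) -> List[bytes]:
    """Pick up to n_regions - 1 split keys at even quantiles of a key sample.

    Keys are compared as raw bytes, which matches lexicographic row-key order.
    The result is only as good as the sample: a skewed sample gives skewed splits.
    """
    if n_regions < 1:
        raise ValueError("n_regions must be >= 1")
    keys = sorted(set(sample_keys))
    splits: List[bytes] = []
    for i in range(1, n_regions):
        idx = i * len(keys) // n_regions
        if idx == 0:
            # Splitting at the smallest key would leave an empty first region.
            continue
        candidate = keys[idx]
        if splits and candidate == splits[-1]:
            # Too few distinct keys for this many regions; skip duplicates.
            continue
        splits.append(candidate)
    return splits


if __name__ == "__main__":
    sample = [f"k{i}".encode() for i in range(8)]  # hypothetical sample keys
    print(pick_split_keys(sample, 4))
```

With fewer distinct sample keys than requested regions, the sketch returns fewer split points instead of inventing keys.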

> My problem is when I try to import it with mapreduce it takes a few days --
> which is ok -- however when the job fails for whatever reason, I have to
> restart everything. Is it possible to import the table in chunks, like
> import 1/3, 2/3, and then finally 3/3 of the table?
>

Yeah.  Funny how the plug gets pulled on the rack when the three-day
job is 95% done.
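
One way to get chunked, restartable imports, sketched here under assumptions: split the exported files into batches, run one import job per batch, and record each finished batch so that a failure only reruns the batch that died. The helper names, the marker file, and the `run_import` callback are all hypothetical. The callback stands in for however you launch the import job on one batch of files.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def make_batches(files: Iterable[str], n_batches: int) -> List[List[str]]:
    """Split the sorted file list into n_batches contiguous, near-equal batches."""
    if n_batches < 1:
        raise ValueError("n_batches must be >= 1")
    ordered = sorted(files)
    size, extra = divmod(len(ordered), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        # The first `extra` batches take one additional file each.
        end = start + size + (1 if i < extra else 0)
        batches.append(ordered[start:end])
        start = end
    return batches


def import_in_batches(
    batches: List[List[str]],
    run_import: Callable[[List[str]], None],
    done_file: str,
) -> None:
    """Run run_import on each batch, skipping batches already recorded as done.

    run_import is assumed to raise on failure. A batch is recorded only after
    it returns, so rerunning after a crash resumes at the failed batch.
    The batch layout must be the same on every rerun.
    """
    done_path = Path(done_file)
    done = set(done_path.read_text().split()) if done_path.exists() else set()
    for i, batch in enumerate(batches):
        if str(i) in done:
            continue
        run_import(batch)
        with done_path.open("a") as fh:
            fh.write(f"{i}\n")
```

A batch that fails partway is rerun in full, so this assumes importing the same rows twice is harmless for your data.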

> Btw, the job creates close to 150k mapper tasks, that's a problem waiting to
> happen :-)
>

Are you running 0.92?  If not, you should be, and go for bigger regions.  10G?

St.Ack
