Re: Bulk Load question.

Vivek Krishna Mon, 21 Mar 2011 14:34:34 -0700

Thanks Harsh..

Viv




On Sat, Mar 19, 2011 at 11:52 AM, Harsh J <[email protected]> wrote:

> Have you tried out the mix of importtsv + completebulkload? Would that
> work for you?
>
> On Sat, Mar 19, 2011 at 9:18 PM, Vivek Krishna <[email protected]>
> wrote:
> > I have around 20 GB of data to be dumped into an HBase table.
> >
> > Initially, I had a simple Java program that put the values in batches
> > of 5000-10000 records.  I tried concurrent inserts, and each insert took
> > about 15 seconds to write, which is very slow and was taking ages.
> >
> > The next approach was to use importtsv.  This started off with a set of
> > maps, and after a few minutes I started getting RetriesException; the
> > job errored out a while later.
> >
> > From these experiments, I noticed that the master node was handling all
> > the traffic.  I understand that initially it dumps data in one node and
> > then splits across multiple nodes as data comes in.  Is there a way to
> > split this across regions in the beginning?
> >
> > Or any other thoughts on how to handle inserts of large amounts of data?
> > Viv
> >
>
>
>
> --
> Harsh J
> http://harshj.com
>
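
On the quoted question about splitting across regions from the beginning:
one option is to create the table pre-split, so that writes are spread over
several regions from the first insert. Below is a minimal sketch of
computing the region boundaries. It assumes row keys start with an evenly
distributed two-hex-digit prefix (00..ff), which is an assumption and not
something stated in this thread; the class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: computes boundary keys for a pre-split table.
// Assumes row keys begin with a uniformly distributed two-hex-digit prefix.
final class SplitKeys {

    // Returns (regions - 1) boundary keys that divide the prefix space
    // 00..ff into `regions` roughly equal ranges.
    static List<String> hexSplits(int regions) {
        if (regions < 2 || regions > 256) {
            throw new IllegalArgumentException("regions must be in 2..256");
        }
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < regions; i++) {
            int boundary = i * 256 / regions; // integer division, strictly increasing
            splits.add(String.format("%02x", boundary));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four regions need three boundaries.
        System.out.println(hexSplits(4)); // prints [40, 80, c0]
    }
}
```

Each boundary, converted to bytes, could then be passed as the split keys to
HBaseAdmin.createTable(desc, splitKeys) when the table is created. If the
real row keys are not evenly distributed over such a prefix, the boundaries
would have to be chosen from a sample of the data instead.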
