Re: HBase - bulk loading files

Ted Yu Fri, 09 Jan 2015 14:15:32 -0800

Salted buckets seem to be a concept from other projects, such as Phoenix.
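
For context, "salting" usually means prefixing each row key with a bucket number derived from a hash of the key, so that sequential keys spread across regions. As far as I know ImportTsv has no option for this, so one possible approach is to rewrite the row-key field of the input files before running the import. The snippet below is only a rough sketch under that assumption: the function names, the bucket count of 16, the `_` separator and the choice of MD5 are all hypothetical, and this is not Phoenix's actual salting scheme.

```python
import hashlib

# Hypothetical bucket count; in practice it would match the table's pre-split regions.
NUM_BUCKETS = 16


def salt_row_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix row_key with a stable, zero-padded bucket derived from its hash."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % num_buckets
    width = len(str(num_buckets - 1))  # pad so prefixes sort consistently
    return f"{bucket:0{width}d}_{row_key}"


def salt_line(line: str, delimiter: str = "|", key_index: int = 0) -> str:
    """Rewrite one delimited input line so its row-key field carries the salt."""
    fields = line.rstrip("\n").split(delimiter)
    fields[key_index] = salt_row_key(fields[key_index])
    return delimiter.join(fields)
```

Each input line would be passed through `salt_line` before the files are handed to ImportTsv. Readers of the table then have to compute the same prefix for point lookups, and fan out across all buckets for range scans.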

Can you be a bit more specific about your requirement?


Cheers

On Fri, Jan 9, 2015 at 12:53 PM, Rama Ramani <[email protected]> wrote:

> Is there a way to specify Salted buckets with HBase ImportTsv while doing
> bulk load?
>
> Thanks
> Rama
>
> From: [email protected]
> To: [email protected]
> Subject: RE: HBase - bulk loading files
> Date: Fri, 19 Dec 2014 14:09:09 -0800
>
>
> 0.98.0.2.1.9.0-2196-hadoop2
> Hadoop 2.4.0.2.1.9.0-2196
> Subversion [email protected]:hortonworks/hadoop-monarch.git -r cb50542bc92fb77dee52
>
> No, the clusters were not taking additional load.
>
> Thanks
> Rama
>
> > Date: Fri, 19 Dec 2014 13:50:30 -0800
> > Subject: Re: HBase - bulk loading files
> > From: [email protected]
> > To: [email protected]
> >
> > Can you let us know the HBase and Hadoop versions you're using?
> >
> > Were the clusters taking load from other sources when ImportTsv was
> > running?
> >
> > Cheers
> >
> > On Fri, Dec 19, 2014 at 1:43 PM, Rama Ramani <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > I am bulk loading a set of files (about 400MB each) with "|" as the
> > > delimiter using ImportTsv. It takes a long time for the 'map' job to
> > > complete on both a 4 node and a 16 node cluster. I tried the option to
> > > generate the output (providing -Dimporttsv.bulk.output), which took
> > > time, indicating that the generation of the output files needs
> > > improvement.
> > >
> > > I am seeing about 8000 rows / sec for this dataset; the 400MB ingestion
> > > takes about 5-6 mins. How can I improve this? Is there an alternate tool
> > > I can use?
> > >
> > > Thanks
> > > Rama
>
>
>
