HBase: 0.98.0.2.1.9.0-2196-hadoop2
Hadoop: 2.4.0.2.1.9.0-2196
Subversion: [email protected]:hortonworks/hadoop-monarch.git -r cb50542bc92fb77dee52
No, the clusters were not taking additional load.
Thanks
Rama
> Date: Fri, 19 Dec 2014 13:50:30 -0800
> Subject: Re: HBase - bulk loading files
> From: [email protected]
> To: [email protected]
>
> Can you let us know the HBase and hadoop versions you're using ?
>
> Were the clusters taking load from other sources when ImportTsv was running
> ?
>
> Cheers
>
> On Fri, Dec 19, 2014 at 1:43 PM, Rama Ramani <[email protected]> wrote:
>
> > Hello, I am bulk loading a set of files (about 400MB each) with
> > "|" as the delimiter using ImportTsv. It takes a long time for the 'map'
> > job to complete on both a 4-node and a 16-node cluster. I tried the option
> > to generate the output files (providing -Dimporttsv.bulk.output), which
> > also took a long time, indicating that the HFile generation itself needs
> > improvement. I am seeing about 8000 rows/sec for this dataset; the 400MB
> > ingestion takes about 5-6 mins. How can I improve this? Is there an
> > alternate tool I can use?
> > Thanks
> > Rama
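For context, the two-step bulk-load path being discussed looks roughly like the sketch below. The table name, column mapping, and HDFS paths are illustrative placeholders, not values from this thread:

```shell
# Step 1: parse the pipe-delimited input and write HFiles (map-only job).
# -Dimporttsv.separator takes the literal delimiter character.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.separator='|' \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  -Dimporttsv.bulk.output=hdfs:///tmp/bulk_out \
  mytable hdfs:///data/input

# Step 2: move the generated HFiles into the table's regions.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  hdfs:///tmp/bulk_out mytable
```

One common cause of slow HFile generation is a table with too few regions: the bulk-output job partitions its output by region, so with a single region everything funnels through one reducer. Pre-splitting the table before running ImportTsv usually helps.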