Thanks, both of you! Harsh must be right. The source of the HBase version that I use already seems to include the changes from MAPREDUCE-1853 (https://issues.apache.org/jira/browse/MAPREDUCE-1853).
> From: ha...@cloudera.com
> Date: Fri, 2 Sep 2011 23:03:09 +0530
> Subject: Re: Problem when using MultipleOutputs with many files
> To: mapreduce-user@hadoop.apache.org
>
> Hello David,
>
> MAPREDUCE-1853 was back-ported already into CDH3u0 [1]. That shouldn't
> be the cause of Panagiotis's performance breaker, hence.
>
> P.s. Please do not upgrade to 0.21.x series in production, as it is
> not deemed stable yet. This is noted on the Apache Hadoop website as
> well.
>
> [1] - http://archive.cloudera.com/cdh/3/hadoop-0.20.2+923.21.releasenotes.html
>
> On Fri, Sep 2, 2011 at 8:39 PM, David Rosenstrauch <dar...@darose.net> wrote:
> > On 09/02/2011 09:14 AM, Panagiotis Antonopoulos wrote:
> >>
> >> Hello guys,
> >>
> >> I am using hadoop-0.20.2-cdh3u0 and I use MultipleOutputs to divide the
> >> HFiles (which are the output of my MR job) so that each file can fit into
> >> one region of the table where I am going to bulk load them.
> >>
> >> Therefore I have one MultipleOutput per region and as a result I had 280
> >> different outputs.
> >> I just realized that using so many outputs makes my job a lot slower than
> >> it is when I have just one output.
> >>
> >> Do you know what goes wrong? Has anyone noticed the same?
> >>
> >> Thank you!
> >> Panagiotis
> >
> >
> > You're probably running into this bug, which crushes the performance of
> > MultipleOutputs:
> >
> > https://issues.apache.org/jira/browse/MAPREDUCE-1853
> >
> > Apparently it's fixed in v0.21, so try to upgrade if you can.
> >
> > I wasn't able to in our code however (we were also using Cloudera CDH, which
> > as you see is 0.20). What I eventually wound up doing to work around it was
> > to use our own local copy of the MultipleOutputs class (I called it
> > BugFixMultipleOutputs_0_20) which I manually patched with the fix.
> >
> > HTH,
> >
> > DR
> >
> >
>
> --
> Harsh J