I'd just advise using MultipleOutputFormat instead of MultipleOutputs.write.

On Jun 29, 2013 9:16 PM, "David Poisson" <[email protected]> wrote:
> Just thought I'd provide some insight into our problem.
>
> It appears that the problem was a slowdown caused by the use of
> multipleOutputs.write(output, key, keyValue, path) (going from memory
> here). Anyway, after looking at the implementation of that write function
> in MultipleOutputs.java, it appears that a context was created, a conf
> was fetched, and a new RecordWriter was obtained for every call to
> write(output, key, keyValue, path).
>
> We have changed all of those calls to write(output, key, keyValue) (which
> doesn't do any of that extra work) and it seems to help.
>
> Does anyone else have any tips for using MultipleOutputs?
>
> We are taking our input and splitting it into 3 files, so MultipleOutputs
> seems to be a natural choice. Performance is a bit slow though.
>
> Cheers!
>
> David
> ________________________________________
> From: David Poisson [[email protected]]
> Sent: Thursday, June 27, 2013 4:22 PM
> To: [email protected]
> Subject: Profiling map reduce jobs?
>
> Howdy,
> I want to take a look at an MR job which seems to be slower than I had
> hoped. Mind you, this MR job is only running on a pseudo-distributed VM
> (Cloudera CDH4).
>
> I have modified my mapred-site.xml with the following (the last one is
> commented out because it crashes my MR job):
>
> <property>
>   <name>mapred.task.profile</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapred.task.profile.maps</name>
>   <value>0-2</value>
> </property>
> <property>
>   <name>mapred.task.profile.reduces</name>
>   <value>0-2</value>
> </property>
> <!--property>
>   <name>mapred.task.profile.params</name>
>   <value>agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s</value>
> </property-->
>
> Are there any resources that explain how to interpret the results?
> Or maybe an open-source app that could help display the results in a more
> intuitive manner?
>
> Ideally, we'd want to know where we are spending most of our time.
>
> Cheers,
>
> David
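
To illustrate why obtaining a writer on every write call could be costly, here is a minimal sketch in plain Java. It does not use Hadoop and does not reproduce the actual MultipleOutputs implementation; FakeWriter, writePerCall, and writeCached are hypothetical names made up for this example. It only counts how many writers get constructed when records are split across 3 output paths, once with a new writer per call and once with one writer cached per path.

```java
import java.util.HashMap;
import java.util.Map;

public class WriterCacheSketch {
    /** Hypothetical stand-in for a record writer that is expensive to create. */
    static final class FakeWriter {
        static int created = 0;                 // number of writers constructed so far
        final String path;
        final StringBuilder sink = new StringBuilder();

        FakeWriter(String path) {
            this.path = path;
            created++;
        }

        void write(String key, String value) {
            sink.append(key).append('\t').append(value).append('\n');
        }
    }

    /** Creates a fresh writer for every record; returns how many were created. */
    static int writePerCall(String[] paths, int records) {
        FakeWriter.created = 0;
        for (int i = 0; i < records; i++) {
            String path = paths[i % paths.length];
            new FakeWriter(path).write("k" + i, "v" + i);
        }
        return FakeWriter.created;
    }

    /** Reuses one writer per output path; returns how many were created. */
    static int writeCached(String[] paths, int records) {
        FakeWriter.created = 0;
        Map<String, FakeWriter> cache = new HashMap<>();
        for (int i = 0; i < records; i++) {
            String path = paths[i % paths.length];
            cache.computeIfAbsent(path, FakeWriter::new).write("k" + i, "v" + i);
        }
        return FakeWriter.created;
    }

    public static void main(String[] args) {
        String[] paths = {"out-a", "out-b", "out-c"};
        System.out.println("per-call writers: " + writePerCall(paths, 9000));
        System.out.println("cached writers:   " + writeCached(paths, 9000));
    }
}
```

With 9000 records and 3 paths, the first variant constructs 9000 writers and the second constructs 3. Whether a given Hadoop version behaves like the first variant is something to check against its MultipleOutputs source rather than assume from this sketch.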

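On the commented-out mapred.task.profile.params property, one possible cause of the crash, offered as a guess rather than a diagnosis: the value is handed to the task JVM as a command-line option, and JVM options start with a dash. The quoted value begins with `agentlib:` and has no leading dash, so the JVM may not recognize it as an option. A variant with only the dash added, everything else unchanged from the quoted config, would look like this:

```xml
<property>
  <name>mapred.task.profile.params</name>
  <!-- Leading "-" added (assumption: this is what made the original crash).
       Hadoop substitutes %s with the profile output file for each task. -->
  <value>-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s</value>
</property>
```

This has not been tried on the CDH4 VM from the thread, so treat it as something to test on a single task first.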