Could you pastebin the TeraSort configuration XML file? I have run TeraSort over 1000 times and have never seen this problem.
How did you generate the data for TeraSort? Using teragen or some other method?

Raj
Stoser Analytics
www.stoser.com

>________________________________
>From: W.P. McNeill <[email protected]>
>To: Hadoop Mailing List <[email protected]>
>Cc: Vitor Carvalho <[email protected]>; [email protected]
>Sent: Friday, November 4, 2011 11:55 AM
>Subject: Terrasort sends everything to a single reducer–Don't Apologize, David Salle
>
>I'm trying to run a TeraSort job to confirm that my cluster is set up
>correctly. The mappers perform fine, but in the reduce stage all the data
>is sent to a single node. My mapred.reduce.tasks parameter is set to an
>appropriate value greater than 1. I am launching multiple reducers, but
>only one of them is receiving input.
>
>It looks like the TeraSort partition function is buggy, but there's no way
>that it would have a bug this obvious. I've looked for configuration errors
>on my part and found none. So now I'm asking if anyone else has seen this
>problem and can explain it.
>
>In the archives from February 27 of this year David Salle's post
>"TeraSort bug?"
><http://grokbase.com/t/hadoop.apache.org/common-user/2011/02/terasort-bug/27pzea46iowbfkbd4l5y566i4iv4>
>describes what appears to be the same problem, but the only response I see
>is from David Salle the next day, apologizing and saying to ignore his
>previous post. Presumably he found some mistake on his end that he thought
>was trivial, but it doesn't look trivial to me.
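
To show why I am asking about how the data was generated, here is a minimal sketch of how a range partitioner driven by split points can route every record to one reducer. This is plain Python, not TeraSort's actual code, and I am not claiming it is the cause of the problem above. TeraSort samples the input to pick its split points; the function names, the split values, and the keys below are all made up for illustration.

```python
from bisect import bisect_right
from collections import Counter


def range_partition(key, split_points):
    """Return the reducer index for key, given sorted split points.

    With N split points there are N + 1 reducers; keys below the first
    split point go to reducer 0, and so on.
    """
    return bisect_right(split_points, key)


def reducer_histogram(keys, split_points):
    """Count how many keys each reducer would receive."""
    counts = Counter(range_partition(k, split_points) for k in keys)
    return [counts.get(i, 0) for i in range(len(split_points) + 1)]


# Hypothetical split points for 4 reducers, as if sampled from keys
# spread across the alphabet.
splits = ["g", "n", "t"]

spread_keys = ["apple", "hat", "pear", "zebra"]
skewed_keys = ["0001", "0002", "0003", "0004"]  # all sort before "g"

print(reducer_histogram(spread_keys, splits))  # [1, 1, 1, 1]
print(reducer_histogram(skewed_keys, splits))  # [4, 0, 0, 0]
```

If the split points do not reflect the real key distribution (or are missing or degenerate), every reducer still launches but only one receives input, which is the symptom described above. Comparing the keys in the input against the split points the job actually used would confirm or rule this out.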
