I don't have any particular experience with this, but perhaps X-Trace [1] can help. The presentation given at the Hadoop Summit was very impressive, and it looks like a great debugging tool. There are hooks already in Hadoop, so I think it's just a matter of enabling them, collecting the data, and generating the graphs, at which point the cause will hopefully become clear.
n

[1] http://www.x-trace.net/

On Mon, Mar 31, 2008 at 12:07 PM, Colin Freas <[EMAIL PROTECTED]> wrote:
> I've set up a job to run on my small 4 (sometimes 5) node cluster on dual
> processor server boxes with 2-8GB of memory.
>
> My job processes 24 100-300MB files that are a day's worth of logs, total
> data is about 6GB.
>
> I've modified the word count example to do what I need, and it works fine on
> small test files.
>
> I've set the number of map tasks at 200, the number of reduce tasks to 14.
> Things seem to go along fine, the map % climbs nicely, along with the
> reduce. Once the map hits 100% though, the reduce % stops increasing.
> Right now it's stuck around 58%. I was hoping changing the number of reduce
> tasks would help, but I'm not really sure it did. I had tried this once
> before with the default number of reduce jobs, and I got to 100% (Map) and
> 14% (Reduce) before I saw this hanging behavior.
>
> I'm just trying to understand what's happening here, and if there's
> something I can do to increase the performance, short of adding nodes. Is
> it likely I've set something up incorrectly somewhere?
>
> Any help appreciated.
>
> Thanks!
>
> -Colin
>
