Dear Wiki user, You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by TedDunning:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

The comment on the change is:
added help in setting config parameters.

------------------------------------------------------------------------------
   1. Start by getting everything running (likely on a small input) in the local runner. You do
      this by setting your job tracker to "local" in your config. The local runner can run
-     under the debugger and runs on your development machine.
+     under the debugger and runs on your development machine. A very quick and easy way to set this
+     config variable is to include the following line just before you run the job:
+
+     {{{conf.set("mapred.job.tracker", "local");}}}
+
+     You may also want to do this to make the input and output files be in the local file system
+     rather than in the Hadoop distributed file system (HDFS):
+
+     {{{conf.set("fs.default.name", "local");}}}
+
+     You can also set these configuration parameters in {{{hadoop-site.xml}}}. The configuration files
+     {{{hadoop-default.xml}}}, {{{mapred-default.xml}}} and {{{hadoop-site.xml}}} should appear somewhere
+     in your program's class path when the program runs.
+
   2. Run the small input on a 1 node cluster. This will smoke out all of the issues that happen
      with distribution and the "real" task runner, but you only have a single place to look at logs. Most
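
For context, here is a minimal driver sketch that puts those two conf.set() calls together, assuming the old org.apache.hadoop.mapred API of that era. The class name LocalDebugDriver and the paths "small-input"/"debug-output" are placeholders, and the setInputPath/setOutputPath calls were moved to FileInputFormat/FileOutputFormat in later Hadoop releases, so adjust for your version:

{{{
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class LocalDebugDriver {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(LocalDebugDriver.class);
    conf.setJobName("local-debug-run");

    // Run the map and reduce tasks in-process with the local runner so the
    // whole job can sit under your IDE's debugger.
    conf.set("mapred.job.tracker", "local");

    // Read input and write output on the local file system instead of HDFS.
    conf.set("fs.default.name", "local");

    // Placeholder paths on the local file system; with no mapper or reducer
    // classes set, the job runs the default identity map and reduce.
    conf.setInputPath(new Path("small-input"));
    conf.setOutputPath(new Path("debug-output"));

    JobClient.runJob(conf);
  }
}
}}}

Run this from your IDE with a breakpoint in your mapper or reducer; because everything executes in a single JVM, the breakpoint is hit just as it would be in any other local program.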