Dear Wiki user, You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by TedDunning:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

The comment on the change is:
added help in setting config parameters.

------------------------------------------------------------------------------
   1. Start by getting everything running (likely on a small input) in the local runner. You do
      this by setting your job tracker to "local" in your config. The local runner can run
-     under the debugger and runs on your development machine.
+     under the debugger and runs on your development machine. A very quick and easy way to set this
+     config variable is to include the following line just before you run the job:
+
+     {{{conf.set("mapred.job.tracker", "local");}}}
+
+     You may also want to do this to make the input and output files be in the local file system
+     rather than in the Hadoop distributed file system (HDFS):
+
+     {{{conf.set("fs.default.name", "local");}}}
+
+     You can also set these configuration parameters in {{{hadoop-site.xml}}}. The configuration files
+     {{{hadoop-default.xml}}}, {{{mapred-default.xml}}} and {{{hadoop-site.xml}}} should appear somewhere
+     in your program's class path when the program runs.
+
   2. Run the small input on a 1 node cluster. This will smoke out all of the issues that happen
      with distribution and the "real" task runner, but you only have a single place to look at logs. Most
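
For context, here is a minimal driver sketch that puts those two conf.set() calls together, assuming the old org.apache.hadoop.mapred API of that era. The class name LocalDebugDriver and the paths "small-input"/"debug-output" are placeholders, and the setInputPath/setOutputPath calls were moved to FileInputFormat/FileOutputFormat in later Hadoop releases, so adjust for your version:

{{{
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class LocalDebugDriver {
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(LocalDebugDriver.class);
    conf.setJobName("local-debug-run");

    // Run the map and reduce tasks in-process with the local runner so the
    // whole job can sit under your IDE's debugger.
    conf.set("mapred.job.tracker", "local");

    // Read input and write output on the local file system instead of HDFS.
    conf.set("fs.default.name", "local");

    // Placeholder paths on the local file system; with no mapper or reducer
    // classes set, the job runs the default identity map and reduce.
    conf.setInputPath(new Path("small-input"));
    conf.setOutputPath(new Path("debug-output"));

    JobClient.runJob(conf);
  }
}
}}}

Run this from your IDE with a breakpoint in your mapper or reducer; because everything executes in a single JVM, the breakpoint is hit just as it would be in any other local program.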