Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by Amareshwari:
http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms

------------------------------------------------------------------------------
  
  This can be extremely useful to display debug information about the current 
record being handled, or to set certain debug flags about the status of the 
mapper. While running locally on a small data set will surface many bugs, large 
data sets may contain pathological cases that are otherwise unexpected. This 
method of debugging can help catch those cases.
  
+ == Run a debug script when Task fails ==
+ 
+ A facility is provided, via user-provided scripts, for post-processing task 
logs and the task's stdout, stderr and core file. There is a default script 
which processes core dumps under gdb and prints the stack trace. The last five 
lines of the debug script's stdout and stderr are printed on the task 
diagnostics. These outputs are displayed on the job UI on demand. 
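+ As a rough illustration, a user-provided debug script along these lines could 
print the tail of the task's output and, like the default script, run the core 
dump under gdb. It is written as a shell function here only so it is easy to 
try out; in practice it would be a standalone script. The function name and the 
argument order (stdout, stderr, core) are assumptions for this sketch.

```shell
# Hypothetical sketch of a user-provided debug script, written as a
# shell function for illustration. Arguments are assumed to mirror the
# @stdout@, @stderr@ and @core@ placeholders described above.
debug_task() {
  task_stdout=$1
  task_stderr=$2
  core_file=$3

  echo "=== last 5 lines of task stdout ==="
  tail -n 5 "$task_stdout"
  echo "=== last 5 lines of task stderr ==="
  tail -n 5 "$task_stderr"

  # Process the core dump under gdb (as the default script does),
  # but only when a non-empty core file exists and gdb is installed.
  if [ -s "$core_file" ] && command -v gdb >/dev/null 2>&1; then
    gdb -batch -ex bt -c "$core_file" 2>/dev/null
  fi
}
```

The last five lines this script writes to its own stdout and stderr are what 
end up on the task diagnostics.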
+ 
+ == How to submit debug command ==
+ 
+ A very quick and easy way to set a debug command is to set the properties 
mapred.map.task.debug.command and mapred.reduce.task.debug.command for 
debugging the map task and reduce task respectively.
+ These properties can also be set through the APIs conf.setMapDebugCommand(String cmd) 
and conf.setReduceDebugCommand(String cmd).
+ The command can consist of @stdout@, @stderr@ and @core@ to access the task's 
stdout, stderr and core files respectively.
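+ In the job configuration, setting the map-side property might look like the 
following fragment; the script name my-debug-script.sh is a hypothetical 
placeholder, and the @stdout@, @stderr@ and @core@ tokens are substituted with 
the actual file paths at run time:

```xml
<property>
  <name>mapred.map.task.debug.command</name>
  <value>my-debug-script.sh @stdout@ @stderr@ @core@</value>
</property>
```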
+ 
+ == How to submit debug script ==
+ 
+ 
  = How to debug Hadoop Pipes programs =
  
  In order to debug Pipes programs you need to keep the downloaded commands. 
