Hi,

I really need to debug the threads that the ReduceTask launches, and I don't want to do it through unit tests. The reason is that I want to see what is happening inside the ReduceTask so that I can make some changes to the code for myself. So I tried to debug the ReduceTask by setting the following in mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

But I can't start mapred, it gives me the error:

2010-05-01 17:35:35,155 FATAL org.apache.hadoop.mapred.JobTracker:3720 java.lang.RuntimeException: Not a host:port pair: local
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:136)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:123)
        at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:1794)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1581)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:179)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:171)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3717)

I haven't set the fs.default.name parameter, because I will use HDFS and not the local filesystem.

So, how can I solve this problem?

Thanks,
PSC

On Wed, Apr 28, 2010 at 4:51 AM, Eric Sammer <esam...@cloudera.com> wrote:
> If you want to step through a full map / reduce job, the easiest way
> to do this is to run a job using the local job runner in your IDE. The
> local job runner will run the MR job in a single thread making it very
> easy to debug. You will want to use the local file system and a small
> amount of data during this type of testing / debugging. Note that the
> local job runner runs map tasks, sort and shuffle, and reducers
> sequentially with no parallelism.
>
> Set the following properties to enable the local job runner and local
> file system:
>
> mapred.job.tracker = local
> fs.default.name = file:///
>
> Attempting to attach a debugger to a real task tracker is problematic
> because user code is run in separate jvms, etc. It's almost never
> worth it. Most debugging (with a real debugger) is better done using
> MRUnit and the local job runner.
>
> Hope this helps and good luck.
>
> On Tue, Apr 27, 2010 at 7:27 AM, psdc1978 <psdc1...@gmail.com> wrote:
> > Hi,
> >
> > The reduce tasks are threads that are launched by the Reducer. The print
> > below shows the stacktrace of one reduce task.
> >
> >         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.fetchHashesOutputs(ReduceTask.java:2582)
> >         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:194)
> >
> > I would like to debug this thread in a IDE but I don't know how to do it.
> > Should I define properties to do this? Is there a way to do it?
> >
> > Thanks
> >
> > --
> > PSC
>
> --
> Eric Sammer
> phone: +1-917-287-2675
> twitter: esammer
> data: www.cloudera.com

--
Pedro
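
For reference, here is a minimal sketch of the two properties Eric lists, written out as a configuration file. It assumes the file is read by the job client launched from the IDE and that no JobTracker daemon is started at all. That fits the stack trace above, where JobTracker.getAddress fails while parsing "local" as a host:port pair. The enclosing <configuration> element and the placement of both properties in one file are assumptions made for illustration.

```xml
<!-- Sketch only: client-side settings for the local job runner.
     Assumption: this file is on the classpath of the job started from the IDE,
     and the JobTracker/TaskTracker daemons are not running. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```

If the input has to stay on HDFS, it may be possible to leave fs.default.name pointing at the namenode while keeping mapred.job.tracker set to local, but the thread does not confirm that combination, so treat it as an assumption to verify.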