Re: How to debug reducer thread?

psdc1978 Sat, 01 May 2010 10:48:55 -0700

I have another idea, but I don't know how to implement it. Is it possible to
pass the -Xdebug parameter to the JVM that MapReduce launches for the
ReduceTask? If so, I could attach the debugger to that thread, right?
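
A minimal sketch of that idea, assuming the child task JVMs take their
command-line options from the mapred.child.java.opts property and that the
cluster is a single node running one task at a time. The class name, the
helper names, and port 8000 are made up for illustration. The -agentlib:jdwp
option is the newer spelling of -Xdebug -Xrunjdwp. The code only builds the
property values; it does not call Hadoop itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the property values one might put in
// mapred-site.xml or on the job configuration. Not part of Hadoop.
class ChildDebugOpts {

    // Builds a JDWP agent option. With suspend=y the JVM waits at startup
    // until a debugger attaches to the given port.
    static String jdwpOption(int port, boolean suspend) {
        return "-agentlib:jdwp=transport=dt_socket,server=y,suspend="
                + (suspend ? "y" : "n") + ",address=" + port;
    }

    // Assumption: mapred.child.java.opts applies to every child JVM (map
    // and reduce alike), so a fixed port only works when a single task
    // runs at a time.
    static Map<String, String> debugProperties(int port) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.child.java.opts", "-Xmx200m " + jdwpOption(port, true));
        props.put("mapred.reduce.tasks", "1");
        return props;
    }

    public static void main(String[] args) {
        // Print the values in the "name = value" form used in this thread.
        debugProperties(8000).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

If this works as assumed, the IDE would attach as a remote debugger to
localhost:8000 once the task JVM starts. A suspended task may still be killed
by the task timeout, so mapred.task.timeout might need to be raised as well.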


On Sat, May 1, 2010 at 4:43 PM, psdc1978 <psdc1...@gmail.com> wrote:

> Hi,
>
> I really need to debug the threads that the ReduceTask launches, and unit
> tests are not enough for that. I want to see what happens inside the
> ReduceTask so that I can make my own changes to the code. So I tried to
> debug the ReduceTask by setting the following in mapred-site.xml:
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>local</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
>
> But then I can't start MapReduce; the JobTracker fails with this error:
> 2010-05-01 17:35:35,155 FATAL org.apache.hadoop.mapred.JobTracker:3720
> java.lang.RuntimeException: Not a host:port pair: local
>         at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:136)
>         at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:123)
>         at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:1794)
>         at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1581)
>         at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:179)
>         at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:171)
>         at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3717)
>
>
> I haven't set the fs.default.name parameter, because I want to use HDFS
> rather than the local filesystem.
>
> So, how can I solve this problem?
>
> Thanks,
> PSC
>
>
> On Wed, Apr 28, 2010 at 4:51 AM, Eric Sammer <esam...@cloudera.com> wrote:
>
>> If you want to step through a full map / reduce job, the easiest way
>> to do this is to run a job using the local job runner in your IDE. The
>> local job runner will run the MR job in a single thread making it very
>> easy to debug. You will want to use the local file system and a small
>> amount of data during this type of testing / debugging. Note that the
>> local job runner runs map tasks, sort and shuffle, and reducers
>> sequentially with no parallelism.
>>
>> Set the following properties to enable the local job runner and local
>> file system:
>>
>> mapred.job.tracker = local
>> fs.default.name = file:///
>>
>> Attempting to attach a debugger to a real task tracker is problematic
>> because user code runs in separate JVMs, among other things. It's almost
>> never worth it. Most debugging (with a real debugger) is better done using
>> MRUnit and the local job runner.
>>
>> Hope this helps and good luck.
>>
>> On Tue, Apr 27, 2010 at 7:27 AM, psdc1978 <psdc1...@gmail.com> wrote:
>> > Hi,
>> >
>> > The reduce tasks are threads that are launched by the Reducer. The
>> > output below shows the stack trace of one reduce task.
>> >
>> > at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.fetchHashesOutputs(ReduceTask.java:2582)
>> > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
>> > at org.apache.hadoop.mapred.Child.main(Child.java:194)
>> >
>> > I would like to debug this thread in an IDE, but I don't know how to do
>> > it. Should I define properties to do this? Is there a way to do it?
>> >
>> > Thanks
>> >
>> > --
>> > PSC
>> >
>>
>>
>>
>> --
>> Eric Sammer
>> phone: +1-917-287-2675
>> twitter: esammer
>> data: www.cloudera.com
>>
>
>
>
> --
> Pedro
>



-- 
Pedro
