Re: hi Kubes:the question about develop environment!

askNutch Wed, 22 Apr 2009 23:39:51 -0700

hi kubes:
thank you for your answers!
i'm sorry that i didn't express my question clearly.
i run nutch on only one machine, and i can't debug hadoop in nutch because
hadoop is there only as a lib.
how can i debug the hadoop source in nutch?


and to my surprise, the tutorial "RunNutchInEclipse1.0" doesn't start or
configure hadoop, including the master listen port etc.
when i debug nutch with a breakpoint, it displays: "there is no source file
attached to the class file URLClassPath.class!" why?

can hadoop run in a vmware machine?

and i also met other problems; they are in another message, "run nutch on
eclipse problem?"

thanks !!!

Dennis Kubes-2 wrote:
> 
> 
> 
> askNutch wrote:
>> hi Kubes: 
>>         You are the expert!
>>         
>>         Can you tell me what development environment you use to
>> develop nutch?
> 
> Linux, Ubuntu (usually the most recent), sun jdk, core2 laptop (although 
> hoping to upgrade to a sagernotebook.com quad core soon :) ), Eclipse 
> stable (3.4 I think).
>>         
>>         such as IDE etc.
>>     
>>         I want to debug nutch.
> 
> Debugging MapReduce, hence Nutch, jobs is difficult.  The main reason 
> is that Hadoop/Nutch spins up a new JVM for each Map and Reduce 
> task, so it is difficult to connect to that JVM as it is created and 
> launched automagically.  Here are some options depending on what you are 
> trying to debug:
> 
> 1) Run all hadoop server processes (namenode, etc.) through eclipse 
> using the internal debugger.  This isn't always the best way; it is 
> usually only used when debugging some part of the hadoop infrastructure 
> such as socket communication.
> 
> 2) Run most of the hadoop servers in separate processes, run the 
> tasktracker inside of eclipse with the internal debugger.  This is 
> mainly used when debugging a specific MapRunner, MapTask, or ReduceTask 
> interacting with Hadoop.  You won't be able to debug the Map or Reduce 
> task itself, just the communication with the Hadoop server, for instance 
>   reporting status.
> 
> 3) Debugging the Map/Reduce task itself.  Logging.  Judicious logging is 
> most often what I use.  Also use a very small example if you can, to 
> give yourself short turnaround times.  Unless your problem is occurring 
> only on a large dataset, don't debug on a large dataset.
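
To illustrate option 3, here is a minimal sketch of "judicious logging" on a very small input. It deliberately does not use the Hadoop API: the class LoggedMapSketch and its map method are hypothetical stand-ins for a real map task, and java.util.logging is assumed only because it needs no extra jars.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for a map task; NOT the Hadoop Mapper interface. */
public class LoggedMapSketch {
    private static final Logger LOG =
            Logger.getLogger(LoggedMapSketch.class.getName());

    /** Splits one input line into lowercase tokens and logs what it saw. */
    public static List<String> map(long offset, String line) {
        List<String> tokens = new ArrayList<>();
        if (line == null || line.trim().isEmpty()) {
            // Unusual input is logged loudly so it stands out in the log.
            LOG.warning("skipping empty record at offset " + offset);
            return tokens;
        }
        for (String token : line.trim().split("\\s+")) {
            tokens.add(token.toLowerCase(Locale.ROOT));
        }
        // Per-record detail is logged at FINE so it can be switched off.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("offset=" + offset + " tokens=" + tokens);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A tiny dataset keeps the turnaround time short.
        String[] sample = {"Nutch crawls pages", "", "Hadoop runs jobs"};
        long offset = 0;
        for (String line : sample) {
            System.out.println(offset + " -> " + map(offset, line));
            offset += line.length() + 1; // +1 for the assumed newline
        }
    }
}
```

Running main directly exercises the logic in a single JVM where a breakpoint also works; in a real job the same messages would end up in the task logs rather than on your console.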
> 
> Hope this helps.
> 
> Dennis
> 
>>        
>>         thank you !!!  
> 
> 

-- 
View this message in context: 
http://www.nabble.com/hi-%3Athe-question-about-develop-environment%21-tp23170026p23191120.html
Sent from the Nutch - User mailing list archive at Nabble.com.
