Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by ra:
http://wiki.apache.org/nutch/RunNutchInEclipse0%2e9

------------------------------------------------------------------------------
   * change the property "plugin.folders" to "./src/plugin" in 
$NUTCH_HOME/conf/nutch-default.xml
   * make sure Nutch is configured correctly before testing it in Eclipse ;-)
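  
  As a sketch, the changed property could look like this, assuming the usual 
property layout of the Nutch configuration files (the value is the one named 
above):
  {{{
  <property>
    <name>plugin.folders</name>
    <value>./src/plugin</value>
  </property>
  }}}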
  
+ === missing org.farng and com.etranslate ===
+ You will encounter problems with some imports in the parse-mp3 and parse-rtf 
plugins (30 errors in my case).
+ The required libraries were left out of the sources because their licenses 
are incompatible with the Apache license. 
+ You can download them here:
+ 
+ http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/
+ 
+ http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/
+ 
+ Copy the jar files into src/plugin/parse-mp3/lib and 
src/plugin/parse-rtf/lib/ respectively.
+ Then add them to the build path (first refresh the workspace, then 
right-click on the source
+ folder => Java Build Path => Libraries => Add Jars).
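+ 
+ After adding the jars, the project's .classpath file should contain one 
"lib" entry per jar. A rough sketch, in which the jar file names are 
placeholders and not the real names (use the names of the files you actually 
downloaded):
+ {{{
+ <!-- EXAMPLE-*.jar are hypothetical names, for illustration only -->
+ <classpathentry kind="lib" path="src/plugin/parse-mp3/lib/EXAMPLE-mp3.jar"/>
+ <classpathentry kind="lib" path="src/plugin/parse-rtf/lib/EXAMPLE-rtf.jar"/>
+ }}}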
+ 
+ 
  === Build Nutch ===
   * If you set up the project correctly, Eclipse will build Nutch for you 
into "tmp_build".
  
  
- -----------------------> okay up to here... going to do rest tomorrow...
  
  
  
@@ -62, +75 @@

   * click on "Run"
   * if all works, you should see Nutch getting busy at crawling :-)
  
- == Debug Nutch in Eclipse ==
+ == Debug Nutch in Eclipse (not yet tested for 0.9) ==
   * Set breakpoints and debug a crawl
   * It can be tricky to find out where to set the breakpoint, because of the 
Hadoop jobs. Here are a few good places to set breakpoints:
  {{{
@@ -78, +91 @@

  Yes, Nutch and Eclipse can be a difficult companionship sometimes ;-)
  
  === eclipse: Cannot create project content in workspace ===
- The nutch source code must be out of the workspace folder. My first attemp 
was download the code with eclipse (svn) under my workspace. When I try to 
create the project using existing code, eclipse don't let me do it from source 
code into the workspace. I use the source code out of my workspace and it work 
fine.
+ The Nutch source code must be outside the workspace folder. My first attempt 
was to download the code with Eclipse (svn) into my workspace. When I tried to 
create the project from existing code, Eclipse would not let me do it because 
the source code was inside the workspace. With the source code outside my 
workspace it works fine.
  
  === plugin dir not found ===
- Make sure you set your plugin.folders property correct, instead of using a 
relative path you can use a absoluth one as well in nutch-defaults.xml or may 
be better in nutch-site.xml
+ Make sure you set your plugin.folders property correctly. Instead of a 
relative path you can also use an absolute one, in nutch-default.xml or, maybe 
better, in nutch-site.xml
  {{{
  <property>
    <name>plugin.folders</name>
-   <value>/home/....../nutch-0.8/src/plugin</value>
+   <value>/home/....../nutch-0.9/src/plugin</value>
  </property>
  }}}
  
  
@@ -107, +120 @@

   * open the class itself, right-click
   * refresh the build dir
  
- === missing org.farng and com.etranslate ===
- You may have problems with some imports in parse-mp3 and parse-rtf plugins. 
Because of incompatibility with apache licence they were left from sources. You 
can find it here:
- 
- http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/
- 
- http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/
- 
- You need to copy jar files into plugin "lib" path and refresh the project. 
- 
  
  === debugging hadoop classes ===
   Sometimes it makes sense to also have the Hadoop classes available during 
debugging. To do so, you can check out the Hadoop sources on your machine and 
attach them to the hadoop-xxx.jar. Alternatively, you can: 
