Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "RunningNutchAndSolr" page has been changed by EricPugh:
http://wiki.apache.org/nutch/RunningNutchAndSolr?action=diff&rev1=66&rev2=67

Comment:
Removed mention of source builds, tutorial only uses binary.

  
  Apache Nutch is an open source web crawler written in Java. With it, we can 
find web page hyperlinks automatically, reduce maintenance work such as 
checking for broken links, and create a copy of every visited page to search 
over. That’s where Apache Solr comes in. Solr is an open source full-text 
search framework; with Solr, we can search the pages that Nutch has visited. 
Luckily, integration between Nutch and Solr is straightforward, as explained 
below.
  
- Apache Nutch release 1.3 has Solr integration embedded, greatly simplifying 
Nutch-Solr integration. It also removes the legacy dependence upon both Apache 
Tomcat for running the old Nutch Web Application and upon Apache Lucene for 
indexing. Just download a 1.3 release from 
[[http://www.apache.org/dyn/closer.cgi/nutch/|here]]. NOTE: You can download 
release 1.3 in either binary or source format, both of which are covered in 
this tutorial.
+ Apache Nutch release 1.3 has Solr integration embedded, greatly simplifying 
Nutch-Solr integration. It also removes the legacy dependence upon both Apache 
Tomcat for running the old Nutch Web Application and upon Apache Lucene for 
indexing. Just download a 1.3 binary release from 
[[http://www.apache.org/dyn/closer.cgi/nutch/|here]].
  
  == Table of Contents ==
  <<TableOfContents(3)>>
   
  == Steps ==
  
- == 1a Setup Nutch from binary distribution ==
+ == 1. Set up Nutch from binary distribution ==
  
   * Unzip your binary Nutch package to $HOME/nutch-1.3
   * cd $HOME/nutch-1.3/runtime/local 
- 
- == 1b. Setup Nutch from source distribution ==
- 
-  * Unzip your source package to $HOME/nutch-1.3-src 
-  * cd $HOME/nutch-1.3-src 
-  * run “ant” command. 
-  * It should generate a directory called $HOME/nutch-1.3-src/runtime. 
-  * cd $HOME/nutch-1.3-src/runtime/local 
  
  From now on, we are going to use ${NUTCH_RUNTIME_HOME} to refer to the current 
directory.
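
As a rough sketch of the binary setup above, the snippet below exports 
NUTCH_RUNTIME_HOME so later commands can refer to it. The helper name 
nutch_runtime_dir is hypothetical (it is not part of Nutch), and the default 
path assumes the package was unzipped to $HOME/nutch-1.3 as in the steps.

```shell
# Hypothetical helper (not part of Nutch): print the runtime directory
# for a given Nutch install root, defaulting to $HOME/nutch-1.3.
nutch_runtime_dir() {
  printf '%s/runtime/local\n' "${1:-$HOME/nutch-1.3}"
}

# Export the variable this tutorial uses for the runtime directory.
NUTCH_RUNTIME_HOME="$(nutch_runtime_dir)"
export NUTCH_RUNTIME_HOME

# Change into the runtime directory only if it actually exists.
if [ -d "$NUTCH_RUNTIME_HOME" ]; then
  cd "$NUTCH_RUNTIME_HOME"
else
  echo "Expected Nutch runtime at $NUTCH_RUNTIME_HOME; unzip the binary package first" >&2
fi
```

If the package was unzipped somewhere else, passing that directory to the 
helper (for example nutch_runtime_dir /opt/nutch-1.3) yields the matching 
runtime/local path.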
  
@@ -79, +71 @@

  }}}
  If not then please read on for how to set up your Solr instance and index 
your crawl data.
  
- == 4a. Setup Solr for search from binary distribution ==
+ == 4. Set up Solr for search ==
  
   * download binary file from 
[[http://www.apache.org/dyn/closer.cgi/lucene/solr/|here]]
   * unzip to $HOME/apache-solr-3.X, which we will now refer to as 
${APACHE_SOLR_HOME}
   * cd ${APACHE_SOLR_HOME}/example
   * java -jar start.jar
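
The Solr steps above could be wrapped in a small script along these lines. 
This is only a sketch: the helper names has_start_jar and start_solr are 
hypothetical, and the default APACHE_SOLR_HOME keeps the 3.X placeholder from 
the steps, so it must be set to the real unzip directory before use.

```shell
# Sketch only: the default keeps the "3.X" placeholder from the steps above.
# Set APACHE_SOLR_HOME to the real unzip directory before calling start_solr.
APACHE_SOLR_HOME="${APACHE_SOLR_HOME:-$HOME/apache-solr-3.X}"

# Hypothetical helper: succeed only if start.jar exists under the given Solr home.
has_start_jar() {
  [ -f "$1/example/start.jar" ]
}

# Hypothetical helper: check prerequisites, then start the example Solr
# instance in the foreground from its example directory.
start_solr() {
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi
  if ! has_start_jar "$APACHE_SOLR_HOME"; then
    echo "start.jar not found under $APACHE_SOLR_HOME/example" >&2
    return 1
  fi
  (cd "$APACHE_SOLR_HOME/example" && java -jar start.jar)
}

# Usage: start_solr
```

Running the start command in a subshell leaves the caller's working directory 
unchanged, which keeps ${NUTCH_RUNTIME_HOME} usable in the same session.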
- 
- == 4b. Setup Solr for search from source distribution ==
- 
-  * You can setup Solr from source distribution with Maven. This 
[[http://thetechietutorials.blogspot.com/2011/06/how-to-build-and-start-apache-solr.html|link]]
 shows how to do that.
  
  == 5. Verify Solr installation ==
  
