The "RunningNutchAndSolr" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/RunningNutchAndSolr?action=diff&rev1=62&rev2=63

  
  From now on, we are going to use ${NUTCH_RUNTIME_HOME} to refer to the current 
directory.
  
- '''2.''' Verify your Nutch installation:
+ == 2. Verify your Nutch installation ==
+  
   * run "bin/nutch" - You can confirm a correct installation if you see the 
following:
  {{{
  Usage: nutch [-core] COMMAND
@@ -43, +44 @@

  export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
  }}}
  
- '''3.''' Crawl your first website:
+ == 3. Crawl your first website ==
+ 
   *  Add your agent name in the value field of the http.agent.name property in 
conf/nutch-site.xml, for example:
  {{{
  <property>
@@ -73, +75 @@

  }}}
  If not then please read on for how to set up your Solr instance and index 
your crawl data.
  
- '''4a.''' Setup Solr for search from binary distribution:
+ == 4a. Setup Solr for search from binary distribution ==
+ 
   * download binary file from 
[[http://www.apache.org/dyn/closer.cgi/lucene/solr/|here]]
   * unzip to $HOME/apache-solr-3.X, which we will now refer to as 
${APACHE_SOLR_HOME}
   * cd ${APACHE_SOLR_HOME}/example
   * java -jar start.jar
  
- '''4b.''' Setup Solr for search from source distribution:
+ == 4b. Setup Solr for search from source distribution ==
+ 
   * You can set up Solr from the source distribution with Maven. This 
[[http://thetechietutorials.blogspot.com/2011/06/how-to-build-and-start-apache-solr.html|link]]
 shows how to do that.
  
- '''5.''' Verify Solr installation:
+ == 5. Verify Solr installation ==
+ 
  After you have started the Solr admin console, you should be able to access the 
following links:
  {{{
  http://localhost:8983/solr/admin/
  http://localhost:8983/solr/admin/stats.jsp
  }}}
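
The check above can also be scripted. The following is a minimal sketch, assuming `curl` is installed and Solr listens on the default host and port shown in the links; the function names `solr_admin_urls` and `check_solr` are hypothetical helpers and not part of Solr or Nutch.

```shell
#!/bin/sh
# Print the two admin URLs from this step.
# Host and port default to the values used on this page (localhost, 8983).
solr_admin_urls() {
  base="http://${1:-localhost}:${2:-8983}/solr/admin"
  printf '%s\n' "$base/" "$base/stats.jsp"
}

# Probe each URL with curl (assumed to be installed) and report the result.
# -s silences progress output, -f makes curl fail on HTTP error codes,
# -o /dev/null discards the response body.
check_solr() {
  solr_admin_urls "$@" | while read -r url; do
    if curl -sf -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

Running `check_solr` with no arguments probes the two links listed above; `check_solr myhost 8080` would probe a different (hypothetical) host and port.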
  
- '''6.''' Integrate Solr with Nutch
+ == 6. Integrate Solr with Nutch ==
+ 
  We now have both Nutch and Solr installed and set up correctly, and Nutch has 
already created crawl data from the seed URL(s). Below are the steps to delegate 
searching to Solr so that the links become searchable:
   * cp ${NUTCH_RUNTIME_HOME}/conf/schema.xml 
${APACHE_SOLR_HOME}/example/solr/conf/ 
   * restart Solr with the command "java -jar start.jar" under 
${APACHE_SOLR_HOME}/example
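
The copy step above can be wrapped in a small guard so that a wrong path is reported before Solr is restarted. This is only a sketch under the directory layout described on this page; `install_nutch_schema` is a hypothetical helper name, not something shipped with Nutch or Solr.

```shell
#!/bin/sh
# Copy Nutch's schema.xml into the Solr example configuration directory.
# $1 = Nutch runtime home, $2 = Solr home (the two variables used on this page).
install_nutch_schema() {
  nutch_home=$1
  solr_home=$2
  src="$nutch_home/conf/schema.xml"
  dest_dir="$solr_home/example/solr/conf"

  # Refuse to continue if either side of the copy is missing.
  if [ ! -f "$src" ]; then
    echo "missing $src" >&2
    return 1
  fi
  if [ ! -d "$dest_dir" ]; then
    echo "missing $dest_dir" >&2
    return 1
  fi

  cp "$src" "$dest_dir/"
}
```

It could be called as `install_nutch_schema "${NUTCH_RUNTIME_HOME}" "${APACHE_SOLR_HOME}"`, followed by the restart from ${APACHE_SOLR_HOME}/example.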
