Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by NickTkach:
http://wiki.apache.org/nutch/RunningNutchAndSolr

------------------------------------------------------------------------------
  
   1. Check out solr-trunk and nutch-trunk
   1. Go into the solr-trunk and run 'ant dist dist-solrj'
-  1. Get zip from [http://variogram.com/latest/SolrIndexer.zip|Variogr.am] and unzip it to solr-trunk
+  1. Get zip from [http://variogram.com/latest/SolrIndexer.zip| Variogr.am] and unzip it to solr-trunk
   1. Copy apache-solr-solrj-1.3-dev.jar and apache-solr-common-1.3-dev.jar to 
nutch-trunk/lib
-  1. Get the zip file from [http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html|FooFactory] for SOLR-20
+  1. Get the zip file from [http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html| FooFactory] for SOLR-20
   1. Unzip solr-client.zip somewhere, go into java/solr/src and run 'ant'
   1. Copy solr-client.jar from dist to nutch-trunk/lib
   1. Copy xpp3-1.1.3.4.0.jar from lib to nutch-trunk/lib
-  1. Get SolrClientAdapter.java from [http://www.foofactory.fi/files/nutch-solr/nutch_solr.patch|FooFactory patch] and copy it to nutch-trunk/src/java/org/apache/nutch/indexer
+  1. Get SolrClientAdapter.java from [http://www.foofactory.fi/files/nutch-solr/nutch_solr.patch| FooFactory patch] and copy it to nutch-trunk/src/java/org/apache/nutch/indexer
     * Edit nutch-trunk/src/java/org/apache/nutch/indexer/SolrIndexer.java:
     * Replace int res = new SolrIndexer().doMain(NutchConfiguration.create(), args); with int res = ToolRunner.run(NutchConfiguration.create(), new SolrIndexer(), args);
   1. Edit the imports to pick up ToolRunner
   1. Edit nutch-trunk/src/java/org/apache/nutch/indexer/Indexer.java changing 
scope on LuceneDocumentWrapper from private to protected
   1. Configure nutch-trunk/conf/nutch-site.xml with settings for your site 
including a value for property indexer.solr.url (something like 
http://localhost:8983/solr/)
   1. Configure some url(s) to crawl (files in a urls directory)
-  1. Copy [http://www.foofactory.fi/files/nutch-solr/crawl.sh|Crawl.sh script] from FooFactory and copy it to nutch-trunk/bin (editing if needed)
+  1. Copy [http://www.foofactory.fi/files/nutch-solr/crawl.sh| Crawl.sh script] from FooFactory and copy it to nutch-trunk/bin (editing if needed)
   1. Start a Solr server (such as the solr-trunk/example instance)
   1. Run a Nutch crawl using the bin/crawl.sh script.
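
For the nutch-site.xml step above, the entry might look roughly like the sketch below. It assumes the usual Hadoop-style configuration layout that Nutch uses for its conf files. The URL is only the example value given in the step, and any other site settings you need would go alongside this property.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Solr server URL (example value from the steps above).
       Assumption: this is where the Solr indexer sends documents;
       change host, port and path to match your own Solr instance. -->
  <property>
    <name>indexer.solr.url</name>
    <value>http://localhost:8983/solr/</value>
  </property>
</configuration>
```

With the solr-trunk/example instance running on its default port, the value shown should work unchanged. For any other setup, point it at that server instead.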
  
