The following page has been changed by RidaBenjelloun:
http://wiki.apache.org/nutch/RunningNutchAndSolr
--
1. Go into solr-trunk and run 'ant dist dist-solrj'
1. Get the zip from [http://variogram.com/latest/SolrIndexer.zip Variogr.am]
and unzip it into solr-trunk
1. Copy apache-solr-solrj-1.3-dev.jar and apache-solr-common-1.3-dev.jar to
nutch-trunk/lib
- 1. Get the zip file from
[http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html|
FooFactory] for SOLR-20
+ 1. Get the zip file from
[http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html
FooFactory] for SOLR-20
1. Unzip solr-client.zip somewhere, go into java/solr/src and run 'ant'
1. Copy solr-client.jar from dist to nutch-trunk/lib
1. Copy xpp3-1.1.3.4.0.jar from lib to nutch-trunk/lib
- 1. Get SolrClientAdapter.java from
[http://www.foofactory.fi/files/nutch-solr/nutch_solr.patch| FooFactory patch]
and copy it to nutch-trunk/src/java/org/apache/nutch/indexer
+ 1. Get SolrClientAdapter.java from
[http://www.foofactory.fi/files/nutch-solr/nutch_solr.patch FooFactory patch]
and copy it to nutch-trunk/src/java/org/apache/nutch/indexer
* Edit nutch-trunk/src/java/org/apache/nutch/indexer/SolrIndexer.java:
* Replace
'int res = new SolrIndexer().doMain(NutchConfiguration.create(), args);'
with
'int res = ToolRunner.run(NutchConfiguration.create(), new SolrIndexer(), args);'
1. Add an import for ToolRunner (org.apache.hadoop.util.ToolRunner)
1. Edit nutch-trunk/src/java/org/apache/nutch/indexer/Indexer.java, changing
the visibility of LuceneDocumentWrapper from private to protected
1. Configure nutch-trunk/conf/nutch-site.xml with settings for your site,
including a value for the property indexer.solr.url (for example,
http://localhost:8983/solr/)
1. Configure some URL(s) to crawl (listed in files in a urls directory)
- 1. Get the [http://www.foofactory.fi/files/nutch-solr/crawl.sh| Crawl.sh
script] from FooFactory and copy it to nutch-trunk/bin (editing it if needed)
+ 1. Get the [http://www.foofactory.fi/files/nutch-solr/crawl.sh Crawl.sh script]
from FooFactory and copy it to nutch-trunk/bin (editing it if needed)
1. Start a Solr server (such as the solr-trunk/example instance)
1. Run a Nutch crawl using the bin/crawl.sh script.
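
For the nutch-site.xml step above, the entry might look roughly like the
following. This is only a minimal sketch: it assumes the usual
configuration/property layout of Nutch configuration files and reuses the
example URL from that step, so the value needs adjusting to wherever your own
Solr server runs.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- URL of the Solr server that receives the indexed documents.
       The value is the example from the step above (an assumption for a
       local setup); replace it with your own Solr URL. -->
  <property>
    <name>indexer.solr.url</name>
    <value>http://localhost:8983/solr/</value>
  </property>
</configuration>
```

Any other site-specific settings go inside the same configuration element as
additional property entries.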