I use the -solr option almost exclusively because it cuts out a lot of steps in
the crawl process. Your command line would look something like bin/nutch crawl
urls -depth 10 -threads 10 -topN 10 -solr http://localhost:8983/solr

Nutch and Solr will both tell you if there are errors, which usually have to do
with mismatched fields.
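
To make the two approaches in this thread concrete, here is a minimal dry-run sketch that only prints the commands instead of running them. The one-step line is the one given above. The two-step form assumes a Nutch 1.2-style layout: the -dir crawl option and the crawl/crawldb, crawl/linkdb and crawl/segments paths are assumptions rather than something confirmed in this thread, and the solrindex argument order differs between Nutch releases, so check bin/nutch solrindex on your own install.

```shell
#!/bin/sh
# Dry-run sketch: prints the Nutch commands rather than executing them.
# SOLR_URL and the seed directory "urls" come from the message above;
# the crawl/ paths are assumed (Nutch 1.2-style layout), not confirmed.
SOLR_URL="http://localhost:8983/solr"

# One-step form: crawl and send the results to Solr in a single command.
one_step() {
    echo "bin/nutch crawl urls -depth 10 -threads 10 -topN 10 -solr $SOLR_URL"
}

# Two-step form: crawl into an output directory, then index it with solrindex.
two_step() {
    echo "bin/nutch crawl urls -dir crawl -depth 10 -threads 10 -topN 10"
    echo "bin/nutch solrindex $SOLR_URL crawl/crawldb crawl/linkdb crawl/segments/*"
}

one_step
two_step
```

In either form Solr is expected to be up at that URL when the indexing step runs, since that is where the documents are sent.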

Adam

On Feb 10, 2011, at 5:27 AM, "McGibbney, Lewis John" 
<[email protected]> wrote:

> Hi Abi,
> 
> Nutch uses Lucene as the default mechanism when running the crawl command. I 
> would be surprised if you did not receive some sort of error message when 
> attempting to add a Solr param to a Nutch crawl.
> 
> If you follow one of the online tutorials available you will find that the
> final stage (solrindex) is a separate command.
> 
> Lewis
> ________________________________________
> From: .: Abishek :. [[email protected]]
> Sent: 10 February 2011 03:18
> To: [email protected]
> Subject: -solr parameter in Crawl
> 
> Hi all,
> 
> When do we use the -solr param for the nutch crawl? And is it mandatory
> that Solr be running at the URL passed to -solr?
> 
> Should I be using it as,
> 
> bin/nutch crawl ..... -solr http://127.0.0.1:8983/solr
> 
> or
> 
> bin/nutch crawl -solr http://127.0.0.1:8983/solr <crawldb> <linkdb>
> <segments>
> 
> Thanks for your time.
> 
> Cheers,
> Abi
> 
