On 30/01/2012 12:38, Denis Sinner wrote:
crawl:

bin/nutch crawl urls -solr http://127.0.0.1:8080/solr_3-5/searchDkdDe_en -depth 1

fails at indexing, then

bin/nutch solrindex http://127.0.0.1:8080/solr_3-5/searchDkdDe_en crawl-20120130123251/crawldb/ -linkdb crawl-20120130123251/linkdb/ crawl-20120130123251/segments/*

works!

Btw, in the logs, when crawl sends requests to the Solr server, the URL is
lowercased, whereas in solrindex it is not.

Ah, indeed - looks like a bug in the Crawl command. See line 87 of Crawl.java - just replace it with a simple assignment, like this:

        solrUrl = args[i + 1];

And rebuild. I'll file a JIRA issue and will fix it in Nutch trunk. Thanks!
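
For illustration, here is a minimal standalone sketch of why lowercasing would break this particular URL. It assumes the faulty line lowercases the -solr argument, which is what the logs suggest; the class and method names (SolrUrlArg, lowercased, asGiven) are hypothetical and not part of Nutch. URL paths are generally case-sensitive, so a lowercased core name would likely no longer match the one configured in Solr.

```java
import java.util.Locale;

// Hypothetical demo class, not Nutch code.
public class SolrUrlArg {

    // Assumed faulty behaviour: the value after "-solr" is lowercased as a whole,
    // which also changes the case-sensitive path part of the URL.
    static String lowercased(String[] args, int i) {
        return args[i + 1].toLowerCase(Locale.ROOT);
    }

    // Suggested behaviour: a simple assignment that keeps the URL exactly as typed.
    static String asGiven(String[] args, int i) {
        return args[i + 1];
    }

    public static void main(String[] argv) {
        String[] args = {"-solr", "http://127.0.0.1:8080/solr_3-5/searchDkdDe_en"};
        System.out.println("lowercased: " + lowercased(args, 0));
        System.out.println("as given:   " + asGiven(args, 0));
    }
}
```

With the URL from the crawl command above, the first variant yields a path ending in searchdkdde_en, while the second keeps searchDkdDe_en, which is the form solrindex sends.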


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
