I am trying to configure Nutch 1.4 with Solr 3.4.

I configured everything and when I run the command:

./nutch crawl urls -dir myCrawl2 -solr http://localhost:8080 -depth 2 -topN 2

I get the following error:

java.io.IOException: Job failed!
SolrDeleteDuplicates: starting at 2013-06-06 15:49:30
SolrDeleteDuplicates: Solr url: http://localhost:8080
Exception in thread "main" java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Error executing query
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getSplits(SolrDeleteDuplicates.java:200)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getSplits(SolrDeleteDuplicates.java:198)
        ... 9 more
Caused by: org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/select?q=id:[* TO *]&fl=id&rows=1&wt=javabin&version=2
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
        ... 11 more


Other possibly helpful information:
1) The Solr admin screen comes up fine in the browser.
2) I copied the schema.xml file that came with Nutch into my Solr core's conf
directory.
3) Nutch runs and crawls everything; the error is thrown only when it comes
time to post the results to Solr.
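
For reference, the failing request in the trace looks like the -solr base URL
with /select and the query parameters appended. Below is a minimal sketch of
that construction, so the resulting URL can be compared with the address the
admin screen is served from. The function name select_url and the /solr
context path in the second call are illustrative assumptions of mine, not
something taken from Nutch or SolrJ.

```python
from urllib.parse import urlencode


def select_url(base_url: str) -> str:
    """Build a select URL like the one shown in the stack trace.

    Hypothetical helper: it only mirrors the shape of the logged request
    (base URL + "/select" + query string); it is not Nutch or SolrJ code.
    """
    params = {
        "q": "id:[* TO *]",  # same query as in the logged request
        "fl": "id",
        "rows": 1,
        "wt": "javabin",
        "version": 2,
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # The base URL I passed to -solr; this is the request that returned Not Found.
    print(select_url("http://localhost:8080"))
    # Assumed alternative: Solr deployed under a /solr context path.
    print(select_url("http://localhost:8080/solr"))
```

If the admin screen lives under a path rather than at the server root, the
first URL would point outside the Solr webapp, which might explain the
Not Found response. I have not confirmed that this is the cause.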

I have configured everything I can think of, checked logs, and scoured the
Internet and have not been able to find a solution. If anybody has any
ideas on how I can resolve this I would be incredibly grateful.
