Is there a workaround at the moment, such as using Solr 4? Does Nutch 2 work with Solr 5 in cloud mode?


-----Original Message-----
From: Markus Jelsma [mailto:[email protected]] 
Sent: Tuesday, January 5, 2016 11:22 AM
To: [email protected]
Subject: RE: Nutch with Solrcloud 5

Indeed, this is not going to work. The indexer fails on a missing parameter:
"No collection param specified on request and no default collection has been
set." But there is no way to set it! The cloud indexer doesn't work either,
due to some API changes. We have it up and running here, so I'll ask my
colleague to upload a patch.
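
For anyone hitting the same error: the message is raised by SolrJ's CloudSolrServer, which finds the cluster through ZooKeeper and needs the collection supplied separately (SolrJ has setDefaultCollection for this). The following is only a minimal sketch, not the patch mentioned above. It shows how a collection name could be recovered from a solr.server.url value like the one used in this thread. The class and method names are hypothetical, and treating a trailing "solr" segment as "no collection" is an assumption about the URL layout.

```java
import java.net.URI;

public class CollectionFromUrl {

    /**
     * Returns the last non-empty path segment of a Solr URL, or null when
     * the URL has no path or ends at the "solr" webapp context (assumed
     * here to mean that no collection was given).
     */
    public static String collectionName(String solrUrl) {
        String path = URI.create(solrUrl).getPath();
        if (path == null) {
            return null;
        }
        String[] segments = path.split("/");
        // Walk backwards so trailing slashes are skipped.
        for (int i = segments.length - 1; i >= 0; i--) {
            if (!segments[i].isEmpty()) {
                return segments[i].equals("solr") ? null : segments[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // URL taken from the bin/nutch command quoted in this thread.
        System.out.println(collectionName("http://localhost/solr/gettingstarted"));
    }
}
```

A value obtained this way could then be handed to the cloud client as its default collection. Whether Nutch exposes a place to do that depends on the version and on the patch.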

Markus 
 
-----Original message-----
> From: Corey, Stephen <[email protected]>
> Sent: Tuesday 5th January 2016 17:13
> To: [email protected]
> Subject: Nutch with Solrcloud 5
> 
> Has anyone gotten Nutch (preferably 1.11, but any version would be fine) to 
> index data to Solr 5 running in cloud mode? I keep getting the message:
> 
> Indexer: java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
>         at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
>         at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:222)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:231)
> 
> 
> And in my hadoop.log, I see:
> 
> ....
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.solr.client.solrj.SolrServerException: No collection 
> param specified on request and no default collection has been set.
>         at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:292)
>         at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:533)
>         at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
>         at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:153)
>         ... 11 more
> 
> 
> I am definitely specifying the collection name in the URL. I normally use the 
> bin/crawl command, but I can also reproduce this with the individual command:
> 
> bin/nutch index -Dsolr.server.url=http://localhost/solr/gettingstarted -Dsolr.server.type=cloud -Dsolr.zookeeper.url=localhost:9983 ecutest/crawldb -linkdb ecutest/linkdb ecutest/segments/20160104103038
> 
> 
> Any ideas?
> 
