I have Nutch 1.13 and Solr 6.6.0 working, as long as Solr is in standalone mode.
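For what it's worth, the "No collection param specified on request and no default collection has been set" line in the trace below looks like the CloudSolrClient that Nutch builds from solr.zookeeper.hosts never gets a default collection. I haven't verified this against the Nutch 1.13 source, but if someone wants to experiment with a patch, one approach would be to derive the collection name from the solr.server.url value (its last path segment, e.g. uc_website) and pass it to CloudSolrClient.setDefaultCollection(...). A minimal sketch of such a helper, using only the JDK (the class and method names are mine, not Nutch's):

```java
import java.net.URI;

public class CollectionFromUrl {

    // Hypothetical helper: derive the collection name a CloudSolrClient
    // would need as its default collection from a core/collection URL
    // such as http://localhost:8983/solr/uc_website.
    static String collectionFromSolrUrl(String url) {
        String path = URI.create(url).getPath();  // e.g. "/solr/uc_website"
        if (path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return path.substring(path.lastIndexOf('/') + 1);  // last segment
    }

    public static void main(String[] args) {
        System.out.println(
            collectionFromSolrUrl("http://localhost:8983/solr/uc_website"));
        // prints "uc_website"
    }
}
```

After the client is constructed, something like client.setDefaultCollection(collectionFromSolrUrl(solrUrl)) should, in principle, satisfy SolrJ's default-collection check -- but I haven't tested this against a running cluster, so treat it as a starting point only.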
On Fri, Jun 9, 2017 at 11:07 AM, Witney, Ernest <[email protected]> wrote:

> What version of SOLR and Nutch were you able to get to work?
>
> On Jun 9, 2017, at 10:24 AM, David Parker <[email protected]> wrote:
>
> > Just to follow up on this, I never did get this to work. I ended up
> > reverting to a standalone Solr instance without authentication, and it
> > works. It would certainly be nice to have this working with SolrCloud
> > and ZK, though.
> >
> > Thanks!
> >
> > On Wed, Jun 7, 2017 at 5:45 PM, David Parker <[email protected]> wrote:
> >
> >> I saw that while I was Googling this issue. That conversation made it
> >> sound like this would be fixed in Nutch 1.12, and I'm using 1.13.
> >> Shouldn't that fix be in this version?
> >>
> >> On Jun 7, 2017 4:32 PM, "Furkan KAMACI" <[email protected]> wrote:
> >>
> >>> *PS:* Similar conversation:
> >>> http://lucene.472066.n3.nabble.com/Nutch-with-Solrcloud-5-td4248700.html
> >>>
> >>> On Wed, Jun 7, 2017 at 9:52 PM, David Parker <[email protected]> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I am attempting to integrate Nutch 1.13 with Solr 6.6 running in cloud
> >>>> mode. I previously had this working fine with Nutch 1.13 and Solr 6.5
> >>>> running in stand-alone mode, but now I get an error. It seems to be an
> >>>> issue with the collection not being default.
> >>>>
> >>>> Command:
> >>>>
> >>>> bin/nutch index -Dsolr.zookeeper.hosts=localhost:9983
> >>>> -Dsolr.auth.password=xxxxxxxx -Dsolr.auth.username=xxxxxxxx
> >>>> -Dsolr.auth=true -Dsolr.server.url=http://localhost:8983/solr/uc_website
> >>>> crawl/crawldb -linkdb crawl/linkdb crawl/segments/20170607135140
> >>>>
> >>>> Result in hadoop.log:
> >>>>
> >>>> java.lang.Exception: java.io.IOException
> >>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
> >>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> >>>> Caused by: java.io.IOException
> >>>>         at org.apache.nutch.indexwriter.solr.SolrIndexWriter.makeIOException(SolrIndexWriter.java:234)
> >>>>         at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:213)
> >>>>         at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:174)
> >>>>         at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:87)
> >>>>         at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
> >>>>         at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
> >>>>         at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
> >>>>         at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
> >>>>         at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:368)
> >>>>         at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:57)
> >>>>         at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
> >>>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> >>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
> >>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>         at java.lang.Thread.run(Thread.java:745)
> >>>> Caused by: org.apache.solr.client.solrj.SolrServerException: No collection
> >>>> param specified on request and no default collection has been set.
> >>>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:556)
> >>>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:981)
> >>>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:870)
> >>>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:806)
> >>>>         at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
> >>>>         at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:210)
> >>>>         ... 16 more
> >>>> 2017-06-07 14:42:32,305 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
> >>>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> >>>>         at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:147)
> >>>>         at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:230)
> >>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >>>>         at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:239)
> >>>>
> >>>> I think the root of the problem is the line "No collection param specified
> >>>> on request and no default collection has been set."
> >>>>
> >>>> Any help is greatly appreciated. Thanks!
> >>>>
> >>>> --
> >>>> Dave Parker
> >>>> Database & Systems Administrator
> >>>> Utica College
> >>>> Integrated Information Technology Services
> >>>> (315) 792-3229
> >>>> Registered Linux User #408177
> >
> > --
> > Dave Parker
> > Database & Systems Administrator
> > Utica College
> > Integrated Information Technology Services
> > (315) 792-3229
> > Registered Linux User #408177

--
Dave Parker
Database & Systems Administrator
Utica College
Integrated Information Technology Services
(315) 792-3229
Registered Linux User #408177

