I have Nutch 1.13 and Solr 6.6.0 working, as long as Solr is in standalone
mode.
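
For reference, here is a rough sketch of the standalone invocation. It assumes the same core URL and segment path as in my original command quoted below, with the solr.zookeeper.hosts and solr.auth.* options dropped. The variable names are only for illustration, and the script prints the command rather than running it:

```shell
# Assumed values, copied from the original command quoted further down.
SOLR_URL="http://localhost:8983/solr/uc_website"
SEGMENT="crawl/segments/20170607135140"

# Standalone mode: only solr.server.url is passed; the
# solr.zookeeper.hosts and solr.auth.* options are left out.
NUTCH_CMD="bin/nutch index -Dsolr.server.url=$SOLR_URL crawl/crawldb -linkdb crawl/linkdb $SEGMENT"

# Print the command for review; run it from the Nutch runtime directory.
echo "$NUTCH_CMD"
```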

On Fri, Jun 9, 2017 at 11:07 AM, Witney, Ernest <[email protected]> wrote:

> What version of SOLR and Nutch were you able to get to work?
>
> > On Jun 9, 2017, at 10:24 AM, David Parker <[email protected]> wrote:
> >
> > Just to follow up on this, I never did get this to work.  I ended up
> > reverting to a standalone Solr instance without authentication, and it
> > works.  It would certainly be nice to have this working with SolrCloud
> > and ZK, though.
> >
> > Thanks!
> >
> > On Wed, Jun 7, 2017 at 5:45 PM, David Parker <[email protected]> wrote:
> >
> >> I saw that while I was Googling this issue.  That conversation made it
> >> sound like this would be fixed in Nutch 1.12, and I'm using 1.13.
> >> Shouldn't that fix be in this version?
> >>
> >> On Jun 7, 2017 4:32 PM, "Furkan KAMACI" <[email protected]> wrote:
> >>
> >>> *PS:* Similar conversation:
> >>> http://lucene.472066.n3.nabble.com/Nutch-with-Solrcloud-5-td4248700.html
> >>>
> >>> On Wed, Jun 7, 2017 at 9:52 PM, David Parker <[email protected]> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I am attempting to integrate Nutch 1.13 with Solr 6.6 running in cloud
> >>>> mode.  I previously had this working fine with Nutch 1.13 and Solr 6.5
> >>>> running in stand-alone mode, but now I get an error.  It seems to be
> >>>> an issue with the collection not being default.
> >>>>
> >>>> Command:
> >>>>
> >>>> bin/nutch index -Dsolr.zookeeper.hosts=localhost:9983
> >>>> -Dsolr.auth.password=xxxxxxxx -Dsolr.auth.username=xxxxxxxx
> >>>> -Dsolr.auth=true -Dsolr.server.url=http://localhost:8983/solr/uc_website
> >>>> crawl/crawldb -linkdb crawl/linkdb crawl/segments/20170607135140
> >>>>
> >>>> Result in hadoop.log:
> >>>>
> >>>> java.lang.Exception: java.io.IOException
> >>>>        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
> >>>>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> >>>> Caused by: java.io.IOException
> >>>>        at org.apache.nutch.indexwriter.solr.SolrIndexWriter.makeIOException(SolrIndexWriter.java:234)
> >>>>        at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:213)
> >>>>        at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:174)
> >>>>        at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:87)
> >>>>        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
> >>>>        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
> >>>>        at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
> >>>>        at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
> >>>>        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:368)
> >>>>        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:57)
> >>>>        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
> >>>>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> >>>>        at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
> >>>>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>        at java.lang.Thread.run(Thread.java:745)
> >>>> Caused by: org.apache.solr.client.solrj.SolrServerException: No collection param specified on request and no default collection has been set.
> >>>>        at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:556)
> >>>>        at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:981)
> >>>>        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:870)
> >>>>        at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:806)
> >>>>        at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
> >>>>        at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:210)
> >>>>        ... 16 more
> >>>> 2017-06-07 14:42:32,305 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
> >>>>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> >>>>        at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:147)
> >>>>        at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:230)
> >>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >>>>        at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:239)
> >>>>
> >>>> I think the root of the problem is the line "No collection param
> >>>> specified on request and no default collection has been set."
> >>>>
> >>>> Any help is greatly appreciated.  Thanks!
> >>>>
> >>>> --
> >>>> Dave Parker
> >>>> Database & Systems Administrator
> >>>> Utica College
> >>>> Integrated Information Technology Services
> >>>> (315) 792-3229
> >>>> Registered Linux User #408177
> >>>>
> >>>
> >>
> >
> >
> > --
> > Dave Parker
> > Database & Systems Administrator
> > Utica College
> > Integrated Information Technology Services
> > (315) 792-3229
> > Registered Linux User #408177
>
>


-- 
Dave Parker
Database & Systems Administrator
Utica College
Integrated Information Technology Services
(315) 792-3229
Registered Linux User #408177
