I am using solr 5.4 and nutch 1.11

On Tue, Jan 19, 2016 at 1:46 AM, Markus Jelsma <[email protected]> wrote:
> Hi - it was an answer to your question whether i have ever used it. Yes, i
> patched and committed it. And therefore i asked if you're using Solr 5 or
> not. So again, are you using Solr 5?
>
> Markus
>
> -----Original message-----
> From: Zara Parst <[email protected]>
> Sent: Monday 18th January 2016 16:16
> To: [email protected]
> Subject: Re: Nutch/Solr communication problem
>
> Mind to share that patch?
>
> On Mon, Jan 18, 2016 at 8:28 PM, Markus Jelsma <[email protected]> wrote:
>
> > Yes i have used it, i made the damn patch myself years ago, and i used
> > the same configuration. Command line or config work the same.
> >
> > Markus
> >
> > -----Original message-----
> > From: Zara Parst <[email protected]>
> > Sent: Monday 18th January 2016 12:55
> > To: [email protected]
> > Subject: Re: Nutch/Solr communication problem
> >
> > Dear Markus,
> >
> > Are you just speaking blindly or what? My concern is: did you ever try
> > pushing an index to Solr that is password protected? If yes, can you
> > tell me what config you used? Whether you did it in a config file or
> > through the command line, please let me know.
> >
> > thanks
> >
> > On Mon, Jan 18, 2016 at 4:50 PM, Markus Jelsma <[email protected]> wrote:
> >
> > Hi - This doesn't look like an HTTP basic authentication problem. Are
> > you running Solr 5.x?
> > Markus
> >
> > -----Original message-----
> > From: Zara Parst <[email protected]>
> > Sent: Monday 18th January 2016 11:55
> > To: [email protected]
> > Subject: Re: Nutch/Solr communication problem
> >
> > SolrIndexWriter
> >     solr.server.type : Type of SolrServer to communicate with (default http however options include cloud, lb and concurrent)
> >     solr.server.url : URL of the Solr instance (mandatory)
> >     solr.zookeeper.url : URL of the Zookeeper URL (mandatory if cloud value for solr.server.type)
> >     solr.loadbalance.urls : Comma-separated string of Solr server strings to be used (mandatory if lb value for solr.server.type)
> >     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
> >     solr.commit.size : buffer size when sending to Solr (default 1000)
> >     solr.auth : use authentication (default false)
> >     solr.auth.username : username for authentication
> >     solr.auth.password : password for authentication
> >
> > 2016-01-17 19:19:42,973 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: crawlDbyah/crawldb
> > 2016-01-17 19:19:42,973 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawlDbyah/linkdb
> > 2016-01-17 19:19:42,973 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawlDbyah/segments/20160117191906
> > 2016-01-17 19:19:42,975 WARN indexer.IndexerMapReduce - Ignoring linkDb for indexing, no linkDb found in path: crawlDbyah/linkdb
> > 2016-01-17 19:19:43,807 WARN conf.Configuration - file:/tmp/hadoop-rakesh/mapred/staging/rakesh2114349538/.staging/job_local2114349538_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
> > 2016-01-17 19:19:43,809 WARN conf.Configuration - file:/tmp/hadoop-rakesh/mapred/staging/rakesh2114349538/.staging/job_local2114349538_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
> > 2016-01-17 19:19:43,963 WARN conf.Configuration - file:/tmp/hadoop-rakesh/mapred/local/localRunner/rakesh/job_local2114349538_0001/job_local2114349538_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
> > 2016-01-17 19:19:43,980 WARN conf.Configuration - file:/tmp/hadoop-rakesh/mapred/local/localRunner/rakesh/job_local2114349538_0001/job_local2114349538_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
> > 2016-01-17 19:19:44,260 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
> > 2016-01-17 19:19:45,128 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
> > 2016-01-17 19:19:45,148 INFO solr.SolrUtils - Authenticating as: radmin
> > 2016-01-17 19:19:45,318 INFO solr.SolrMappingReader - source: content dest: content
> > 2016-01-17 19:19:45,318 INFO solr.SolrMappingReader - source: title dest: title
> > 2016-01-17 19:19:45,318 INFO solr.SolrMappingReader - source: host dest: host
> > 2016-01-17 19:19:45,319 INFO solr.SolrMappingReader - source: segment dest: segment
> > 2016-01-17 19:19:45,319 INFO solr.SolrMappingReader - source: boost dest: boost
> > 2016-01-17 19:19:45,319 INFO solr.SolrMappingReader - source: digest dest: digest
> > 2016-01-17 19:19:45,319 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> > 2016-01-17 19:19:45,360 INFO solr.SolrIndexWriter - Indexing 2 documents
> > 2016-01-17 19:19:45,507 INFO solr.SolrIndexWriter - Indexing 2 documents
> > 2016-01-17 19:19:45,526 WARN mapred.LocalJobRunner - job_local2114349538_0001
> > java.lang.Exception: java.io.IOException
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> > Caused by: java.io.IOException
> >     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.makeIOException(SolrIndexWriter.java:171)
> >     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:157)
> >     at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:115)
> >     at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
> >     at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:502)
> >     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:456)
> >     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
> >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >     at java.lang.Thread.run(Thread.java:745)
> > Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:8983/solr/yah
> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
> >     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
> >     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:153)
> >     ... 11 more
> > Caused by: org.apache.http.client.ClientProtocolException
> >     at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
> >     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
> >     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
> >     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
> >     ... 15 more
> > Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
> >     at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:208)
> >     at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
> >     at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
> >     at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
> >     at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
> >     ... 19 more
> > 2016-01-17 19:19:46,055 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
> >     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
> >     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
> >
> > On Mon, Jan 18, 2016 at 4:15 PM, Markus Jelsma <[email protected]> wrote:
> >
> > Hi - can you post the log output?
> >
> > Markus
> >
> > -----Original message-----
> > From: Zara Parst <[email protected]>
> > Sent: Monday 18th January 2016 2:06
> > To: [email protected]
> > Subject: Nutch/Solr communication problem
> >
> > Hi everyone,
> >
> > I have a situation here. I am using nutch 1.11 and solr 5.4. Solr is
> > protected by a username and password, and I am passing the credentials
> > to Solr using the following command:
> >
> > bin/crawl -i -Dsolr.server.url=http://localhost:8983/solr/abc -Dsolr.auth=true -Dsolr.auth.username=xxxx -Dsolr.auth.password=xxx url crawlDbyah 1
> >
> > It always fails with the same problem. Please help me feed data to a
> > password-protected Solr. Below is the error message:
> >
> > Indexer: starting at 2016-01-17 19:01:12
> > Indexer: deleting gone documents: false
> > Indexer: URL filtering: false
> > Indexer: URL normalizing: false
> > Active IndexWriters :
> > SolrIndexWriter
> >     solr.server.type : Type of SolrServer to communicate with (default http however options include cloud, lb and concurrent)
> >     solr.server.url : URL of the Solr instance (mandatory)
> >     solr.zookeeper.url : URL of the Zookeeper URL (mandatory if cloud value for solr.server.type)
> >     solr.loadbalance.urls : Comma-separated string of Solr server strings to be used (mandatory if lb value for solr.server.type)
> >     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
> >     solr.commit.size : buffer size when sending to Solr (default 1000)
> >     solr.auth : use authentication (default false)
> >     solr.auth.username : username for authentication
> >     solr.auth.password : password for authentication
> >
> > Indexing 2 documents
> > Indexing 2 documents
> > Indexer: java.io.IOException: Job failed!
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
> >     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
> >     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
> >
> > I also tried the username and password in nutch-default.xml, but got the
> > same error. Please help me out.
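As an alternative to passing -D options on every run, the solr.auth.* settings listed in the thread can be made persistent. A minimal sketch of a conf/nutch-site.xml fragment, using only the property names from the SolrIndexWriter listing above (the values shown are placeholders):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Sketch only: property names come from the SolrIndexWriter
       option listing quoted in this thread. -->
  <property>
    <name>solr.auth</name>
    <value>true</value>
  </property>
  <property>
    <name>solr.auth.username</name>
    <value>xxxx</value>
  </property>
  <property>
    <name>solr.auth.password</name>
    <value>xxx</value>
  </property>
</configuration>
```

Settings in nutch-site.xml override nutch-default.xml, so credentials belong here rather than in the default file.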

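For what it's worth, the NonRepeatableRequestException at the bottom of the trace is typically raised when HttpClient receives an authentication challenge (e.g. a 401 from a password-protected server) only after it has already streamed the POST body, and then cannot replay that body on the authenticated retry. This is only a possible reading of the trace, not a confirmed diagnosis. A minimal, stdlib-only sketch of the mechanism (the class name and payload are illustrative, not from Nutch or Solr):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class NonRepeatableDemo {

    // Simulates a streamed request entity: the first send attempt consumes
    // the stream; this returns how many bytes would be left for a retry.
    static int bytesLeftForRetry(byte[] payload) throws IOException {
        InputStream body = new ByteArrayInputStream(payload);
        while (body.read() != -1) {
            // first attempt drains the body before the 401 arrives
        }
        // A one-shot stream cannot be rewound, so the retry has nothing to send.
        return body.available();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes left for retry: "
                + bytesLeftForRetry("<add>...</add>".getBytes()));
    }
}
```
If this is what is happening, sending the credentials preemptively on the very first request, so the server never issues a challenge and no retry is needed, avoids the replay entirely.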
