Hi Michael,

What do you have in your Solr logs?
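The "possible analysis error" message is Solr wrapping the real cause, and the client-side RemoteSolrException only carries that summary; the full stack trace with the underlying cause (often a field value that trips an analyzer, e.g. a token too large for the index, or a schema mismatch) is written server side to solr.log. As a minimal sketch for isolating it, assuming SolrJ 5.x and the core URL from your stack trace (the ReplayDoc class name and the id/url fields are only illustrative, modeled on the standard Nutch schema), you could replay one suspect document outside Nutch and watch solr.log while it fails:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ReplayDoc {
  public static void main(String[] args) throws Exception {
    // Base URL copied from the stack trace; adjust for your setup.
    SolrClient client = new HttpSolrClient("http://coderox.xxx.com:8984/solr/popular");
    try {
      SolrInputDocument doc = new SolrInputDocument();
      // Illustrative fields only: copy the real field values of the
      // failing document from your segment to reproduce the error.
      doc.addField("id", "http://0-0.ooo/");
      doc.addField("url", "http://0-0.ooo/");
      client.add(doc);
      client.commit();
    } finally {
      client.close();
    }
  }
}

If that reproduces the failure, solr.log should show the real exception behind "possible analysis error".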
Kind Regards,
Furkan KAMACI

On Tue, 2 May 2017 at 02:45, Michael Coffey <[email protected]> wrote:

> I know this might be more of a SOLR question, but I bet some of you know
> the answer.
>
> I've been using Nutch 1.12 + SOLR 5.4.1 successfully for several weeks, but
> suddenly I am having frequent problems. My recent changes have been (1)
> indexing two segments at a time, instead of one, and (2) indexing larger
> segments than before.
>
> The segments are still not terribly large, just 24000 each, for a total of
> 48000 in the two-segment job.
>
> Here is the exception I get:
>
> 17/05/01 07:29:34 INFO mapreduce.Job:  map 100% reduce 67%
> 17/05/01 07:29:42 INFO mapreduce.Job: Task Id : attempt_1491521848897_3507_r_000000_2, Status : FAILED
> Error: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://coderox.xxx.com:8984/solr/popular: Exception writing document id http://0-0.ooo/ to the index; possible analysis error.
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
>     at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:209)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:173)
>     at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:85)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
>     at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
>     at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:367)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>
> Of course, the document URL is different each time.
>
> It looks to me like it's complaining about an individual document. This is
> surprising because it didn't happen at all for the first two million
> documents I indexed.
>
> Have you any suggestions on how to debug this? Or how to make it ignore
> occasional single-document errors without freaking out?
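Regarding your second question, about ignoring occasional single-document errors: as far as I know, the stock Nutch 1.12 SolrIndexWriter pushes documents in batches (the batch size comes from solr.commit.size), so one bad document fails the whole batch and, after the retry limit, the reduce task. I am not aware of a configuration switch that skips bad documents; dropping solr.commit.size to 1 would isolate the offender, at the cost of per-document requests. Another workaround is to patch the writer to fall back to one-at-a-time adds when a batch fails, dropping only the offender. A rough, hypothetical sketch (TolerantPush is not part of Nutch, just an illustration of the fallback):

import java.io.IOException;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

/** Hypothetical helper: batched add with a one-by-one fallback. */
public class TolerantPush {
  public static void push(SolrClient client, List<SolrInputDocument> batch)
      throws IOException {
    try {
      client.add(batch); // fast path: whole batch in one request
    } catch (SolrServerException | SolrException e) {
      // One bad document poisons the batch; retry individually and
      // skip only the offender instead of failing the whole task.
      for (SolrInputDocument doc : batch) {
        try {
          client.add(doc);
        } catch (SolrServerException | SolrException single) {
          System.err.println("Skipping doc " + doc.getFieldValue("id")
              + ": " + single.getMessage());
        }
      }
    }
  }
}

The trade-off is one extra request per document, but only for a failing batch, so the fast batched path remains the common case.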

