I know this might be more of a Solr question, but I bet some of you know the answer.

I've been using Nutch 1.12 + Solr 5.4.1 successfully for several weeks, but suddenly I am having frequent problems. My recent changes have been (1) indexing two segments at a time instead of one, and (2) indexing larger segments than before. The segments are still not terribly large, just 24,000 each, for a total of 48,000 in the two-segment job.

Here is the exception I get:

    17/05/01 07:29:34 INFO mapreduce.Job:  map 100% reduce 67%
    17/05/01 07:29:42 INFO mapreduce.Job: Task Id : attempt_1491521848897_3507_r_000000_2, Status : FAILED
    Error: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://coderox.xxx.com:8984/solr/popular: Exception writing document id http://0-0.ooo/ to the index; possible analysis error.
        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
        at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
        at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:209)
        at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:173)
        at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:85)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
        at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
        at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:367)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Of course, the document URL is different each time. It looks to me like it's complaining about an individual document, which is surprising because this didn't happen at all for the first two million documents I indexed.

Do you have any suggestions on how to debug this? Or how to make the indexer ignore occasional single-document errors instead of failing the whole job?
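To make it concrete, the behavior I'm after is something like the sketch below, written against plain SolrJ rather than the actual Nutch writer. The endpoint is the one from the trace above; the field names and documents are made up for illustration:

    import java.io.IOException;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class TolerantAdd {
        public static void main(String[] args) throws IOException, SolrServerException {
            // Same endpoint as in the stack trace; adjust to your own core.
            HttpSolrClient client = new HttpSolrClient("http://coderox.xxx.com:8984/solr/popular");
            try {
                for (SolrInputDocument doc : docs()) {
                    try {
                        // One document per request, so a failure is isolated to that document.
                        client.add(doc);
                    } catch (HttpSolrClient.RemoteSolrException e) {
                        // A single bad document: log it and keep going
                        // instead of letting the whole job die.
                        System.err.println("skipped " + doc.getFieldValue("id")
                                + ": " + e.getMessage());
                    }
                }
                client.commit();
            } finally {
                client.close();
            }
        }

        // Stand-in for real input; "id" and "content" are hypothetical field names.
        private static List<SolrInputDocument> docs() {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "http://example.com/");
            doc.addField("content", "hello");
            return Arrays.asList(doc);
        }
    }

As I understand it, Nutch's SolrIndexWriter buffers documents and sends them in batches (the push() in the trace), so one bad document fails the whole batch and the reduce task with it. I assume the equivalent change inside Nutch would be to catch RemoteSolrException around the batch send and retry that batch one document at a time to isolate and skip the offender, but maybe there's a configuration option I'm missing that does this already?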

