Hi Michael,

Good to hear my suggestion helped you!
Kind Regards,
Furkan KAMACI

On Wed, May 3, 2017 at 8:55 PM, Michael Coffey <[email protected]> wrote:

> Thank you, Furkan, for the excellent suggestion.
>
> Once I found the solr logs, it was not too hard to discover that there
> were OutOfMemory exceptions neighboring the "possible analysis" errors.
> I was able to fix it by boosting the Java heap size (not easy to do
> using docker). Blaming solr for the misleading messages!
>
> Hi Michael,
>
> What do you have in your Solr logs?
>
> Kind Regards,
> Furkan KAMACI
>
> On Tue, May 2, 2017 at 2:45 AM, Michael Coffey
> <[email protected]> wrote:
>
> > I know this might be more of a SOLR question, but I bet some of you
> > know the answer.
> >
> > I've been using Nutch1.12 + SOLR 5.4.1 successfully for several weeks,
> > but suddenly I am having frequent problems. My recent changes have
> > been (1) indexing two segments at a time, instead of one, and (2)
> > indexing larger segments than before.
> >
> > The segments are still not terribly large, just 24000 each, for a
> > total of 48000 in the two-segment job.
> >
> > Here is the exception I get:
> >
> > 17/05/01 07:29:34 INFO mapreduce.Job: map 100% reduce 67%
> > 17/05/01 07:29:42 INFO mapreduce.Job: Task Id : attempt_1491521848897_3507_r_000000_2, Status : FAILED
> > Error: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://coderox.xxx.com:8984/solr/popular: Exception writing document id http://0-0.ooo/ to the index; possible analysis error.
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
> >   at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
> >   at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:209)
> >   at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:173)
> >   at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:85)
> >   at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
> >   at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
> >   at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
> >   at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
> >   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:367)
> >   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
> >   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
> >   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> >   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at javax.security.auth.Subject.doAs(Subject.java:415)
> >   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> >
> > Of course, the document URL is different each time.
> >
> > It looks to me like it's complaining about an individual document.
> > This is surprising because it didn't happen at all for the first two
> > million documents I indexed.
> >
> > Have you any suggestions on how to debug this? Or how to make it
> > ignore occasional single-document errors without freaking out??
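
For anyone who hits the same symptom: the "possible analysis error" message from the client side is generic, and the real cause (here an OutOfMemoryError) only shows up in the Solr server log. A quick way to check, assuming a default install where the log lives under server/logs/solr.log; the install path and container name below are placeholders, not anything from this thread:

    # look for the underlying error next to the "possible analysis" entries
    grep -i "OutOfMemoryError" /opt/solr/server/logs/solr.log

    # or, if Solr runs in Docker, search the container output instead
    docker logs your-solr-container 2>&1 | grep -i "OutOfMemoryError"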

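As for the heap fix Michael describes: with the stock docker-solr images the heap can usually be raised from outside the container, via the SOLR_HEAP (or SOLR_JAVA_MEM) variable that bin/solr reads, rather than by editing files inside it. A minimal sketch, assuming the official solr:5.4.1 image, the 8984 port mapping used in this thread, and a 2g heap picked purely as an example:

    # start Solr with a larger JVM heap; image tag, port mapping and heap
    # size are assumptions -- use values that fit your own setup
    docker run -d --name solr -p 8984:8983 -e SOLR_HEAP=2g solr:5.4.1

Outside Docker, the equivalent is bin/solr start -m 2g, or setting SOLR_HEAP in solr.in.sh.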
