Hi Michael,

Good to hear my suggestion helped you!
Kind Regards,
Furkan KAMACI

On Wed, May 3, 2017 at 8:55 PM, Michael Coffey <[email protected]> wrote:

> Thank you, Furkan, for the excellent suggestion.
>
> Once I found the solr logs, it was not too hard to discover that there
> were OutOfMemory exceptions neighboring the "possible analysis" errors.
> I was able to fix it by boosting the Java heap size (not easy to do
> using docker). Blaming solr for the misleading messages!
>
> Hi Michael,
>
> What do you have in your Solr logs?
>
> Kind Regards,
> Furkan KAMACI
>
> On Tue, May 2, 2017 at 2:45 AM, Michael Coffey
> <[email protected]> wrote:
>
> > I know this might be more of a SOLR question, but I bet some of you
> > know the answer.
> >
> > I've been using Nutch1.12 + SOLR 5.4.1 successfully for several weeks,
> > but suddenly I am having frequent problems. My recent changes have
> > been (1) indexing two segments at a time, instead of one, and (2)
> > indexing larger segments than before.
> >
> > The segments are still not terribly large, just 24000 each, for a
> > total of 48000 in the two-segment job.
> >
> > Here is the exception I get:
> >
> > 17/05/01 07:29:34 INFO mapreduce.Job: map 100% reduce 67%
> > 17/05/01 07:29:42 INFO mapreduce.Job: Task Id : attempt_1491521848897_3507_r_000000_2, Status : FAILED
> > Error: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://coderox.xxx.com:8984/solr/popular: Exception writing document id http://0-0.ooo/ to the index; possible analysis error.
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
> >   at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
> >   at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:209)
> >   at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:173)
> >   at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:85)
> >   at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
> >   at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
> >   at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
> >   at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
> >   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:367)
> >   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
> >   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
> >   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> >   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at javax.security.auth.Subject.doAs(Subject.java:415)
> >   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> >
> > Of course, the document URL is different each time.
> >
> > It looks to me like it's complaining about an individual document.
> > This is surprising because it didn't happen at all for the first two
> > million documents I indexed.
> >
> > Have you any suggestions on how to debug this? Or how to make it
> > ignore occasional single-document errors without freaking out??
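
For anyone who hits the same symptom: the "possible analysis error" message from the client side is generic, and the real cause (here an OutOfMemoryError) only shows up in the Solr server log. A quick way to check, assuming a default install where the log lives under server/logs/solr.log; the install path and container name below are placeholders, not anything from this thread:

    # look for the underlying error next to the "possible analysis" entries
    grep -i "OutOfMemoryError" /opt/solr/server/logs/solr.log

    # or, if Solr runs in Docker, search the container output instead
    docker logs your-solr-container 2>&1 | grep -i "OutOfMemoryError"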

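As for the heap fix Michael describes: with the stock docker-solr images the heap can usually be raised from outside the container, via the SOLR_HEAP (or SOLR_JAVA_MEM) variable that bin/solr reads, rather than by editing files inside it. A minimal sketch, assuming the official solr:5.4.1 image, the 8984 port mapping used in this thread, and a 2g heap picked purely as an example:

    # start Solr with a larger JVM heap; image tag, port mapping and heap
    # size are assumptions -- use values that fit your own setup
    docker run -d --name solr -p 8984:8983 -e SOLR_HEAP=2g solr:5.4.1

Outside Docker, the equivalent is bin/solr start -m 2g, or setting SOLR_HEAP in solr.in.sh.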
