Hi Michael,

What do you have in your Solr logs?
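The "possible analysis error" message is Solr wrapping the real cause, and the client-side RemoteSolrException only carries that summary; the full stack trace with the underlying cause (often a field value that trips an analyzer, e.g. a token too large for the index, or a schema mismatch) is written server side to solr.log. As a minimal sketch for isolating it, assuming SolrJ 5.x and the core URL from your stack trace (the ReplayDoc class name and the id/url fields are only illustrative, modeled on the standard Nutch schema), you could replay one suspect document outside Nutch and watch solr.log while it fails:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ReplayDoc {
  public static void main(String[] args) throws Exception {
    // Base URL copied from the stack trace; adjust for your setup.
    SolrClient client = new HttpSolrClient("http://coderox.xxx.com:8984/solr/popular");
    try {
      SolrInputDocument doc = new SolrInputDocument();
      // Illustrative fields only: copy the real field values of the
      // failing document from your segment to reproduce the error.
      doc.addField("id", "http://0-0.ooo/");
      doc.addField("url", "http://0-0.ooo/");
      client.add(doc);
      client.commit();
    } finally {
      client.close();
    }
  }
}

If that reproduces the failure, solr.log should show the real exception behind "possible analysis error".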
Kind Regards,
Furkan KAMACI

On Tue, 2 May 2017 at 02:45, Michael Coffey <[email protected]> wrote:

> I know this might be more of a SOLR question, but I bet some of you know
> the answer.
>
> I've been using Nutch 1.12 + SOLR 5.4.1 successfully for several weeks, but
> suddenly I am having frequent problems. My recent changes have been (1)
> indexing two segments at a time, instead of one, and (2) indexing larger
> segments than before.
>
> The segments are still not terribly large, just 24000 each, for a total of
> 48000 in the two-segment job.
>
> Here is the exception I get:
>
> 17/05/01 07:29:34 INFO mapreduce.Job:  map 100% reduce 67%
> 17/05/01 07:29:42 INFO mapreduce.Job: Task Id : attempt_1491521848897_3507_r_000000_2, Status : FAILED
> Error: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://coderox.xxx.com:8984/solr/popular: Exception writing document id http://0-0.ooo/ to the index; possible analysis error.
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:241)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:230)
>     at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.push(SolrIndexWriter.java:209)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.write(SolrIndexWriter.java:173)
>     at org.apache.nutch.indexer.IndexWriters.write(IndexWriters.java:85)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
>     at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
>     at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.write(ReduceTask.java:493)
>     at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:422)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:367)
>     at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>
> Of course, the document URL is different each time.
>
> It looks to me like it's complaining about an individual document. This is
> surprising because it didn't happen at all for the first two million
> documents I indexed.
>
> Have you any suggestions on how to debug this? Or how to make it ignore
> occasional single-document errors without freaking out?
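Regarding your second question, about ignoring occasional single-document errors: as far as I know, the stock Nutch 1.12 SolrIndexWriter pushes documents in batches (the batch size comes from solr.commit.size), so one bad document fails the whole batch and, after the retry limit, the reduce task. I am not aware of a configuration switch that skips bad documents; dropping solr.commit.size to 1 would isolate the offender, at the cost of per-document requests. Another workaround is to patch the writer to fall back to one-at-a-time adds when a batch fails, dropping only the offender. A rough, hypothetical sketch (TolerantPush is not part of Nutch, just an illustration of the fallback):

import java.io.IOException;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

/** Hypothetical helper: batched add with a one-by-one fallback. */
public class TolerantPush {
  public static void push(SolrClient client, List<SolrInputDocument> batch)
      throws IOException {
    try {
      client.add(batch); // fast path: whole batch in one request
    } catch (SolrServerException | SolrException e) {
      // One bad document poisons the batch; retry individually and
      // skip only the offender instead of failing the whole task.
      for (SolrInputDocument doc : batch) {
        try {
          client.add(doc);
        } catch (SolrServerException | SolrException single) {
          System.err.println("Skipping doc " + doc.getFieldValue("id")
              + ": " + single.getMessage());
        }
      }
    }
  }
}

The trade-off is one extra request per document, but only for a failing batch, so the fast batched path remains the common case.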

