SolrDedup broken
----------------

                 Key: NUTCH-1100
                 URL: https://issues.apache.org/jira/browse/NUTCH-1100
             Project: Nutch
          Issue Type: Bug
          Components: indexer
    Affects Versions: 1.4
            Reporter: Markus Jelsma
             Fix For: 1.4


Some Solr indices cannot be deduplicated from Nutch. For reasons not yet known,
Nutch throws the exception below. The Solr logs show nothing unusual; the
queries look normal and appear to succeed.

{code}
java.lang.NullPointerException
        at org.apache.hadoop.io.Text.encode(Text.java:388)
        at org.apache.hadoop.io.Text.set(Text.java:178)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:272)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:243)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
{code}
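
The top frames show Text.set(String) failing inside Text.encode, which is what
happens when it is handed a null string. One possible reading, not confirmed in
this report, is that a document returned by Solr lacks a field that the record
reader copies into a Text. The sketch below illustrates a null guard for that
case without depending on Hadoop or SolrJ: a plain Map stands in for a Solr
document, and the field names "id" and "digest" as well as both helper methods
are hypothetical, not taken from SolrDeleteDuplicates.

```java
import java.util.HashMap;
import java.util.Map;

public class NullSafeFieldSketch {

    /** Returns the field value as a String, or null when the field is absent. */
    static String fieldOrNull(Map<String, Object> doc, String name) {
        Object value = doc.get(name);
        return value == null ? null : value.toString();
    }

    /** True only when every named field is present, so none would reach Text.set as null. */
    static boolean hasRequiredFields(Map<String, Object> doc, String... names) {
        for (String name : names) {
            if (fieldOrNull(doc, name) == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for a document that carries both assumed fields.
        Map<String, Object> complete = new HashMap<>();
        complete.put("id", "http://example.com/");
        complete.put("digest", "abc123");

        // Stand-in for a document without the assumed "digest" field.
        Map<String, Object> partial = new HashMap<>();
        partial.put("id", "http://example.com/other");

        System.out.println(hasRequiredFields(complete, "id", "digest")); // true
        System.out.println(hasRequiredFields(partial, "id", "digest"));  // false
    }
}
```

A reader using such a check could skip or log the incomplete document instead
of failing the whole map task. Whether skipping is the right behaviour for
dedup is a separate decision.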

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
