Hi,

This is a known issue. The SolrDeleteDuplicates class in Nutch 1.7 trunk
deletes Solr records by their id field. Your documents don't have the url
field, so their id is empty, and the job throws a NullPointerException
when it runs.
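
To illustrate the failure mode, here is a minimal sketch in plain Java. It
is not the Nutch code: the class NullIdGuard and the method usableKeys are
hypothetical names, and documents are modelled as simple maps instead of
Solr documents. It assumes, as described above, that the missing field is
id, and shows the kind of null check that keeps a null value from being
used as a record key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch: skips records whose key field
// is missing instead of passing a null value further down the pipeline.
final class NullIdGuard {

    // Returns the value of keyField for every document that has one.
    // Each document is modelled as a plain field-name -> value map.
    static List<String> usableKeys(List<Map<String, Object>> docs, String keyField) {
        List<String> keys = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(keyField);
            if (value == null) {
                // Without this check, a null would reach the code that
                // builds the record key (Text.set in the quoted trace).
                continue;
            }
            keys.add(value.toString());
        }
        return keys;
    }
}
```

Calling NullIdGuard.usableKeys(docs, "id") would return only the ids of
documents that actually carry one; documents without an id are skipped.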

I am writing on my phone right now, so I could not look up the issue. But
if you update from 1.7 to a newer version, you will not get this error.

Talat
On Sep 2, 2014 10:22 AM, <vinay.kash...@socialinfra.net> wrote:

> Hi,
> I have taken the Nutch 1.7 source, copied
> mapred-site.xml, hdfs-site.xml, yarn-site.xml, hadoop-env.sh and
> core-site.xml from my Hadoop 2.3.0-cdh5.1.0 installation, and did an
> ant build.
> Then I went to runtime/deploy/bin to start the crawl. It successfully
> submitted the jobs to my YARN cluster, but later, during indexing to
> Solr, I'm getting the exception below.
> I have copied schema-solr4.xml to my Solr and added exceptions in
> regex-urlfilter.txt for the particular website that I give for crawling
> in urls/seed.txt.
> Error:
> java.lang.NullPointerException
>     at org.apache.hadoop.io.Text.encode(Text.java:443)
>     at org.apache.hadoop.io.Text.set(Text.java:198)
>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>
> Could anyone kindly tell me how to solve this issue? I'm basically
> stuck here!
>
