That configuration is wrong. It should look like this:

<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-(regex|validator)|parse-(html|tika|metatags|js|swf)|index-(basic|anchor|metadata|more)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>

Put this in nutch-site.xml, and afterwards rebuild with ant clean
runtime
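As a quick sanity check, assuming grep -E's regex dialect is close enough to the Java regex Nutch applies to plugin ids, you can confirm the updated plugin.includes pattern now matches urlfilter-validator:

```shell
# plugin.includes is matched as a regular expression against plugin ids.
# Verify that urlfilter-validator (and the existing urlfilter-regex) match.
PATTERN='protocol-http|urlfilter-(regex|validator)|parse-(html|tika|metatags|js|swf)|index-(basic|anchor|metadata|more)|scoring-opic|urlnormalizer-(pass|regex|basic)'
echo 'urlfilter-validator' | grep -Ec "$PATTERN"   # prints 1 when it matches
```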

Talat

[email protected] şunu yazdı:

>Hi Talat,
>No, I am not using the urlfilter-validator plugin. Here is my list of plugins:
>
><property>
>  <name>plugin.includes</name>
>  <value>protocol-http|urlfilter-regex|parse-(html|tika|metatags|js|swf)|index-(basic|anchor|metadata|more)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
></property>
>
>
>Do I just need to change this to:
>
><property>
>  <name>plugin.includes</name>
>  <value>protocol-http|urlfilter-regex|parse|validator-(html|tika|metatags|js|swf)|index-(basic|anchor|metadata|more)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
></property>
>
>Thank you so much,
>
>
>
>Madhvi
>
>On 11/6/13 1:08 PM, "Talat UYARER" <[email protected]> wrote:
>
>>Hi Madhvi,
>>
>>Can you tell me which plugins are active in your nutch-site.xml? I am
>>not sure, but we have an issue similar to this: if your Solr returns null,
>>that could be the cause. Please check the data your Solr returns.
>>
>>You can look at https://issues.apache.org/jira/browse/NUTCH-1100
>>
>>If yours is the same, you should use the urlfilter-validator plugin.
>>
>>urlfilter-validator has lots of benefits; I described them in
>>http://mail-archives.apache.org/mod_mbox/nutch-user/201310.mbox/%3c5265BC2[email protected]%3e
>>
>>Talat
>>
>>[email protected] şunu yazdı:
>>
>>>I am going to start my own thread rather than being under javozzo's
>>>thread :)!
>>>
>>>Hi,
>>>
>>>
>>>I am using Nutch 1.5.1 and Solr 3.6 and am having a problem with the
>>>SolrDeleteDuplicates command. Looking at the Hadoop logs, I am getting this error:
>>>
>>>java.lang.NullPointerException
>>>at org.apache.hadoop.io.Text.encode(Text.java:388)
>>>at org.apache.hadoop.io.Text.set(Text.java:178)
>>>at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
>>>at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
>>>at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>>>at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>>>at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>>>at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>>>at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>>at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
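The trace above boils down to a null field value from the Solr response reaching Hadoop's Text.set, which throws inside Text.encode. A plain-Java analogue of the failure and of a null guard, with no Hadoop dependency and illustrative names (NullFieldNpeDemo and safeEncode are not Nutch's API), is:

```java
public class NullFieldNpeDemo {
    // Stand-in for what SolrDeleteDuplicates does with each Solr document:
    // copy a field (e.g. id or digest) into a Text-like holder. Hadoop's
    // Text.encode dereferences the string, so a null value throws an NPE.
    static byte[] encode(String value) {
        // mirrors the dereference inside org.apache.hadoop.io.Text.encode
        return value.getBytes(java.nio.charset.StandardCharsets.UTF_8);
    }

    // A document missing the field yields null; guarding here avoids the NPE.
    static byte[] safeEncode(Object solrFieldValue) {
        if (solrFieldValue == null) {
            return null; // skip instead of crashing the dedup job
        }
        return encode(solrFieldValue.toString());
    }

    public static void main(String[] args) {
        System.out.println(safeEncode("http://example.com/").length); // 19
        System.out.println(safeEncode(null)); // null, instead of an NPE
    }
}
```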
>>>
>>>
>>>I also had a question about updating Nutch to 1.6 or 1.7. I had tried
>>>updating to a newer version of Nutch but got an exception while deleting
>>>duplicates in Solr. After a lot of research online, I found that a field had
>>>changed: some said the digest field, others said that the url field is no
>>>longer there. So here are my questions:
>>>1: Is there a newer Solr mapping file that needs to be used?
>>>2: Can the Solr index from 1.5.1 and the index from a newer version co-exist,
>>>or do we need to re-index from one version of Nutch?
>>>
>>>I would really appreciate any help with this.
>>>
>>>
>>>Thanks in advance,
>>>Madhvi
>>>
>>>Madhvi Arora
>>>AutomationDirect
>>>The #1 Best Mid-Sized Company to work for in
>>>Atlanta<http://www.ajc.com/business/topworkplaces/automationdirect-com-top-midsize-1421260.html> 2012
>>>
>
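For readers following the urlfilter-validator suggestion in the thread above: the plugin's job is to drop syntactically invalid URLs before they reach fetching and indexing. A rough standalone sketch of that idea, using java.net.URI instead of the plugin's actual commons-validator dependency (the class and method names here are illustrative, not Nutch's API), looks like:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UrlValiditySketch {
    // Rough analogue of a Nutch URL filter: return the URL to keep it,
    // or null to drop it (Nutch filters signal rejection by returning null).
    static String filter(String url) {
        try {
            URI u = new URI(url);
            if (u.isAbsolute() && u.getHost() != null
                    && (u.getScheme().equals("http") || u.getScheme().equals("https"))) {
                return url; // well-formed absolute http(s) URL: keep
            }
        } catch (URISyntaxException e) {
            // malformed URL (bad characters, stray spaces, ...): fall through
        }
        return null; // dropped
    }

    public static void main(String[] args) {
        System.out.println(filter("http://example.com/page")); // kept
        System.out.println(filter("http://exa mple.com/"));    // dropped: null
    }
}
```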
