[ https://issues.apache.org/jira/browse/NUTCH-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Boadu Akoto Charles Jnr updated NUTCH-1890:
-------------------------------------------
Description:
Problematic Page: https://wiki.apache.org/nutch/NutchTutorial
1. Duplicated Text
In section "6. Integrate Solr with Nutch", the reader is asked to comment out
the following line, changing it from:
<!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
to
<!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
but I think it should rather read from:
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
to
<!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
was:
Problematic Page: https://wiki.apache.org/nutch/NutchTutorial
1. Duplicated Text
In section "6. Integrate Solr with Nutch", the reader is asked to comment out
the following line, changing it from:
<!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
to
<!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
but I think it should rather read from:
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
to
<!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
2. Addition of extra step
After going through the recommended steps in Section 6 to integrate with Solr,
I got an error that read 'field text not defined'. This happened because my
solrconfig.xml defines 'text' as the default field, but that field is not
defined in the schema.xml that I imported from the Nutch conf folder.
I propose that either the schema.xml in the Nutch conf folder be shipped with
the 'text' field already defined, or an extra step be added to Section 6 that
reads:
Add the following line under the definition of the 'content' field:
<field name="text" type="text" stored="true" indexed="true"/>
or, better still, that steps be added to let the user change the default field
in solrconfig.xml from 'text' to 'content', whichever solution seems the most
appropriate.
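For illustration, changing the default field in solrconfig.xml might look
roughly like the fragment below. This is only a sketch, assuming the stock
Solr 4.x example configuration, where the default field is set through the
"df" parameter of the "/select" request handler. The handler name and the
surrounding parameters are assumptions and may differ in a given install.

```xml
<!-- solrconfig.xml (assumed layout of the stock Solr example config) -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!-- was: <str name="df">text</str> -->
    <str name="df">content</str>
  </lst>
</requestHandler>
```

Other request handlers in the same file may also name 'text' as their default
field and would need the same change. Solr has to be restarted, or the core
reloaded, before the new default takes effect.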
> Major Typo in Documentation
> ---------------------------
>
> Key: NUTCH-1890
> URL: https://issues.apache.org/jira/browse/NUTCH-1890
> Project: Nutch
> Issue Type: Bug
> Components: documentation
> Affects Versions: 1.9
> Environment: web url: https://wiki.apache.org/nutch/NutchTutorial
> Reporter: Boadu Akoto Charles Jnr
> Labels: bug, docuentation, ommission
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Problematic Page: https://wiki.apache.org/nutch/NutchTutorial
> 1. Duplicated Text
> In section "6. Integrate Solr with Nutch", the reader is asked to comment out
> the following line, changing it from:
> <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
> to
> <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
> but I think it should rather read from:
> <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> to
> <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)