Hi,
my Nutch crawl job and the indexing with Solr work fine, except for the
subcollection. I configured subcollection.xml:
<subcollections>
    <subcollection>
        <name>wiki</name>
        <id>wiki</id>
        <whitelist>/plugins/mediawiki/wiki/</whitelist>
        <blacklist />
    </subcollection>
</subcollections>
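
As a sanity check on the whitelist, here is a minimal Python sketch of how I understand the matching. It assumes that a whitelist or blacklist entry matches when it occurs as a plain substring of the URL and that a blacklist hit wins; that rule is my assumption and is not verified against the Nutch source. The function name `matches_subcollection` and the example.org URLs are hypothetical.

```python
def matches_subcollection(url, whitelist, blacklist=()):
    """Return True if url falls into the subcollection under the assumed rule."""
    # Assumption: a blacklist hit always excludes the URL.
    if any(entry in url for entry in blacklist):
        return False
    # Assumption: any whitelist entry found as a substring includes the URL.
    return any(entry in url for entry in whitelist)


# Whitelist entry taken from the configuration above.
whitelist = ["/plugins/mediawiki/wiki/"]

# Hypothetical URLs, for illustration only.
print(matches_subcollection("http://example.org/plugins/mediawiki/wiki/Main_Page", whitelist))
print(matches_subcollection("http://example.org/index.html", whitelist))
```

If the crawled wiki URLs do not contain the whitelist string exactly, the field would stay empty under this assumption.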

and added the plugin in nutch-site.xml:
<configuration>
    <property>
        <name>http.agent.name</name>
        <value>mediawiki</value>
    </property>

    <property>
        <name>plugin.includes</name>
        <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor|more)|subcollection|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
    </property>
</configuration>

When I inspect the index with Luke, there is no subcollection field.

Does anybody have experience with this problem, or an idea that might help?
Thanks and greetings

psimone
