I am using Nutch 1.8 with Solr 4.3 and I want to index two custom meta tags
that we have on our site. I have followed the tutorial at
http://wiki.apache.org/nutch/IndexMetatags but I cannot get it to work. If I
run parsechecker, it shows that the fields are being parsed, but if I run
indexchecker, those fields do not appear, nor do they appear in Solr. So it
seems that the tags are being parsed but, for some reason, not indexed.

Here is the pertinent section of my nutch-site.xml, which shows the
configuration for the parse-metatags and index-metadata plugins:
<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|lib-nekohtml|lib-http|lib-regex-filter|protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor|metadata)|scoring-opic|urlnormalizer-(pass|regex|basic)|indexer-solr</value>
</property>
<property>
  <name>metatags.names</name>
  <value>groupsAllowed;gTitle</value>
</property>
<property>
  <name>index.parse.md</name>
  <value>metatag.groupsAllowed,metatag.gTitle</value>
</property>
I have added the following fields to schema-solr4.xml and to the Solr schema
as well:
<field name="metatag.groupsAllowed" type="text_general" stored="true" indexed="true"/>
<field name="metatag.gTitle" type="text_general" stored="true" indexed="true"/>
Any help would be greatly appreciated.
- Michael