Hello Mike,

Please remove the metatag.* prefix in the index.parse.md config, and I think
you should be fine.
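
As a rough sketch of that change: this assumes the headings plugin stores its values in the parse metadata under plain h1 ... h6 keys, while parse-metatags keeps the metatag. prefix for its own keys. Both points are assumptions here, so compare them with the keys that parsechecker prints. Under those assumptions the property could look like this:

```xml
<property>
  <name>index.parse.md</name>
  <!-- Assumption: heading keys carry no "metatag." prefix;
       the first three keys come from parse-metatags and keep it. -->
  <value>metatag.description,metatag.keywords,metatag.rating,h1,h2,h3,h4,h5,h6</value>
</property>
```

If the fields then already arrive as h1 to h6, the rename entries with source="metatag.h1" and so on in index-writers.xml would presumably no longer match anything and could be dropped.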

Regards,
Markus

On Mon, 31 Oct 2022 at 12:32, Mike <mz579...@gmail.com> wrote:

> Yes, sorry, I also forgot to post this setting:
>
> <property>
>    <name>index.parse.md</name>
>    <value>metatag.description,metatag.keywords,metatag.rating,metatag.h1,metatag.h2,metatag.h3,metatag.h4,metatag.h5,metatag.h6</value>
>    <description>
>    Comma-separated list of keys to be taken from the parse metadata to generate fields.
>    Can be used e.g. for 'description' or 'keywords' provided that these values are generated
>    by a parser (see parse-metatags plugin)
>    </description>
> </property>
>
> The Nutch parsechecker shows me the fields but the indexchecker doesn't.
>
> On Mon, 31 Oct 2022 at 04:51, Mike <mz579...@gmail.com> wrote:
>
> > Hello!
> >
> > I've tried everything and set everything up to get the Nutch headings
> > plugin working:
> >
> > nutch-site.xml
> >
> > <property>
> >   <name>plugin.includes</name>
> >   <value>protocol-okhttp|...|parse-(html|tika|text|metatags)|index-(basic|anchor|more|metadata)|...|headings|nutch-extensionpoints</value>
> > </property>
> >
> > schema.xml
> >
> >
> > <!-- fields for the headings plugin -->
> > <field name="h1" type="text_general" stored="true" indexed="true"
> > multiValued="true"/>
> > <field name="h2" type="text_general" stored="true" indexed="true"
> > multiValued="true"/>
> > <field name="h3" type="text_general" stored="true" indexed="true"
> > multiValued="true"/>
> > <field name="h4" type="text_general" stored="true" indexed="true"
> > multiValued="true"/>
> > <field name="h5" type="text_general" stored="true" indexed="true"
> > multiValued="true"/>
> > <field name="h6" type="text_general" stored="true" indexed="true"
> > multiValued="true"/>
> >
> > index-writers.xml
> >   <mapping>
> >       <rename>
> >         <field source="metatag.h1" dest="h1"/>
> >         <field source="metatag.h2" dest="h2"/>
> >         <field source="metatag.h3" dest="h3"/>
> >         <field source="metatag.h4" dest="h4"/>
> >         <field source="metatag.h5" dest="h5"/>
> >         <field source="metatag.h6" dest="h6"/>
> >       </rename>
> > ...
> >
> > After indexing to Solr, there are no HTML heading tags in my Solr index.
> > What's missing?
> >
> > thanks!
> >
>
