You should see the fields with the parsechecker tool but not with the indexchecker, because you haven't included an indexing filter plugin that reads and emits what the parse filter outputs. Use the index-metadata plugin.
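
If it helps, here is a rough sketch of a script that checks a nutch-site.xml for the pieces involved. It is only an illustration, not part of Nutch: the function names and the sample config are made up, and it assumes that plugin.includes is a regular expression matched against the whole plugin id. Point it at whichever nutch-site.xml your runtime actually loads.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample config: the parse filter is enabled, but no indexing
# filter that would emit the parsed metatags (index-metadata) is included.
SAMPLE = """<configuration>
<property>
  <name>index.parse.md</name>
  <value>metatag.description,metatag.keywords</value>
</property>
<property>
  <name>metatags.names</name>
  <value>description;keywords</value>
</property>
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor)|scoring-opic</value>
</property>
</configuration>"""


def load_properties(xml_text):
    """Return a {name: value} dict built from the <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def plugin_enabled(props, plugin_id):
    """True if plugin_id fully matches the plugin.includes expression.

    Assumption: the whole plugin id has to match, not just a substring.
    """
    pattern = props.get("plugin.includes", "")
    return bool(pattern) and re.fullmatch(pattern, plugin_id) is not None


def check(props):
    """Collect human-readable problems; an empty list means none were found."""
    problems = []
    for plugin_id in ("parse-metatags", "index-metadata"):
        if not plugin_enabled(props, plugin_id):
            problems.append(plugin_id + " is not matched by plugin.includes")
    for key in ("metatags.names", "index.parse.md"):
        if not props.get(key):
            problems.append(key + " is empty or missing")
    return problems


if __name__ == "__main__":
    for line in check(load_properties(SAMPLE)) or ["no obvious problems found"]:
        print(line)
```

To check a real file, read it and pass its text to load_properties() in place of SAMPLE. A clean result only rules out these settings; it does not prove the plugins were actually built and loaded.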

On Thu, 3 May 2012 00:25:42 -0700 (PDT), ML mail <[email protected]> wrote:
Dear Lewis,

Thanks for the README about the parse-metatags plugin. I have now
double checked and I have the metatags.names property in my
nutch-site.xml config file as well as the other required properties.
Still when running "nutch indexchecker URL" I don't see any
description or keywords fields :( 

Below I have pasted the relevant parts of my nutch-site.xml config file:

<property>
        <name>index.parse.md</name>
        <value>metatag.description,metatag.keywords</value>
</property>


<property>
        <name>metatags.names</name>
        <value>description;keywords</value>
</property>


<property>
        <name>plugin.includes</name>
        <value>protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor|metadata)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>

As far as I know this all looks correct but maybe you can see
something wrong? or anything else I might check?

Regards



________________________________
 From: Lewis John Mcgibbney <[email protected]>
To: [email protected]; ML mail <[email protected]>
Sent: Wednesday, May 2, 2012 12:49 PM
Subject: Re: Indexing meta tags in Nutch 1.4

Hi,

Please also see the README Julien kindly provided with the
parse-metatags plugin.


https://svn.apache.org/viewvc/nutch/trunk/src/plugin/parse-metatags/README.txt?view=markup

I'm hoping there should be enough info to get it working flawlessly.
Remember, any changes you make to your config files should really be
recompiled before moving on to a more serious deployment.

On Tue, May 1, 2012 at 12:38 PM, ML mail <[email protected]> wrote:
Hi Lewis,

Thanks to your explanations, I managed to get the parse-metatags plugin built and installed into the runtime/local/plugins directory. So now I have the index-metatags from the ZIP file as well as the parse-metatags plugin from the patch installed, and I wanted to check whether they are working. I followed the guide on http://wiki.apache.org/nutch/IndexMetatags step by step and came to the part where you check for the metatag fields with the "nutch indexchecker URL" command. Unfortunately, in the output of that command I don't see any keywords or description fields :( just the usual ones (site, title, content, etc.).

Am I missing something here?

Also let me know if you need more details or my nutch-site.xml config file...

Regards

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536600 / 06-50258350
