http://issues.apache.org/jira/browse/NUTCH-192
On 31.01.2006 at 23:35, Vanderdray, Jacob wrote:
What's jira? I'm actually in the process of writing a set of plugins
to process meta tags that you define in the nutch-site.xml file, so
I'd be interested in reading about what's being worked on.
Thanks,
Jake.
-----Original Message-----
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 31, 2006 5:25 PM
To: [email protected]
Subject: Re: adding meta to domain
Meta data support is currently under development and is coming soon.
See jira for the latest discussion.
In any case, you can already write an index filter plugin; see the
cool, fresh wiki documentation for that.
On 31.01.2006 at 23:25, Sunnyvale Fl wrote:
I need to add some meta data to the index at crawl time and I am
wondering what is the best way to do it. For example, for everything
in site www.foo.com, I need to add a meta tag that says source=branchA,
and for www.bar.com source=branchB. These meta data are NOT directly
available from the source content but can be found from a table lookup
of key:URL to value:meta. I am thinking I can write a plugin to index
an additional meta field, but what would be the best way to tweak the
crawler to do the table lookup?
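
For illustration only, the table lookup described above could be sketched in plain Java roughly as follows. This is a minimal sketch, not Nutch code: the class name SourceLookup and its methods are hypothetical, the table is keyed by host name (an assumption; the question says key:URL), and wiring the result into an index filter plugin as an extra field is left out because that depends on the Nutch version in use.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Nutch: maps a page URL to a "source" value
// through a host-keyed lookup table.
class SourceLookup {
    // Assumption: one table entry per host, e.g. www.foo.com -> branchA.
    private final Map<String, String> hostToSource = new HashMap<>();

    // Register a host and the meta value that pages on that host should get.
    void put(String host, String source) {
        hostToSource.put(host.toLowerCase(Locale.ROOT), source);
    }

    // Return the meta value for the URL's host, or null if the URL cannot be
    // parsed, has no host, or the host is not in the table.
    String sourceFor(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return null;
            }
            return hostToSource.get(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        SourceLookup table = new SourceLookup();
        table.put("www.foo.com", "branchA");
        table.put("www.bar.com", "branchB");

        // An indexing plugin could call sourceFor() with the page URL and,
        // when the result is non-null, add it to the document as "source".
        System.out.println(table.sourceFor("http://www.foo.com/page.html"));
        System.out.println(table.sourceFor("http://www.bar.com/"));
    }
}
```

Keying on the host keeps the table small when every page of a site shares one value; a table keyed on full URLs or URL prefixes would need a different matching step.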