Evgeny,

I doubt that it is possible to add a custom field before the crawling
process. However, your problem has a solution: write an indexing plugin
that will be called at the indexing stage. You can easily add a custom
field at that point. You will have to put your URLs and categories into
a database for fast access, though.
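
As a rough sketch of the lookup side only, something like the class below
could be what the indexing plugin consults for each URL. It is not Nutch
code: CategoryLookup, parse and categoriesFor are hypothetical names, the
input is assumed to be whitespace-separated lines of the form
"url category1 category2 ..." without literal brackets, and an in-memory
map stands in for the database. The Nutch plugin class itself is left out
because its interface depends on the Nutch version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Nutch): maps a URL to its DMOZ categories.
// An in-memory map is used here as a stand-in for a real database.
final class CategoryLookup {
    private final Map<String, List<String>> categoriesByUrl = new HashMap<>();

    // Assumed line format: "<url> <category1> <category2> ...",
    // fields separated by whitespace. Blank lines are skipped.
    static CategoryLookup parse(Iterable<String> lines) {
        CategoryLookup lookup = new CategoryLookup();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            List<String> categories =
                Arrays.asList(parts).subList(1, parts.length);
            // A URL that appears on several lines accumulates all its categories.
            lookup.categoriesByUrl
                .computeIfAbsent(parts[0], k -> new ArrayList<>())
                .addAll(categories);
        }
        return lookup;
    }

    // Returns the categories for a URL, or an empty list if the URL is unknown.
    List<String> categoriesFor(String url) {
        List<String> found = categoriesByUrl.get(url);
        return found == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(found);
    }
}
```

In the indexing plugin you would call categoriesFor(url) for the page being
indexed and add each returned value to the document as a "category" field.
For a full DMOZ list the map would probably be too large to keep in memory,
which is why a database is the better fit.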

Regards,

Arkadi

> -----Original Message-----
> From: Evgeny Zhulenev [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 01, 2008 8:52 AM
> To: [email protected]
> Subject: Custom fields
> 
> Hi.
> 
> I want to add some new fields to each page. I found some articles about
> it saying that it's possible to do using plugins, but there are some
> things I still don't understand how to do.
> 
> For example, I've got a list of sites from DMOZ. They are stored in a
> text file. Each line contains data in the format: [url] [category1]
> [category2] [category3] - the URL of a page and a list of categories in
> which this site is listed. One site can be listed in one, two or more
> categories at one time. I want to start Nutch crawling these URLs and
> add category information to each URL: a field "category" that will
> contain a list of categories, so that it would be possible to search
> only sites from a given category. So how is it possible to do this?
> 
> All the articles that I found apply when the data for the new custom
> field is retrieved from the web page during the crawling process (for
> example, metadata from HTML tags). But how do I add custom field data
> before the crawling process?
> 
> Thanks

