Evgeny, I doubt that it is possible to add a custom field before the crawling process. However, your problem has a solution: write an indexing plugin, which is called at the indexing stage, and add the custom field there. You will have to put your URLs and categories into a database for fast access, though.
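
As a rough illustration of the lookup side only, here is a minimal sketch that loads your text file format into memory. It assumes the URL and categories on each line are separated by whitespace, and that the list is small enough to fit in a map (for a large DMOZ dump a real database would take its place). The class CategoryLookup and its methods are hypothetical names, not part of Nutch; the indexing plugin itself is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Nutch class): maps each URL to its list of categories.
class CategoryLookup {
    private final Map<String, List<String>> categoriesByUrl = new HashMap<>();

    // Parses lines of the assumed form "url category1 category2 ...".
    static CategoryLookup fromLines(List<String> lines) {
        CategoryLookup lookup = new CategoryLookup();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) {
                continue; // blank line, or a URL with no categories
            }
            // A URL that appears on several lines accumulates all of its categories.
            lookup.categoriesByUrl
                  .computeIfAbsent(parts[0], k -> new ArrayList<>())
                  .addAll(Arrays.asList(parts).subList(1, parts.length));
        }
        return lookup;
    }

    // Returns the categories for a URL, or an empty list if the URL is unknown.
    List<String> categoriesFor(String url) {
        return categoriesByUrl.getOrDefault(url, Collections.emptyList());
    }

    public static void main(String[] args) {
        CategoryLookup lookup = CategoryLookup.fromLines(Arrays.asList(
                "http://example.org/ Arts Music",
                "http://example.com/ Science"));
        System.out.println(lookup.categoriesFor("http://example.org/"));
    }
}
```

In the indexing plugin you would call something like categoriesFor(url) for the page being indexed and add one "category" field value per returned category. Note that the URL Nutch passes in may have been normalized, so an exact string match against the DMOZ file may need the same normalization applied to the keys.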
Regards,
Arkadi

> -----Original Message-----
> From: Evgeny Zhulenev [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 01, 2008 8:52 AM
> To: [email protected]
> Subject: Custom fields
>
> Hi.
>
> I want to add some new fields to each page. I found some articles saying
> that it's possible to do this using plugins, but there are some things I
> still don't understand how to do.
>
> For example, I've got a list of sites from DMOZ. They are stored in a text
> file. Each line contains data in the format: [url] [category1] [category2]
> [category3] - the URL of a page and a list of categories in which this site
> is listed. One site can be listed in one, two or more categories at the
> same time. I want to start Nutch crawling these URLs and add category
> information to each URL: a field "category" that will contain a list of
> categories, so that it would be possible to search only sites from a given
> category. So how is it possible to do this?
>
> All the articles I found apply when the data for a new custom field is
> retrieved from the web page during the crawling process (for example,
> metadata from HTML tags). But how can I add custom field data before the
> crawling process?
>
> Thanks
