Hi,

In Nutch, a `synthetic token` maps to a field/value pair. You need an indexing filter that reads the key/value pair from the parse metadata and adds it to the NutchDocument as a field/value pair. You may also need a custom parse filter that extracts the data from wherever it lives and stores it in the parse metadata as a key/value pair, which your indexing filter then picks up.
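To make the two steps concrete, here is a rough sketch of the data flow only. It deliberately uses plain Java maps as stand-ins for the parse metadata and the NutchDocument so that it runs without Nutch on the classpath. The class name, the `synthetic.` key prefix, the `doctype` field and the URL rule are all made up for illustration. In a real plugin the first method would belong in a parse filter and the second in an indexing filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: plain maps model the parse metadata and the indexed document.
// Nothing here is Nutch API; every name below is hypothetical.
class SyntheticTokenSketch {

    // Hypothetical prefix marking metadata keys written by the custom parse filter.
    static final String PREFIX = "synthetic.";

    // Parse-filter step: derive a synthetic token and store it in the metadata.
    // The URL rule is an invented example of "extracting the data from somewhere".
    static void extract(String url, Map<String, List<String>> parseMeta) {
        if (url.contains("/press/")) {
            parseMeta.computeIfAbsent(PREFIX + "doctype", k -> new ArrayList<>())
                     .add("press_release");
        }
    }

    // Indexing-filter step: copy each synthetic key/value pair into document fields,
    // dropping the prefix so the field name is what gets indexed.
    static Map<String, List<String>> toFields(Map<String, List<String>> parseMeta) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : parseMeta.entrySet()) {
            if (!e.getKey().startsWith(PREFIX)) {
                continue; // ordinary metadata is left alone
            }
            String field = e.getKey().substring(PREFIX.length());
            doc.computeIfAbsent(field, k -> new ArrayList<>()).addAll(e.getValue());
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parseMeta = new LinkedHashMap<>();
        parseMeta.put("Content-Type", new ArrayList<>(Arrays.asList("text/html")));
        extract("http://example.com/press/2013/01/launch.html", parseMeta);
        System.out.println(toFields(parseMeta)); // prints {doctype=[press_release]}
    }
}
```

So the answer to "field or metadata" is both: the token travels through the metadata as a key/value pair and ends up as a field on the document.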
Check out the index-basic and index-more plugins for examples.

Cheers,

-----Original message-----
> From:Jakub Moskal <[email protected]>
> Sent: Mon 21-Jan-2013 04:58
> To: [email protected]
> Subject: Synthetic Tokens
>
> Hi,
>
> I would like to develop a plugin that creates synthetic tokens for
> some documents that are crawled by Nutch (as described here:
> http://www.ideaeng.com/synthetic-tokens-need-p2-0604). How can this be
> done in Nutch? Should I create a new field for every new synthetic
> token, or should I add them to metadata? I'm not quite sure how
> fields/metadata relate to the tokens described in the article.
>
> Thanks!
> Jakub

