Hi,
I'm quite new to working with Nutch plugins. I'm trying to store the
term frequency vectors of the documents.
I'm using Nutch 1.4.
From what I've seen, the plugin class is supposed to call
LuceneWriter.addFieldOptions from its addIndexBackendOptions method, like this:
--
public void addIndexBackendOptions(Configuration conf) {
  // add lucene options
  // host is un-stored, indexed and tokenized
  LuceneWriter.addFieldOptions("host", LuceneWriter.STORE.NO,
      LuceneWriter.INDEX.TOKENIZED, conf);
  // site is un-stored, indexed and un-tokenized
  LuceneWriter.addFieldOptions("site", LuceneWriter.STORE.NO,
      LuceneWriter.INDEX.UNTOKENIZED, conf);
  // url is both stored and indexed, so it's both searchable and returned
  LuceneWriter.addFieldOptions("url", LuceneWriter.STORE.YES,
      LuceneWriter.INDEX.TOKENIZED, conf);
  // content is indexed, so that it's searchable, but not stored in index
  LuceneWriter.addFieldOptions("content", LuceneWriter.STORE.NO,
      LuceneWriter.INDEX.TOKENIZED, conf);
  // anchors are indexed, so they're searchable, but not stored in index
  LuceneWriter.addFieldOptions("anchor", LuceneWriter.STORE.NO,
      LuceneWriter.INDEX.TOKENIZED, conf);
  // title is indexed and stored so that it can be displayed
  LuceneWriter.addFieldOptions("title", LuceneWriter.STORE.YES,
      LuceneWriter.INDEX.TOKENIZED, conf);
}
----
The problem is that, as far as I can tell, LuceneWriter no longer exists in
Nutch 1.4 (Lucene 3.5).
What is the correct way to do this?
Thank you very much in advance!
--