On Sun, 10 Nov 2002, Aaron Galea wrote:

> I need to create a filter that extends a tokenfilter whose purpose is
> to generate some synonyms for words in the document using Wordnet.
> Well searching for synonyms using wordnet is not that problematic but
> I need to add the synonym words to Lucene tokenstream before they are
> passed for indexing. However TokenStream class does not support any
> add method. Did anyone ever needed to do this? Can someone suggest an
> alternative of how to add some synonym words to the index?
I have done some related work with Lucene involving query expansion (although not with Wordnet), but it's not clear to me what you're trying to accomplish with synonyms. It sounds to me as though you want to store all the synonyms of a word alongside that word in the index, so that a query containing any one of them matches them all.

If the goal is for every document that contains a synonym of at least one query term to receive a positive score, then it may be easier to do the term expansion by post-processing the query: at query time, get whatever related words you want (e.g., all synonyms of each term) from Wordnet, and add them to the query. I would guess that this would be fast enough for your purposes; it is also more flexible (in case you want to expand or contract your notion of a synonym) and requires no additional index space.

If you have something else in mind, what is it?

Regards,

Joshua O'Madadhain

(Incidentally, I'd be interested to hear, off-list, of your experiences with using Wordnet.)

Per Obscurius...www.ics.uci.edu/~jmadden
Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.
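To make the query-time expansion suggested above concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: the class name SynonymQueryExpander and its synonym table are hypothetical, the table stands in for a real Wordnet lookup, and no Lucene classes are used. The sketch rewrites each query term as a parenthesized OR group, which is the grouping form Lucene's query parser syntax accepts, so the resulting string could then be handed to whatever query parser you already use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: expands query terms with synonyms at query time. */
class SynonymQueryExpander {

    // Stand-in for a Wordnet lookup (assumption): term -> list of synonyms.
    private final Map<String, List<String>> synonyms;

    public SynonymQueryExpander(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    /**
     * Rewrites each whitespace-separated term as an OR group containing
     * the term and its synonyms. Terms without synonyms pass through as-is.
     */
    public String expand(String query) {
        List<String> clauses = new ArrayList<>();
        for (String term : query.trim().toLowerCase().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // empty query
            }
            List<String> alternatives = new ArrayList<>();
            alternatives.add(term);
            for (String synonym : synonyms.getOrDefault(term, List.of())) {
                if (!alternatives.contains(synonym)) {
                    alternatives.add(synonym);
                }
            }
            clauses.add(alternatives.size() == 1
                    ? term
                    : "(" + String.join(" OR ", alternatives) + ")");
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("car", List.of("automobile", "auto"));
        table.put("fast", List.of("quick"));

        SynonymQueryExpander expander = new SynonymQueryExpander(table);
        // prints: (fast OR quick) (car OR automobile OR auto) repair
        System.out.println(expander.expand("fast car repair"));
    }
}
```

Because the index itself is untouched, changing the notion of a synonym only means changing the table (or the Wordnet query behind it); nothing needs to be re-indexed. The sketch assumes single-word synonyms; multi-word synonyms would need to be quoted as phrases before being added to the group.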
