Hi, I'm developing an application using Lucene in which I need to support both stemmed search and, at times, "exact" search.
I see two ways of doing this:

1. Use two indexes: one built with a stemming analyzer and one with a SimpleAnalyzer.
2. Use duplicate fields: one field with stemmed content and one with unstemmed content (for example, the field CONTENT would become CONTENT and CONTENT_RAW).

I'm leaning towards option 2, but I'm interested in any performance implications. If I understand it correctly, Lucene keeps a separate term dictionary for each field. So, apart from the index growing larger (which might affect caching), searching an index with duplicate fields should be no slower when I only query the CONTENT field.

Is this correct?

Magnus
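P.S. To make option 2 concrete, here is a rough toy model of what I have in mind. It is plain Java and does not use the Lucene API at all. The class DualFieldIndexSketch, its methods, and the naiveStem helper are hypothetical names made up for illustration, and the "stemmer" is only a crude stand-in for a real stemming analyzer. It assumes, as I do above, that each field has its own term dictionary, so a lookup in CONTENT never touches the terms stored for CONTENT_RAW.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: NOT Lucene code. All names here are hypothetical.
public class DualFieldIndexSketch {
    // field name -> (term -> ids of the documents containing that term)
    private final Map<String, Map<String, Set<Integer>>> termsByField = new HashMap<>();

    // Crude stand-in for a real stemmer: strips a trailing "ing" or "s".
    public static String naiveStem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Index the same text twice: stemmed into CONTENT, unstemmed into CONTENT_RAW.
    public void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            post("CONTENT", naiveStem(token), docId);
            post("CONTENT_RAW", token, docId);
        }
    }

    private void post(String field, String term, int docId) {
        termsByField.computeIfAbsent(field, f -> new TreeMap<>())
                    .computeIfAbsent(term, t -> new TreeSet<>())
                    .add(docId);
    }

    // Looks only at the dictionary of the requested field.
    // The caller is expected to pass a term already processed the same way
    // as the field's content (stemmed for CONTENT, raw for CONTENT_RAW).
    public Set<Integer> search(String field, String term) {
        Map<String, Set<Integer>> terms = termsByField.get(field);
        if (terms == null) {
            return Collections.<Integer>emptySet();
        }
        Set<Integer> docs = terms.get(term);
        return docs == null ? Collections.<Integer>emptySet() : docs;
    }

    public static void main(String[] args) {
        DualFieldIndexSketch index = new DualFieldIndexSketch();
        index.addDocument(1, "searching the index");
        index.addDocument(2, "search index");
        System.out.println(index.search("CONTENT", "search"));     // [1, 2]
        System.out.println(index.search("CONTENT_RAW", "search")); // [2]
    }
}
```

In this model the stemmed query matches both documents while the "exact" query matches only the second, and the cost of a CONTENT lookup does not depend on how many terms CONTENT_RAW holds. Whether the real index behaves the same way is exactly what I am asking.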