On Fri, Apr 16, 2004 at 08:59:42AM +0200, Magnus Johansson wrote:
> Hi
>
> I'm developing an application using Lucene where I need to
> be able to both search using a stemmer and sometimes using
> "exact" search.
>
> I see two ways of doing this:
>
> 1. Use two indexes. One using a stemming analyzer and one using
> a SimpleAnalyzer
>
> 2. Using duplicate fields. One field with stemmed content and
> one with unstemmed content. (Perhaps the field CONTENT, will be
> CONTENT and CONTENT_RAW)
>
> I'm leaning towards option 2. However I'm interested in any performance
> implications. If I understand it correctly Lucene keeps separate
> term-dictionaries for each field. So besides the index growing larger
> (which might affect caching) it won't be any slower searching the index
> with duplicate fields when I only query on the CONTENT field
>
> Is this correct?
>
>
> Magnus
In the exact same situation I'm using your option 2. There may be some
performance implication, but in my case it is well below what a person
would notice.

incze
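
For illustration only, here is a toy model of option 2. It is not Lucene
code: ToyIndex, toy_stem and simple_tokens are hypothetical names, the
"stemmer" is a crude suffix stripper, and the per-field dictionaries simply
mirror the assumption stated in the question that terms are kept separately
for each field. It shows that a lookup in CONTENT never touches the terms
stored under CONTENT_RAW, so the extra field costs index size rather than
lookup work.

```python
import re
from collections import defaultdict


def simple_tokens(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: strip a few suffixes."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


class ToyIndex:
    """Toy inverted index with one term dictionary per field."""

    def __init__(self):
        # field name -> term -> set of document ids
        self.fields = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, text):
        # Index the same text twice: stemmed in CONTENT, as-is in CONTENT_RAW.
        for token in simple_tokens(text):
            self.fields["CONTENT"][toy_stem(token)].add(doc_id)
            self.fields["CONTENT_RAW"][token].add(doc_id)

    def search_stemmed(self, word):
        # Only the CONTENT dictionary is consulted here.
        term = toy_stem(word.lower())
        return sorted(self.fields["CONTENT"].get(term, ()))

    def search_exact(self, word):
        # Only the CONTENT_RAW dictionary is consulted here.
        return sorted(self.fields["CONTENT_RAW"].get(word.lower(), ()))


if __name__ == "__main__":
    index = ToyIndex()
    index.add(1, "Searching large indexes")
    index.add(2, "He searches the index")
    index.add(3, "index search")

    print(index.search_stemmed("searching"))  # [1, 2, 3]
    print(index.search_exact("searching"))    # [1]
```

The query side has to apply the same analysis as the field it targets: the
stemmed search stems the query word, the exact search does not. In a real
setup that means choosing the analyzer per field at query time as well as at
index time.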