You can construct your own analyzer by starting from a pre-existing Tokenizer (e.g. WhitespaceTokenizer) and wrapping it in any number of TokenFilters (e.g. PorterStemFilter). You can chain as many TokenFilters together as you like to get many different effects.
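
To make the chaining idea concrete, here is a rough sketch in plain Java. It is not Lucene code: AnalyzerSketch, whitespaceTokenize, lowerCase, stripPunctuation and toyStem are hypothetical names made up for illustration, and toyStem is a naive plural stripper, not the Porter algorithm. It only models "one tokenizer, then zero or more filters applied in order".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a Tokenizer + TokenFilter chain; NOT Lucene's API.
final class AnalyzerSketch {

    // "Tokenizer": split on runs of whitespace, keeping case and punctuation.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "TokenFilter": lowercase each token.
    static String lowerCase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // "TokenFilter": strip leading and trailing punctuation.
    static String stripPunctuation(String token) {
        return token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
    }

    // "TokenFilter": toy stemmer (assumption: drops one trailing "s" only).
    static String toyStem(String token) {
        boolean plural = token.length() > 3 && token.endsWith("s");
        return plural ? token.substring(0, token.length() - 1) : token;
    }

    // "Analyzer": run the tokenizer, then apply the filters in the given order.
    @SafeVarargs
    static List<String> analyze(String text, UnaryOperator<String>... filters) {
        List<String> out = new ArrayList<>();
        for (String token : whitespaceTokenize(text)) {
            for (UnaryOperator<String> filter : filters) {
                token = filter.apply(token);
            }
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // No filters: case and punctuation are kept as-is.
        System.out.println(analyze("Erickson, Erick"));
        // Stemming only: capitalization is kept.
        System.out.println(analyze("Text Analyzers", AnalyzerSketch::toyStem));
        // Normalizing chain: lowercase, then strip punctuation.
        System.out.println(analyze("Erickson, Erick",
                AnalyzerSketch::lowerCase, AnalyzerSketch::stripPunctuation));
    }
}
```

With no filters, "Erickson, Erick" yields the tokens "Erickson," and "Erick", neither of which equals "erick" or "erickson". With the lowercase and punctuation filters it yields "erickson" and "erick". Which filters you include, and in what order, is the whole design decision.
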

But I have to ask: why do you want to keep capitalization and punctuation? Do you really want text indexed as "Erickson, Erick" to fail to match the query "erick erickson"? That is more often a source of frustration than a benefit.

HTH
Erick

On Tue, May 18, 2010 at 2:05 PM, Larry Hendrix <lahend...@wisc.edu> wrote:

> Hi,
>
> Right now I'm using Lucene with a basic Whitespace Analyzer, but I'm having
> problems with stemming. Does anyone have a recommendation for other text
> analyzers that handle stemming and also keep capitalization, stop words, and
> punctuation?
>
> Thanks,
> Larry
>
> Larry A. Hendrix, Graduate Student
> Computer Science Department
> University of Wisconsin-Madison
> 1300 University Ave Rm 6749
> Madison, WI 53711
> Office: (608) 263-7624
> lhend...@cs.wisc.edu
> Grambling State University Alum