Hi, I've been following this thread and happen to have a simple TruncatingFilter class I wrote for the same purpose. I think this should do what you want:

import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

/**
 * Truncates every token to at most maxLength chars.
 * Declared final because Lucene asserts that TokenStream subclasses
 * (or their incrementToken() method) are final when assertions are enabled.
 */
public final class TruncatingFilter extends TokenFilter {

  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final int maxLength;

  // Public (the original was protected) so the filter can be created
  // from an Analyzer or factory in another package.
  public TruncatingFilter(TokenStream input, int maxLength) {
    super(input);
    this.maxLength = maxLength;
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false; // upstream is exhausted
    }
    // Shorten the term in place; offsets are deliberately left untouched.
    if (termAtt.length() > maxLength) {
      termAtt.setLength(maxLength);
    }
    return true;
  }
}

Cheers,
Geoff

On Tue, Nov 13, 2012 at 7:54 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> There's nothing in Solr that I know of that does this. It would be a pretty
> easy custom filter to create though....
>
> FWIW,
> Erick
>
>
> On Tue, Nov 13, 2012 at 7:02 AM, Robert Muir <rcm...@gmail.com> wrote:
>
> > On Mon, Nov 12, 2012 at 10:47 PM, Bernd Fehling
> > <bernd.fehl...@uni-bielefeld.de> wrote:
> > > By the way, why does TrimFilter option updateOffset defaults to false,
> > > just keep it backwards compatible?
> > >
> >
> > In my opinion this option should be removed.
> >
> > TokenFilters shouldn't muck with offsets, for a lot of reasons, but
> > especially because its too late to interact with any charfilter.
> >
> > This is the tokenizer's job.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
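
A note on the truncation step in TruncatingFilter: setLength(maxLength) counts Java chars, so a cut can land between the two halves of a surrogate pair and leave a dangling high surrogate at the end of the term. If that matters for your data, something along these lines might help. This is only a sketch in plain Java with no Lucene dependency; the class name SafeTruncate and the method safeLength are made up for illustration and are not part of Lucene.

```java
final class SafeTruncate {

  private SafeTruncate() {}

  /**
   * Returns the largest length that is at most maxLength and does not
   * end in the middle of a surrogate pair.
   */
  static int safeLength(CharSequence term, int maxLength) {
    if (maxLength <= 0) {
      return 0;
    }
    if (term.length() <= maxLength) {
      return term.length(); // nothing to cut
    }
    // If the last kept char is a high surrogate and the first dropped char
    // is its low surrogate, back off by one so the pair is dropped whole.
    if (Character.isHighSurrogate(term.charAt(maxLength - 1))
        && Character.isLowSurrogate(term.charAt(maxLength))) {
      return maxLength - 1;
    }
    return maxLength;
  }
}
```

Since CharTermAttribute is a CharSequence, the filter could then call termAtt.setLength(SafeTruncate.safeLength(termAtt, maxLength)) in place of the plain setLength(maxLength), assuming the helper is on the classpath.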