Hi,

I've been following this thread and happen to have a simple
TruncatingFilter class I wrote for the same purpose.  I think this should
do what you want:



import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

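/** Truncates every term longer than maxLength chars; shorter terms pass through unchanged. */
// Declared final because Lucene asserts that TokenStream implementations
// are final (or have a final incrementToken()).
public final class TruncatingFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final int maxLength;

    // Public so the filter can be constructed from an Analyzer in another package.
    public TruncatingFilter(TokenStream input, int maxLength) {
        super(input);
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must be >= 0: " + maxLength);
        }
        this.maxLength = maxLength;
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (!input.incrementToken()) {
            return false; // upstream is exhausted
        }
        // Only the term text is shortened; offsets and other attributes are untouched.
        if (termAtt.length() > maxLength) {
            termAtt.setLength(maxLength);
        }
        return true;
    }
}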
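
One caveat, in case your terms can contain characters outside the Basic
Multilingual Plane: setLength() counts UTF-16 chars, so a cut at maxLength
can land in the middle of a surrogate pair and leave a dangling high
surrogate. Below is a minimal sketch of a boundary check in plain Java (no
Lucene needed to run it). TruncationSketch, safeLength and truncate are
hypothetical names of mine, not Lucene API.

```java
final class TruncationSketch {

    /**
     * Hypothetical helper: returns the largest length <= maxLength that
     * does not split a UTF-16 surrogate pair in the given term.
     */
    static int safeLength(CharSequence term, int maxLength) {
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must be >= 0: " + maxLength);
        }
        if (term.length() <= maxLength) {
            return term.length(); // nothing to cut
        }
        // The cut would fall between term[maxLength - 1] and term[maxLength].
        if (maxLength > 0
                && Character.isHighSurrogate(term.charAt(maxLength - 1))
                && Character.isLowSurrogate(term.charAt(maxLength))) {
            return maxLength - 1; // back off so the pair is dropped whole
        }
        return maxLength;
    }

    /** Convenience wrapper for plain strings, for illustration only. */
    static String truncate(String term, int maxLength) {
        return term.substring(0, safeLength(term, maxLength));
    }

    public static void main(String[] args) {
        System.out.println(truncate("lucene", 3));           // prints "luc"
        String withEmoji = "ab\uD83D\uDE00cd";               // "ab" + U+1F600 + "cd"
        System.out.println(truncate(withEmoji, 3).length()); // prints 2
        System.out.println(truncate(withEmoji, 4).length()); // prints 4
    }
}
```

If you want this behaviour in the filter, CharTermAttribute is a
CharSequence, so something like termAtt.setLength(safeLength(termAtt,
maxLength)) should slot into incrementToken(). I have not needed it myself,
so treat it as a suggestion rather than tested Lucene code.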

Cheers,
Geoff


On Tue, Nov 13, 2012 at 7:54 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> There's nothing in Solr that I know of that does this. It would be a pretty
> easy custom filter to create though....
>
> FWIW,
> Erick
>
>
> On Tue, Nov 13, 2012 at 7:02 AM, Robert Muir <rcm...@gmail.com> wrote:
>
> > On Mon, Nov 12, 2012 at 10:47 PM, Bernd Fehling
> > <bernd.fehl...@uni-bielefeld.de> wrote:
> > > By the way, why does the TrimFilter option updateOffset default to false?
> > > Is it just to keep it backwards compatible?
> > >
> >
> > In my opinion this option should be removed.
> >
> > TokenFilters shouldn't muck with offsets, for a lot of reasons, but
> > especially because it's too late to interact with any CharFilter.
> >
> > This is the tokenizer's job.
