We have a TruncateTokenFilter in lucene/analysis/common. :)
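
To make the difference concrete, here is a minimal plain-Java sketch of the two behaviours discussed below. It does not use the Lucene API: the class and method names (TokenLengthSketch, dropLong, truncate) are hypothetical and only mimic the idea. In Lucene itself the corresponding classes are LengthFilter and TruncateTokenFilter in org.apache.lucene.analysis.miscellaneous. The sample tokens are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; this does not use the Lucene API.
class TokenLengthSketch {

    // Mimics the upper bound of a length filter: tokens longer than maxLen are dropped.
    static List<String> dropLong(List<String> tokens, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (t.length() <= maxLen) {
                out.add(t);
            }
        }
        return out;
    }

    // Mimics a truncating filter: every token is cut to at most maxLen characters.
    static List<String> truncate(List<String> tokens, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.length() > maxLen ? t.substring(0, maxLen) : t);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical tokens: one ordinary word and two long sequences sharing a prefix.
        List<String> tokens = List.of("sequence", "GATTACAGATTACAAA", "GATTACAGATTACATT");
        System.out.println(dropLong(tokens, 8));  // long tokens disappear
        System.out.println(truncate(tokens, 8));  // long tokens collapse to the same prefix
    }
}
```

With truncation the long tokens stay searchable, but two different sequences that share their first 8 characters become the same term, which is the confusion Michael mentions.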

On Fri, Sep 23, 2022 at 4:39 PM Michael Sokolov <msoko...@gmail.com> wrote:

> I wonder if it would make sense to provide a TruncationFilter in
> addition to the LengthFilter. That way long tokens in source text
> could be better supported, albeit with some confusion if they share
> the same very long prefix...
>
> On Fri, Sep 23, 2022 at 9:56 AM Scott Guthery <sguth...@gmail.com> wrote:
> >
> > Thanks much, Adrian.  I hadn't realized that the size limit was on one
> > token in the text as opposed to being a limit on the length of the entire
> > text field.  I'm loading patents, so I suspect that the very long word is a
> > DNA sequence.
> >
> > Thanks also for your guidance with regard to setting maximums.
> >
> > Cheers, Scott
> >

-- 
Adrien
