We have a TruncateTokenFilter in lucene/analysis/common. :)

On Fri, Sep 23, 2022 at 4:39 PM Michael Sokolov <msoko...@gmail.com> wrote:
> I wonder if it would make sense to provide a TruncationFilter in
> addition to the LengthFilter. That way long tokens in source text
> could be better supported, albeit with some confusion if they share
> the same very long prefix...
>
> On Fri, Sep 23, 2022 at 9:56 AM Scott Guthery <sguth...@gmail.com> wrote:
> >
> > Thanks much, Adrian. I hadn't realized that the size limit was on one
> > token in the text as opposed to being a limit on the length of the entire
> > text field. I'm loading patents, so I suspect that the very long word is a
> > DNA sequence.
> >
> > Thanks also for your guidance with regard to setting maximums.
> >
> > Cheers, Scott

--
Adrien
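
P.S. To make the trade-off above concrete, here is a minimal plain-Java sketch. It does not call any Lucene API; the class name TokenLengthSketch, both method names, and the sample tokens are made up for illustration. It only mimics the two behaviours discussed in this thread: a length filter drops over-long tokens entirely, while truncation keeps them but can make tokens that share a long prefix indistinguishable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene code.
public class TokenLengthSketch {

    // Length-filter behaviour: drop tokens whose length is outside [min, max].
    public static List<String> dropByLength(List<String> tokens, int min, int max) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() >= min && token.length() <= max) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Truncation behaviour: keep every token, cut to at most maxLength chars.
    public static List<String> truncate(List<String> tokens, int maxLength) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.length() > maxLength ? token.substring(0, maxLength) : token);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two made-up "DNA-like" tokens that share an 8-character prefix.
        List<String> tokens = List.of("sequence", "ACGTACGTAAAA", "ACGTACGTCCCC");

        // The long tokens vanish, so they are not searchable at all.
        System.out.println(dropByLength(tokens, 1, 8)); // [sequence]

        // The long tokens survive, but collapse to the same term.
        System.out.println(truncate(tokens, 8)); // [sequence, ACGTACGT, ACGTACGT]
    }
}
```

In an actual analysis chain the corresponding pieces would be LengthFilter and TruncateTokenFilter; the sketch is only meant to show why truncated tokens with a common long prefix become the same term.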