Hi Scott,

There is no way to lift this limit. The assumption is that a user would
never type a 32kB keyword into a search bar, so indexing such long
keywords is wasteful. Some tokenizers, like StandardTokenizer, can be
configured to limit the length of the tokens they produce, and there is
also a LengthFilter that can be appended to the analysis chain to filter
out tokens that exceed the maximum term length.
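For illustration, here is one way to wire this up with CustomAnalyzer (a
sketch, assuming lucene-analysis-common is on the classpath; the 255-char
cutoff is just an example value, not a recommendation):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.miscellaneous.LengthFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;

public class BoundedAnalyzer {
    public static Analyzer build() throws Exception {
        return CustomAnalyzer.builder()
            // StandardTokenizer caps token length itself; overlong
            // input is split at this boundary (example value: 255).
            .withTokenizer(StandardTokenizerFactory.class,
                "maxTokenLength", "255")
            // LengthFilter then drops any token outside [1, 255],
            // so nothing oversized ever reaches the index.
            .addTokenFilter(LengthFilterFactory.class,
                "min", "1", "max", "255")
            .build();
    }
}
```

Note the difference between the two: StandardTokenizer's maxTokenLength
splits overlong runs into multiple tokens, while LengthFilter discards
tokens outright, so you may want one or the other depending on whether
truncated matches are acceptable.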

I would note that modifying the source code is going to require more than
bumping the hardcoded limit, as we rely on this limit in a few places,
e.g. ByteBlockPool.

On Fri, Sep 23, 2022 at 12:59 AM Scott Guthery <sguth...@gmail.com> wrote:

> Lucene 9.3 seems to have a (post-Analyzer) maximum field length of 32767.
> Is there a way of increasing this without resorting to the source code?
>
> Thanks for any guidance.
>
> Cheers, Scott
>


-- 
Adrien
