I opened an issue for this one
(https://github.com/apache/lucene/issues/13373). Please feel free to edit or
add more info to it.

Regards,
Sanjay

On Wed, May 15, 2024 at 8:07 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> Thanks Jerven, more response inlined below:
>
> On Tue, May 14, 2024 at 12:58 PM Jerven Tjalling Bolleman
> <jerven.bolleman@sib.swiss> wrote:
>
> > The index that had an issue when merging into one segment definitely had
> > more than 1 billion times the word "positional" in it. I hope to be able
> > to give a closer number once re-indexing finished with a "work-around".
> >
> > Of course the "work-around" is to just fix this correctly by not having
> > that word so often in the index and definitely not as docs, freqs and
> > postings.
> >
>
> To be clear, indexing a given token like "positional" (nice token btw) as
> many times as you like into a Lucene index, even force merging down to a
> single segment, is perfectly allowed, and it certainly should not throw an
> exception, let alone a cryptic one like this!  That's a valid use-case.
>
> So we really need to understand why you're even hitting an exception in the
> first place ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
