Thanks Jerven, more responses inline below:

On Tue, May 14, 2024 at 12:58 PM Jerven Tjalling Bolleman <jerven.bolleman@sib.swiss> wrote:

> The index that had an issue when merging into one segment definitely had
> more than 1 billion times the word "positional" in it. I hope to be able
> to give a closer number once re-indexing finished with a "work-around".
>
> Of course the "work-around" is to just fix this correctly by not having
> that word so often in the index and definitely not as docs, freqs and
> postings.

To be clear, indexing a given token like "positional" (nice token btw) as many times as you like into a Lucene index, even force merging down to a single segment, is perfectly allowed, and it certainly should not throw an exception, let alone a cryptic one like this! That's a valid use-case.

So we really need to understand why you're even hitting an exception in the first place ...

Mike McCandless

http://blog.mikemccandless.com