> So I'm actually still confused why this float[256] stands out in your
> measurements vs two long[128]'s. Maybe it's a profiler ghost?
Huh... that's a really good point. I'm going to spend a bit more time digging and see if I can reliably reproduce it on my own machine. So far I've only been comparing heap dumps from production hosts, so I'll try measuring in an environment where I can see what's going on.

On Tue, May 2, 2023 at 1:14 PM Robert Muir <rcm...@gmail.com> wrote:

> On Tue, May 2, 2023 at 3:24 PM Michael Froh <msf...@gmail.com> wrote:
> >
> > > This seems ok if it isn't invasive. I still feel like something is
> > > "off" if you are seeing GC time from 1KB-per-segment allocation. Do
> > > you have way too many segments?
> >
> > From what I saw, it's 1KB per "leaf query" to create the BM25Scorer
> > instance (at the Weight level), but then that BM25Scorer is shared across
> > all scorer (DISI) instances for all segments. So it doesn't scale with
> > segment count. It looks like the old logic used to allocate a SimScorer per
> > segment, so this is a big improvement in that regard (for scoring clauses,
> > since the non-scoring clauses had a super-lightweight SimScorer).
> >
> > In this particular case, they're running these gnarly machine-generated
> > BooleanQuery trees with at least 512 non-scoring TermQuery clauses (across a
> > bunch of different fields, so TermInSetQuery isn't an option). From what I
> > can see, each of those TermQueries produces a TermWeight that holds a
> > BM25Scorer that holds yet another instance of this float[256] array, for
> > 512KB+ of these caches per running query. It's definitely only going to be
> > an issue for folks who are flying close to the max clause count.
> >
>
> Yeah, but the same situation could be said for buffers like this one:
>
> https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90PostingsReader.java#L311-L312
>
> So I'm actually still confused why this float[256] stands out in your
> measurements vs two long[128]'s. Maybe it's a profiler ghost?
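For reference, the array sizes being compared in this thread can be worked out directly. The following is only a rough sketch: it counts element payload bytes and ignores array object headers and alignment, which are JVM-dependent. The class and method names (`CacheFootprint`, `floatArrayBytes`, `longArrayBytes`) are made up for illustration and are not part of Lucene.

```java
public class CacheFootprint {
    // Payload bytes of a float[] of the given length (array header excluded).
    public static long floatArrayBytes(int length) {
        return (long) length * Float.BYTES;
    }

    // Payload bytes of a long[] of the given length (array header excluded).
    public static long longArrayBytes(int length) {
        return (long) length * Long.BYTES;
    }

    public static void main(String[] args) {
        // The float[256] cache held by each BM25Scorer, as described in the thread.
        long normCache = floatArrayBytes(256);
        // The "two long[128]'s" mentioned for the postings reader buffers.
        long postingsBuffers = 2 * longArrayBytes(128);
        // Clause count taken from the thread ("at least 512" TermQuery clauses).
        int clauses = 512;

        System.out.println("float[256]:      " + normCache + " bytes");       // 1024
        System.out.println("2 x long[128]:   " + postingsBuffers + " bytes"); // 2048
        System.out.println("512 x float[256]: " + (clauses * normCache / 1024) + " KB"); // 512
    }
}
```

By this count the two long[128] buffers take twice the space of one float[256], which is consistent with the question of why the float[256] would stand out in the heap dumps.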