Hi Lucene community,

At LinkedIn, we use Lucene in some important search applications. We recently found problems with GC pressure and memory footprint in one of them. We took a heap dump and analyzed it with JXRay (https://jxray.com). Unfortunately, we cannot share the entire JXRay analysis due to security restrictions, but we can share one excerpt from it, see below. It comes from section 11 of the JXRay report, “Bad Primitive Arrays”, which shows how much memory is wasted on empty or under-utilized primitive arrays. That section says that nearly 4G of memory (25.6% of the used heap) is wasted, and it turns out that most of that is due to byte[] arrays managed by the SegmentTermsEnumFrame class.
[inline screenshot: excerpt from section 11 of the JXRay report, “Bad Primitive Arrays”, showing the byte[] arrays referenced by SegmentTermsEnumFrame]

To clarify the screenshot above: 80% of all arrays pointed to by the suffixBytes field are just empty, i.e. contain only zeroes, which most likely means they have never been used. Another 3% are “trail-0s”, i.e. more than half of their trailing elements are zero, so they were only partially utilized. That leaves only 17% of these arrays that have been utilized more or less fully. The same is true for all the other byte[] arrays managed by SegmentTermsEnumFrame. Note that other sections of the heap dump make it clear that the majority of these objects are garbage, i.e. they have already been used and discarded. Thus, at least 80% of the memory allocated for these byte[] arrays was never used and was wasted.

From separate memory allocation profiling, we estimated that these arrays are responsible for ~2G/sec of memory allocation. If they were allocated lazily rather than eagerly, i.e. only right before they are actually used, we could potentially reduce their share of the allocation rate from 2G/sec to (1 - 0.8) * 2 = 0.4G/sec.

A switch from eager to lazy allocation of a data structure is usually easy to implement. Let’s take a quick look at the source code<https://fossies.org/linux/www/lucene-10.3.2-src.tgz/lucene-10.3.2/lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene90/blocktree/SegmentTermsEnumFrame.java>. The suffixBytes array is used with the following pattern:

// Eager construction with hardcoded size
byte[] suffixBytes = new byte[128];
…
// Fast forward to the loadBlock() method
…
if (suffixBytes.length < numSuffixBytes) {
  // If we need to read more than 128 bytes, increase the array…
  // … or more precisely, throw away the old array and allocate another one
  suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
}

From this code, it’s clear that two negative things can happen:

1. suffixBytes may not be used at all (the loadBlock() method may not be called, or may return early). The memory used by the array is then completely wasted.
2. If numSuffixBytes happens to be greater than 128, the eagerly allocated array is thrown away. The memory used by it is wasted.

And as our heap dump illustrates, these things likely happen very often. To address this problem, it would be sufficient to change the code as follows:

// Avoid eager construction
byte[] suffixBytes;
…
if (suffixBytes == null || suffixBytes.length < numSuffixBytes) {
  // Allocate the array lazily, the first time it is actually needed…
  // … or throw away the old array and allocate a bigger one if it is too small
  suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
}

Note that reducing the memory allocation rate primarily results in reduced CPU usage and/or improved latency. That’s because each object allocation requires work from the JVM - updating pointers and setting all object bytes to zero. GCing these objects afterwards is also CPU-intensive and pauses application threads, which affects latency. However, once the memory allocation rate is reduced, it may also become possible to shrink the JVM heap. So the ultimate win is in both CPU and memory.

Please let us know how we can proceed with this. The proposed change is trivial, so maybe it can be done quickly by an established Lucene contributor. If not, I guess I can make it myself and then hope that it goes through review and release in reasonable time.

Misha
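[Editor's note] For concreteness, below is a minimal, self-contained sketch of the lazy-allocation pattern proposed above. It is not a patch against SegmentTermsEnumFrame itself: the class and method names (LazySuffixBuffer, ensureCapacity) are made up for illustration, and the only Lucene API it relies on is ArrayUtil.oversize.

import org.apache.lucene.util.ArrayUtil;

// Illustration only: the buffer stays null until data actually has to be
// copied into it, mirroring the change proposed for suffixBytes.
class LazySuffixBuffer {

  private byte[] suffixBytes; // not allocated up front

  // Called only from the code path that actually reads suffix bytes
  // (the analogue of loadBlock() in SegmentTermsEnumFrame).
  byte[] ensureCapacity(int numSuffixBytes) {
    if (suffixBytes == null || suffixBytes.length < numSuffixBytes) {
      // Allocate (or grow) the array only once the required size is known
      suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, Byte.BYTES)];
    }
    return suffixBytes;
  }
}

With this shape, a frame that never loads a block never allocates any array at all, which the heap dump suggests is the common case.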
