Hi Lucene community,

At LinkedIn, we use Lucene in some important search applications. We recently found problems with GC and memory footprint in one of them. We took a heap dump and analyzed it with JXRay (https://jxray.com). Unfortunately, we cannot share the entire JXRay analysis due to security restrictions, but we can share one excerpt from it, see below. It comes from section 11 of the JXRay report, “Bad Primitive Arrays”, which reports how much memory is wasted on empty or under-utilized primitive arrays. That section says that nearly 4G of memory (25.6% of the used heap) is wasted, and it turns out that most of that waste comes from byte[] arrays managed by the SegmentTermsEnumFrame class.

[Screenshot: excerpt from the JXRay “Bad Primitive Arrays” section]

To clarify the screenshot above: 80% of all arrays pointed to by the suffixBytes field are empty, i.e. contain only zeroes, which likely means they have never been used. A further 3% are “trail-0s”, i.e. more than half of their trailing elements are zero, so they were only partially utilized. Only the remaining 17% of these arrays have been utilized more or less fully. The same pattern holds for all other byte[] arrays managed by SegmentTermsEnumFrame. Note that from other sections of the heap dump it’s clear that the majority of these objects are garbage, i.e. they have already been used and discarded. Thus, at least 80% of the memory allocated for these byte[] arrays has never been used and was wasted. From separate memory allocation profiling, we estimated that these arrays are responsible for ~2G/sec of allocation. If they were allocated lazily rather than eagerly, i.e. only right before they are actually used, we could potentially reduce their share of the allocation rate from 2G/sec to (1 - 0.8) * 2 = 0.4 G/sec.

A switch from eager to lazy allocation of a data structure is usually easy to implement. Let’s take a quick look at the source code <https://fossies.org/linux/www/lucene-10.3.2-src.tgz/lucene-10.3.2/lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene90/blocktree/SegmentTermsEnumFrame.java>. The suffixBytes array is used with the following pattern:

// Eager construction with hardcoded size
byte[] suffixBytes = new byte[128];

…  // Fast forward to the loadBlock() method
…
if (suffixBytes.length < numSuffixBytes) {
  // If we need to read more than 128 bytes, increase the array…
  // … or more precisely, throw away the old array and allocate another one
  suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
}
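
(As an aside, ArrayUtil.oversize is Lucene’s own utility from org.apache.lucene.util: given a minimum target size and the element size in bytes, it returns a size at least as large as requested, with some extra headroom so that a sequence of growing requests doesn’t reallocate every time. The value below is illustrative; the exact result depends on the requested size and the JVM:)

// ArrayUtil.oversize(minTargetSize, bytesPerElement) returns a slightly
// larger size to amortize future regrowth
int newSize = ArrayUtil.oversize(200, 1); // some value >= 200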

From this code, it’s clear that two negative things can happen:

  1. suffixBytes may not be used at all (the loadBlock() method may not be called, or may return early). The memory used by the array is then completely wasted.
  2. If numSuffixBytes happens to be greater than 128, the eagerly allocated array is discarded without ever being read. The memory used by it is wasted as well.

And as our heap dump illustrates, these things likely happen very often. To 
address this problem, it would be sufficient to change the code as follows:

// Avoid eager construction
byte[] suffixBytes;
…
if (suffixBytes == null || suffixBytes.length < numSuffixBytes) {
  // Allocate the array only when it is actually needed, and grow it
  // (by replacing it) if the current one is too small
  suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
}
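
For completeness, here is a self-contained sketch of the same lazy pattern (the LazyFrame class and its fill() method are hypothetical, for illustration only; ArrayUtil.oversize is the real Lucene utility):

import org.apache.lucene.util.ArrayUtil;

// Hypothetical illustration of lazy allocation, not actual Lucene code
class LazyFrame {
  private byte[] suffixBytes; // stays null until the first real use

  void fill(int numSuffixBytes) {
    if (suffixBytes == null || suffixBytes.length < numSuffixBytes) {
      // Allocated (or replaced) only on the code path that actually reads
      // data, so frames that never get here cost no array at all
      suffixBytes = new byte[ArrayUtil.oversize(numSuffixBytes, 1)];
    }
    // ... read numSuffixBytes bytes into suffixBytes ...
  }
}

The only behavioral difference from the eager version is that a frame that never loads a block never pays for the array.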

Note that reducing the memory allocation rate primarily reduces CPU usage and/or improves latency. That’s because each object allocation requires work from the JVM: bumping allocation pointers and zeroing all of the object’s bytes. GCing these objects is also CPU-intensive, and it pauses application threads, which hurts latency. Furthermore, once the allocation rate is reduced, it may become possible to shrink the JVM heap as well. So the ultimate win is in both CPU and memory.
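
To see the effect of such a change directly, one can measure per-thread allocation with the JDK’s com.sun.management.ThreadMXBean (a rough sketch using real JDK APIs, available on HotSpot JVMs, but not a rigorous benchmark; the frame count of 100_000 is arbitrary):

import java.lang.management.ManagementFactory;

public class AllocDemo {
  public static void main(String[] args) {
    com.sun.management.ThreadMXBean bean =
        (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
    long tid = Thread.currentThread().getId();

    long before = bean.getThreadAllocatedBytes(tid);
    byte[][] frames = new byte[100_000][];
    for (int i = 0; i < frames.length; i++) {
      frames[i] = new byte[128]; // eager: paid for every frame, used or not
    }
    long after = bean.getThreadAllocatedBytes(tid);
    // With lazy allocation, frames that never load a block allocate nothing
    System.out.println((after - before) + " bytes allocated for "
        + frames.length + " eager frames");
  }
}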

Please let us know how we can proceed with this. The proposed change is trivial, so perhaps it can be done quickly by an established Lucene contributor. If not, I guess I can make it myself and then hope that it goes through review and release in a reasonable time.

Misha
