markrmiller commented on code in PR #4514:
URL: https://github.com/apache/solr/pull/4514#discussion_r3470082135
##########
solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java:
##########
@@ -17,75 +17,145 @@
package org.apache.solr.util.circuitbreaker;
-import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
+import java.lang.management.MemoryPoolMXBean;
+import java.lang.management.MemoryType;
+import java.lang.management.MemoryUsage;
+import java.util.ArrayList;
+import java.util.List;
import java.util.Locale;
-import org.apache.solr.util.RefCounted;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.regex.Pattern;
+import org.apache.solr.common.util.EnvUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
- * Tracks the current JVM heap usage and triggers if a moving heap usage average over 30 seconds
- * exceeds the defined percentage of the maximum heap size allocated to the JVM. Once the average
- * memory usage goes below the threshold, it will start allowing queries again.
+ * Trips when post-collection live data in the JVM heap exceeds a configured percentage of the
+ * maximum heap size.
*
- * <p>The memory threshold is defined as a percentage of the maximum memory allocated -- see
- * memThreshold in <code>solrconfig.xml</code>.
+ * <p>The signal is read from {@link MemoryPoolMXBean#getCollectionUsage()} on the old/tenured heap
+ * pool, which reports memory usage immediately after the most recent collection that affected that
+ * pool. This is the only memory reading that distinguishes "live data" from "garbage waiting to be
+ * collected."
+ *
+ * <p>Earlier versions of this breaker sampled {@link MemoryMXBean#getHeapMemoryUsage()} on a
+ * 30-second moving average, which produced a high signal during normal operation: with a
+ * generational collector, {@code used} climbs toward {@code max} between collections — that's the
+ * steady-state shape, not a problem. The new signal updates only when an old-gen GC runs, which is
+ * the only point at which "how full is the heap really?" has a defined answer.
+ *
+ * <p>Pool selection by collector:
+ *
+ * <ul>
+ * <li><b>G1 / Parallel / Serial / generational ZGC:</b> uses the pool whose name matches the
+ * word-boundary pattern {@code \b(Old|Tenured)\b}.
+ * <li><b>Non-generational ZGC and Shenandoah:</b> single combined heap pool — the breaker sums
+ * {@code getCollectionUsage()} across every {@link MemoryType#HEAP} pool instead.
+ * </ul>
+ *
+ * <p>Pre-first-GC, {@link MemoryPoolMXBean#getCollectionUsage()} can return {@code null} on every
+ * pool; in that case the breaker reports {@code 0} live bytes and will not trip until the JVM has
+ * performed at least one collection on a heap pool.
+ *
+ * <p>The threshold semantics are unchanged: configure a percentage of the maximum heap size, and
Review Comment:
Same response as above.
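
For anyone following this hunk without the rest of the PR open, the pool selection described in the Javadoc can be sketched roughly as below. This is an illustration only, not the PR's code: the class name `LiveHeapSketch` and the methods `liveBytes` and `isTripped` are hypothetical, and the assumption that a threshold is compared as `live > max * percent / 100` is mine. Only the `java.lang.management` calls and the `\b(Old|Tenured)\b` pattern come from the quoted text.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; names are illustrative and not taken from the PR.
class LiveHeapSketch {
  // Pattern quoted in the Javadoc for old/tenured pool names.
  private static final Pattern OLD_GEN = Pattern.compile("\\b(Old|Tenured)\\b");

  /**
   * Post-collection bytes in the old/tenured pool(s) if any exist; otherwise
   * the sum over every HEAP pool (the single-pool collector case).
   */
  public static long liveBytes(List<MemoryPoolMXBean> pools) {
    long oldGen = 0L;
    long allHeap = 0L;
    boolean foundOld = false;
    for (MemoryPoolMXBean pool : pools) {
      if (pool.getType() != MemoryType.HEAP) {
        continue; // skip metaspace, code cache, and other non-heap pools
      }
      // getCollectionUsage() may be null (unsupported, or no GC yet per the Javadoc).
      MemoryUsage usage = pool.getCollectionUsage();
      long used = (usage == null) ? 0L : usage.getUsed();
      allHeap += used;
      if (OLD_GEN.matcher(pool.getName()).find()) {
        foundOld = true;
        oldGen += used;
      }
    }
    return foundOld ? oldGen : allHeap;
  }

  /** Assumed comparison: trip when live bytes exceed the given percentage of max heap. */
  public static boolean isTripped(long liveBytes, long maxHeapBytes, double thresholdPercent) {
    if (maxHeapBytes <= 0) {
      return false; // MemoryUsage.getMax() returns -1 when the maximum is undefined
    }
    return liveBytes > maxHeapBytes * (thresholdPercent / 100.0);
  }

  public static void main(String[] args) {
    long live = liveBytes(ManagementFactory.getMemoryPoolMXBeans());
    long max = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    System.out.println("live=" + live + " max=" + max
        + " tripped@80%=" + isTripped(live, max, 80.0));
  }
}
```

On a freshly started JVM this would typically report zero live bytes, which matches the "will not trip until the JVM has performed at least one collection" behavior the Javadoc describes.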
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]