sigram commented on a change in pull request #1626:
URL: https://github.com/apache/lucene-solr/pull/1626#discussion_r446908349

##########
File path: solr/contrib/prometheus-exporter/src/test-files/solr/collection1/conf/solrconfig.xml
##########
@@ -83,6 +83,10 @@
     <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
+    <useCircuitBreakers>false</useCircuitBreakers>
+
+    <memoryCircuitBreakerThreshold>100</memoryCircuitBreakerThreshold>

Review comment:
This is probably not needed in configs that don't actually use it (when useCircuitBreakers=false)? Also, to make it more future-proof, we could put these in a section - the expectation is that we will have at least one more (CPU breaker) and potentially other ones too, so instead of adding these new breakers as new elements at this level we could add them as a section (as sub-elements).

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
##########
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Manages all registered circuit breaker instances. Responsible for a holistic view
+ * of whether a circuit breaker has tripped or not.
+ *
+ * There are two typical ways of using this class's instance:
+ * 1. Check if any circuit breaker has triggered -- and know which circuit breaker has triggered.
+ * 2. Get an instance of a specific circuit breaker and perform checks.
+ *

Review comment:
The following probably belongs to the SIP ... but the way I think about the common usage of this class for different code-paths is if breaker configs are labeled and correspond to different code-paths, eg.:
* "query" -> one config
* "index" -> another config
* "foobar" -> yet another config, used perhaps in my custom component

Current implementation limits us to use the same config for potentially very different code paths.

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
##########
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Manages all registered circuit breaker instances. Responsible for a holistic view
+ * of whether a circuit breaker has tripped or not.
+ *
+ * There are two typical ways of using this class's instance:
+ * 1. Check if any circuit breaker has triggered -- and know which circuit breaker has triggered.
+ * 2. Get an instance of a specific circuit breaker and perform checks.
+ *
+ * It is a good practice to register new circuit breakers here if you want them checked for every
+ * request.
+ *
+ * NOTE: The current way of registering new default circuit breakers is minimal and not a long term
+ * solution. There will be a follow up with a SIP for a schema API design.
+ */
+public class CircuitBreakerManager {
+
+  private final Map<CircuitBreakerType, CircuitBreaker> circuitBreakerMap = new HashMap<>();
+
+  // Allows replacing of existing circuit breaker
+  public void registerCircuitBreaker(CircuitBreakerType circuitBreakerType, CircuitBreaker circuitBreaker) {
+    circuitBreakerMap.put(circuitBreakerType, circuitBreaker);
+  }
+
+  public CircuitBreaker getCircuitBreaker(CircuitBreakerType circuitBreakerType) {
+    assert circuitBreakerType != null;
+
+    return circuitBreakerMap.get(circuitBreakerType);
+  }
+
+  /**
+   * Check if any circuit breaker has triggered.
+   * @return CircuitBreakers which have triggered, null otherwise
+   */
+  public Map<CircuitBreakerType, CircuitBreaker> checkAllCircuitBreakersAndReturnTrippedBreakers() {

Review comment:
OMG, what a name :) Maybe just `checkTrippedBreakers` ?

##########
File path: solr/core/src/java/org/apache/solr/core/SolrConfig.java
##########
@@ -804,6 +813,14 @@ private void initLibs(SolrResourceLoader loader, boolean isConfigsetTrusted) {
     loader.reloadLuceneSPI();
   }
+  private void validateMemoryBreakerThreshold() {
+    if (useCircuitBreakers) {
+      if (memoryCircuitBreakerThreshold > 100 || memoryCircuitBreakerThreshold < 0) {
+        throw new IllegalArgumentException("memoryCircuitBreakerThreshold is not a valid percentage");

Review comment:
I think 0 also doesn't make much sense.
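
To illustrate the point about 0, here is a minimal sketch of the bounds check with the lower bound made exclusive. The class name `ThresholdCheck` and the two-argument signature are hypothetical, chosen only so the snippet compiles on its own; in the PR the two values are fields on `SolrConfig` and the method takes no arguments.

```java
// Hypothetical standalone version of the check; in the PR this logic lives in SolrConfig.
final class ThresholdCheck {
  private ThresholdCheck() {}

  /**
   * Rejects thresholds outside (0, 100] when circuit breakers are enabled.
   * A threshold of 0 cannot produce a usable heap limit, so it is treated as invalid.
   */
  static void validateMemoryBreakerThreshold(boolean useCircuitBreakers, int thresholdPercent) {
    if (!useCircuitBreakers) {
      return; // nothing to validate when breakers are globally disabled
    }
    if (thresholdPercent <= 0 || thresholdPercent > 100) {
      throw new IllegalArgumentException(
          "memoryCircuitBreakerThreshold must be greater than 0 and at most 100, got " + thresholdPercent);
    }
  }
}
```

Including the offending value in the message is an addition of this sketch, not something the PR does.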

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.management.ManagementFactory;
+import java.lang.management.MemoryMXBean;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Tracks the current JVM heap usage and triggers if it exceeds the defined percentage of the maximum
+ * heap size allocated to the JVM. This circuit breaker is a part of the default CircuitBreakerManager
+ * so is checked for every request -- hence it is realtime. Once the memory usage goes below the threshold,
+ * it will start allowing queries again.
+ *
+ * The memory threshold is defined as a percentage of the maximum memory allocated -- see memoryCircuitBreakerThreshold
+ * in solrconfig.xml
+ */
+
+public class MemoryCircuitBreaker extends CircuitBreaker {
+  private static final MemoryMXBean MEMORY_MX_BEAN = ManagementFactory.getMemoryMXBean();
+
+  private final long currentMaxHeap = MEMORY_MX_BEAN.getHeapMemoryUsage().getMax();
+
+  // Assumption -- the value of these parameters will be set correctly before invoking printDebugInfo()
+  private ThreadLocal<Long> seenMemory = new ThreadLocal<>();
+  private ThreadLocal<Long> allowedMemory = new ThreadLocal<>();
+
+  public MemoryCircuitBreaker(SolrCore solrCore) {
+    super(solrCore);
+
+    if (currentMaxHeap <= 0) {
+      throw new IllegalArgumentException("Invalid JVM state for the max heap usage");
+    }
+  }
+
+  // TODO: An optimization can be to trip the circuit breaker for a duration of time
+  // after the circuit breaker condition is matched. This will optimize for per call
+  // overhead of calculating the condition parameters but can result in false positives.
+  @Override
+  public boolean isCircuitBreakerGauntletTripped() {
+    if (!isCircuitBreakerEnabled()) {
+      return false;
+    }
+
+    allowedMemory.set(getCurrentMemoryThreshold());
+
+    seenMemory.set(calculateLiveMemoryUsage());
+
+    return (seenMemory.get() >= allowedMemory.get());
+  }
+
+  @Override
+  public String printDebugInfo() {
+    return "seenMemory=" + seenMemory.get() + " allowedMemory=" + allowedMemory.get();
+  }
+
+  private long getCurrentMemoryThreshold() {
+    int thresholdValueInPercentage = solrCore.getSolrConfig().memoryCircuitBreakerThreshold;
+    double thresholdInFraction = thresholdValueInPercentage / (double) 100;

Review comment:
This can be calculated once in the constructor - IIRC if SolrConfig is updated the core is reloaded anyway, which will construct the breaker once again.
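
A rough sketch of what computing the limit once could look like, assuming (as the comment above recalls, without verifying it) that a config change reloads the core and therefore rebuilds the breaker. `HeapLimit`, `forCurrentJvm`, `limitBytes` and `isExceededBy` are hypothetical names for illustration, not identifiers from the PR; only the `MemoryMXBean` call is the real JDK API.

```java
// Hypothetical standalone sketch; in the PR the equivalent state would sit in MemoryCircuitBreaker.
final class HeapLimit {
  // Derived once at construction instead of on every request.
  private final long limitBytes;

  HeapLimit(long maxHeapBytes, int thresholdPercent) {
    if (maxHeapBytes <= 0) {
      throw new IllegalArgumentException("Invalid JVM state for the max heap usage");
    }
    if (thresholdPercent <= 0 || thresholdPercent > 100) {
      throw new IllegalArgumentException("threshold must be greater than 0 and at most 100");
    }
    this.limitBytes = (long) (maxHeapBytes * (thresholdPercent / 100.0));
  }

  // Reads the max heap from the running JVM; getMax() returns -1 when it is undefined,
  // which the constructor above rejects.
  static HeapLimit forCurrentJvm(int thresholdPercent) {
    long maxHeap = java.lang.management.ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    return new HeapLimit(maxHeap, thresholdPercent);
  }

  long limitBytes() {
    return limitBytes;
  }

  // Same comparison as the PR: usage at or above the limit counts as tripped.
  boolean isExceededBy(long usedBytes) {
    return usedBytes >= limitBytes;
  }
}
```

With a 5 GiB max heap and a threshold of 75, the limit works out to 3.75 GiB, and the per-request path is reduced to one comparison.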

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
##########
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Default class to define circuit breakers for Solr.
+ *
+ * There are two (typical) ways to use circuit breakers:
+ * 1. Have them checked at admission control by default (use CircuitBreakerManager for the same)
+ * 2. Use the circuit breaker in a specific code path(s)
+ *
+ * TODO: This class should be grown as the scope of circuit breakers grow.
+ */
+public abstract class CircuitBreaker {
+  public static final String NAME = "circuitbreaker";
+
+  protected final SolrCore solrCore;
+
+  public CircuitBreaker(SolrCore solrCore) {
+    this.solrCore = solrCore;
+  }
+
+  // Global config for all circuit breakers. For specific circuit breaker configs, define
+  // your own config
+  protected boolean isCircuitBreakerEnabled() {

Review comment:
Maybe `isEnabled` ?

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
##########
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Default class to define circuit breakers for Solr.
+ *
+ * There are two (typical) ways to use circuit breakers:
+ * 1. Have them checked at admission control by default (use CircuitBreakerManager for the same)
+ * 2. Use the circuit breaker in a specific code path(s)
+ *
+ * TODO: This class should be grown as the scope of circuit breakers grow.
+ */
+public abstract class CircuitBreaker {
+  public static final String NAME = "circuitbreaker";
+
+  protected final SolrCore solrCore;
+
+  public CircuitBreaker(SolrCore solrCore) {
+    this.solrCore = solrCore;
+  }
+
+  // Global config for all circuit breakers. For specific circuit breaker configs, define
+  // your own config
+  protected boolean isCircuitBreakerEnabled() {
+    return solrCore.getSolrConfig().useCircuitBreakers;
+  }
+
+  /**
+   * Check if this allocation will trigger circuit breaker.
+   */
+  public abstract boolean isCircuitBreakerGauntletTripped();
+
+  /**
+   * Print debug useful info
+   */
+  public abstract String printDebugInfo();

Review comment:
This doesn't actually print anything, maybe name it `getDebugInfo` ?

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreaker.java
##########
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Default class to define circuit breakers for Solr.
+ *
+ * There are two (typical) ways to use circuit breakers:
+ * 1. Have them checked at admission control by default (use CircuitBreakerManager for the same)
+ * 2. Use the circuit breaker in a specific code path(s)
+ *
+ * TODO: This class should be grown as the scope of circuit breakers grow.
+ */
+public abstract class CircuitBreaker {
+  public static final String NAME = "circuitbreaker";
+
+  protected final SolrCore solrCore;
+
+  public CircuitBreaker(SolrCore solrCore) {
+    this.solrCore = solrCore;
+  }
+
+  // Global config for all circuit breakers. For specific circuit breaker configs, define
+  // your own config
+  protected boolean isCircuitBreakerEnabled() {
+    return solrCore.getSolrConfig().useCircuitBreakers;
+  }
+
+  /**
+   * Check if this allocation will trigger circuit breaker.
+   */
+  public abstract boolean isCircuitBreakerGauntletTripped();

Review comment:
Maybe `isTripped` ?

##########
File path: solr/solr-ref-guide/src/index.adoc
##########
@@ -121,6 +122,8 @@ The *<<getting-started.adoc#getting-started,Getting Started>>* section guides yo
 *<<solrcloud.adoc#solrcloud,SolrCloud>>*: This section describes SolrCloud, which provides comprehensive distributed capabilities.
 *<<legacy-scaling-and-distribution.adoc#legacy-scaling-and-distribution,Legacy Scaling and Distribution>>*: This section tells you how to grow a Solr distribution by dividing a large index into sections called shards, which are then distributed across multiple servers, or by replicating a single index across multiple services.
+
+*<<circuit-breakers.adoc#circuit-breakers,Circuit Breakers>>*: This section talks about circuit breakers, a comprehensive way of allowing a higher stability of Solr nodes and predictability of request execution.

Review comment:
Erhm .. "a comprehensive way" it is not, but it is certainly "a way" :) I would also argue that it doesn't increase predictability of request execution because it introduces a new failure mode that clients have to handle - rather it's a way to ensure the service-level guarantees for requests that are accepted for execution.

##########
File path: solr/solr-ref-guide/src/circuit-breakers.adoc
##########
@@ -0,0 +1,81 @@
+= Circuit Breakers
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr's circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The
+premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current
+resource configuration.
+
+== When To Use Circuit Breakers
+Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers
+are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503).
+
+It is upto the client to handle the same and potentially build a retrial logic as this should ideally be a transient situation.
+
+== Types Of Circuit Breakers
+Circuit breakers can be of two types:
+
+=== Admission Control Checks
+
+Circuit breakers that are checked at admission control (request handlers). These circuit breakers are typically attached to a set
+of requests that check them before proceeding with the request. Example is JVM heap usage based circuit breaker (described below).
+
+For these type of circuit breakers, it is a good idea to register them with CircuitBreakerManager
+(org.apache.solr.util.circuitbreaker.CircuitBreakerManager) to allow a holistic check at the required admission control point.
+
+=== Custom Events/Code Paths Checks
+
+Circuit breakers that are needed only in special events or code paths.
+
+
+== Circuit Breaker Configurations
+The following flag controls the global activation/deactivation of circuit breakers. If this flag is disabled, all circuit breakers
+will be disabled globally. Per circuit breaker configurations are specified in their respective sections later.
+
+[source,xml]
+----
+<useCircuitBreakers>false</useCircuitBreakers>
+----
+
+== Currently Supported Circuit Breakers
+
+=== JVM Heap Usage Based Circuit Breaker
+This circuit breaker tracks JVM heap memory usage and rejects incoming search requests with a 503 error code if the heap usage
+exceeds a configured percentage of maximum heap allocated to the JVM (-XMax). The main configuration for this circuit breaker is
+controlling the threshold percentage at which the JVM will trip.

Review comment:
"the JVM" -> "the breaker".

##########
File path: solr/solr-ref-guide/src/circuit-breakers.adoc
##########
@@ -0,0 +1,81 @@
+= Circuit Breakers
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr's circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The
+premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current
+resource configuration.
+
+== When To Use Circuit Breakers
+Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers
+are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503).
+
+It is upto the client to handle the same and potentially build a retrial logic as this should ideally be a transient situation.
+
+== Types Of Circuit Breakers
+Circuit breakers can be of two types:
+
+=== Admission Control Checks
+
+Circuit breakers that are checked at admission control (request handlers). These circuit breakers are typically attached to a set
+of requests that check them before proceeding with the request. Example is JVM heap usage based circuit breaker (described below).
+
+For these type of circuit breakers, it is a good idea to register them with CircuitBreakerManager
+(org.apache.solr.util.circuitbreaker.CircuitBreakerManager) to allow a holistic check at the required admission control point.
+
+=== Custom Events/Code Paths Checks
+
+Circuit breakers that are needed only in special events or code paths.
+
+
+== Circuit Breaker Configurations
+The following flag controls the global activation/deactivation of circuit breakers. If this flag is disabled, all circuit breakers
+will be disabled globally. Per circuit breaker configurations are specified in their respective sections later.
+
+[source,xml]
+----
+<useCircuitBreakers>false</useCircuitBreakers>
+----
+
+== Currently Supported Circuit Breakers
+
+=== JVM Heap Usage Based Circuit Breaker
+This circuit breaker tracks JVM heap memory usage and rejects incoming search requests with a 503 error code if the heap usage
+exceeds a configured percentage of maximum heap allocated to the JVM (-XMax). The main configuration for this circuit breaker is
+controlling the threshold percentage at which the JVM will trip.
+
+[source,xml]
+----
+<memoryCircuitBreakerThreshold>75</memoryCircuitBreakerThreshold>
+----
+
+Consider the following example:
+
+JVM has been allocated a maximum heap of 5GB (-XMax) and memoryCircuitBreakerThreshold is set to 75. In this scenario, the heap usage
+at which the circuit breaker will trip is 3.75GB.
+
+Note that this circuit breaker is checked for each incoming search request and considers the live state of the node i.e every search

Review comment:
"Live state" in SolrCloud terminology is a loaded term ... perhaps it's better to simply use the "current heap usage".

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
##########
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Manages all registered circuit breaker instances. Responsible for a holistic view
+ * of whether a circuit breaker has tripped or not.
+ *
+ * There are two typical ways of using this class's instance:
+ * 1. Check if any circuit breaker has triggered -- and know which circuit breaker has triggered.
+ * 2. Get an instance of a specific circuit breaker and perform checks.
+ *
+ * It is a good practice to register new circuit breakers here if you want them checked for every
+ * request.
+ *
+ * NOTE: The current way of registering new default circuit breakers is minimal and not a long term
+ * solution. There will be a follow up with a SIP for a schema API design.
+ */
+public class CircuitBreakerManager {
+
+  private final Map<CircuitBreakerType, CircuitBreaker> circuitBreakerMap = new HashMap<>();
+
+  // Allows replacing of existing circuit breaker
+  public void registerCircuitBreaker(CircuitBreakerType circuitBreakerType, CircuitBreaker circuitBreaker) {
+    circuitBreakerMap.put(circuitBreakerType, circuitBreaker);
+  }
+
+  public CircuitBreaker getCircuitBreaker(CircuitBreakerType circuitBreakerType) {
+    assert circuitBreakerType != null;
+
+    return circuitBreakerMap.get(circuitBreakerType);
+  }
+
+  /**
+   * Check if any circuit breaker has triggered.
+   * @return CircuitBreakers which have triggered, null otherwise
+   */
+  public Map<CircuitBreakerType, CircuitBreaker> checkAllCircuitBreakersAndReturnTrippedBreakers() {
+    Map<CircuitBreakerType, CircuitBreaker> triggeredCircuitBreakers = null;
+
+    for (Map.Entry<CircuitBreakerType, CircuitBreaker> entry : circuitBreakerMap.entrySet()) {
+      CircuitBreaker circuitBreaker = entry.getValue();
+
+      if (circuitBreaker.isCircuitBreakerEnabled() &&
+          circuitBreaker.isCircuitBreakerGauntletTripped()) {
+        if (triggeredCircuitBreakers == null) {
+          triggeredCircuitBreakers = new HashMap<>();
+        }
+
+        triggeredCircuitBreakers.put(entry.getKey(), circuitBreaker);
+      }
+    }
+
+    return triggeredCircuitBreakers;
+  }
+
+  /**
+   * Returns true if *any* circuit breaker has triggered, false if none have triggered
+   *
+   * NOTE: This method short circuits the checking of circuit breakers -- the method will
+   * return as soon as it finds a circuit breaker that is enabled and has triggered
+   */
+  public boolean checkAllCircuitBreakers() {
+    for (Map.Entry<CircuitBreakerType, CircuitBreaker> entry : circuitBreakerMap.entrySet()) {
+      CircuitBreaker circuitBreaker = entry.getValue();
+
+      if (circuitBreaker.isCircuitBreakerEnabled() &&
+          circuitBreaker.isCircuitBreakerGauntletTripped()) {
+        return true;
+      }
+    }
+
+    return false;
+  }
+
+  /**
+   * Construct the final error message to be printed when circuit breakers trip
+   * @param circuitBreakerMap Input list for circuit breakers
+   * @return Constructed error message
+   */
+  public static String constructFinalErrorMessageString(Map<CircuitBreakerType, CircuitBreaker> circuitBreakerMap) {
+    assert circuitBreakerMap != null;
+
+    StringBuilder sb = new StringBuilder();
+
+    for (CircuitBreakerType circuitBreakerType : circuitBreakerMap.keySet()) {
+      sb.append(circuitBreakerType.toString() + " " + circuitBreakerMap.get(circuitBreakerType).printDebugInfo());
+    }
+
+    return sb.toString();
+  }
+
+  /**
+   * Register default circuit breakers and return a constructed CircuitBreakerManager
+   * instance which serves the given circuit breakers.
+   *
+   * Any default circuit breakers should be registered here
+   */
+  public static CircuitBreakerManager buildDefaultCircuitBreakerManager(SolrCore solrCore) {

Review comment:
Do we actually need `SolrCore` here, or just the breakers' config (currently in `SolrConfig`)?

##########
File path: solr/solr-ref-guide/src/query-settings-in-solrconfig.adoc
##########
@@ -170,6 +170,26 @@ This parameter sets the maximum number of documents to cache for any entry in th
 <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
 ----
+=== useCircuitBreakers
+
+Global control flag for enabling circuit breakers
+
+[source,xml]
+----
+<useCircuitBreakers>true</useCircuitBreakers>
+----
+
+=== memoryCircuitBreakerThreshold
+
+Memory threshold in percentage for JVM heap usage defined in percentage of maximum heap allocated
+
+to the JVM (-XMax).

Review comment:
-Xmx

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
##########
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.solr.core.SolrCore;
+
+/**
+ * Manages all registered circuit breaker instances. Responsible for a holistic view
+ * of whether a circuit breaker has tripped or not.
+ *
+ * There are two typical ways of using this class's instance:
+ * 1. Check if any circuit breaker has triggered -- and know which circuit breaker has triggered.
+ * 2. Get an instance of a specific circuit breaker and perform checks.
+ *
+ * It is a good practice to register new circuit breakers here if you want them checked for every
+ * request.
+ *
+ * NOTE: The current way of registering new default circuit breakers is minimal and not a long term
+ * solution. There will be a follow up with a SIP for a schema API design.
+ */
+public class CircuitBreakerManager {
+
+  private final Map<CircuitBreakerType, CircuitBreaker> circuitBreakerMap = new HashMap<>();
+
+  // Allows replacing of existing circuit breaker
+  public void registerCircuitBreaker(CircuitBreakerType circuitBreakerType, CircuitBreaker circuitBreaker) {
+    circuitBreakerMap.put(circuitBreakerType, circuitBreaker);
+  }
+
+  public CircuitBreaker getCircuitBreaker(CircuitBreakerType circuitBreakerType) {
+    assert circuitBreakerType != null;
+
+    return circuitBreakerMap.get(circuitBreakerType);
+  }
+
+  /**
+   * Check if any circuit breaker has triggered.
+   * @return CircuitBreakers which have triggered, null otherwise
+   */
+  public Map<CircuitBreakerType, CircuitBreaker> checkAllCircuitBreakersAndReturnTrippedBreakers() {
+    Map<CircuitBreakerType, CircuitBreaker> triggeredCircuitBreakers = null;
+
+    for (Map.Entry<CircuitBreakerType, CircuitBreaker> entry : circuitBreakerMap.entrySet()) {
+      CircuitBreaker circuitBreaker = entry.getValue();
+
+      if (circuitBreaker.isCircuitBreakerEnabled() &&
+          circuitBreaker.isCircuitBreakerGauntletTripped()) {
+        if (triggeredCircuitBreakers == null) {
+          triggeredCircuitBreakers = new HashMap<>();
+        }
+
+        triggeredCircuitBreakers.put(entry.getKey(), circuitBreaker);
+      }
+    }
+
+    return triggeredCircuitBreakers;
+  }
+
+  /**
+   * Returns true if *any* circuit breaker has triggered, false if none have triggered
+   *
+   * NOTE: This method short circuits the checking of circuit breakers -- the method will
+   * return as soon as it finds a circuit breaker that is enabled and has triggered
+   */
+  public boolean checkAllCircuitBreakers() {

Review comment:
Maybe `checkAnyBreakerTripped` ? Because we don't actually check all breakers here.

##########
File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.util.circuitbreaker; + +import java.lang.management.ManagementFactory; +import java.lang.management.MemoryMXBean; + +import org.apache.solr.core.SolrCore; + +/** + * Tracks the current JVM heap usage and triggers if it exceeds the defined percentage of the maximum + * heap size allocated to the JVM. This circuit breaker is a part of the default CircuitBreakerManager + * so is checked for every request -- hence it is realtime. Once the memory usage goes below the threshold, + * it will start allowing queries again. + * + * The memory threshold is defined as a percentage of the maximum memory allocated -- see memoryCircuitBreakerThreshold + * in solrconfig.xml + */ + +public class MemoryCircuitBreaker extends CircuitBreaker { + private static final MemoryMXBean MEMORY_MX_BEAN = ManagementFactory.getMemoryMXBean(); + + private final long currentMaxHeap = MEMORY_MX_BEAN.getHeapMemoryUsage().getMax(); + + // Assumption -- the value of these parameters will be set correctly before invoking printDebugInfo() + private ThreadLocal<Long> seenMemory = new ThreadLocal<>(); + private ThreadLocal<Long> allowedMemory = new ThreadLocal<>(); + + public MemoryCircuitBreaker(SolrCore solrCore) { + super(solrCore); + + if (currentMaxHeap <= 0) { + throw new IllegalArgumentException("Invalid JVM state for the max heap usage"); + } + } + + // TODO: An optimization can be to trip the circuit breaker for a duration of time + // after the circuit breaker condition is matched. This will optimize for per call + // overhead of calculating the condition parameters but can result in false positives. 
+ @Override + public boolean isCircuitBreakerGauntletTripped() { + if (!isCircuitBreakerEnabled()) { + return false; + } + + allowedMemory.set(getCurrentMemoryThreshold()); + + seenMemory.set(calculateLiveMemoryUsage()); + + return (seenMemory.get() >= allowedMemory.get()); + } + + @Override + public String printDebugInfo() { + return "seenMemory=" + seenMemory.get() + " allowedMemory=" + allowedMemory.get(); + } + + private long getCurrentMemoryThreshold() { + int thresholdValueInPercentage = solrCore.getSolrConfig().memoryCircuitBreakerThreshold; + double thresholdInFraction = thresholdValueInPercentage / (double) 100; + long actualLimit = (long) (currentMaxHeap * thresholdInFraction); + + if (actualLimit <= 0) { + throw new IllegalStateException("Memory limit cannot be less than or equal to zero"); + } + + return actualLimit; + } + + /** + * Calculate the live memory usage for the system. This method has package visibility + * to allow using for testing + * @return Memory usage in bytes + */ + protected long calculateLiveMemoryUsage() { + // NOTE: MemoryUsageGaugeSet provides memory usage statistics but we do not use them + // here since MemoryUsageGaugeSet provides combination of heap and non heap usage and Review comment: This comment is somewhat misleading ... if I correctly understand the intent :) MemoryUsageGaugeSet does provide the heap and the non-heap usages separately, so it's possible to get the value we want from it - but it incurs unnecessary cost and additional allocations, so we can do it cheaper by using MemoryMXBean directly. ########## File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/MemoryCircuitBreaker.java ########## @@ -0,0 +1,97 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.util.circuitbreaker; + +import java.lang.management.ManagementFactory; +import java.lang.management.MemoryMXBean; + +import org.apache.solr.core.SolrCore; + +/** + * Tracks the current JVM heap usage and triggers if it exceeds the defined percentage of the maximum + * heap size allocated to the JVM. This circuit breaker is a part of the default CircuitBreakerManager + * so is checked for every request -- hence it is realtime. Once the memory usage goes below the threshold, + * it will start allowing queries again. + * + * The memory threshold is defined as a percentage of the maximum memory allocated -- see memoryCircuitBreakerThreshold Review comment: It needs <p> tags to actually make a new paragraph. ########## File path: solr/solr-ref-guide/src/circuit-breakers.adoc ########## @@ -0,0 +1,81 @@ += Circuit Breakers +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +Solr's circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The +premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current +resource configuration. + +== When To Use Circuit Breakers +Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers +are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503). + +It is upto the client to handle the same and potentially build a retrial logic as this should ideally be a transient situation. + +== Types Of Circuit Breakers +Circuit breakers can be of two types: Review comment: I don't think this distinction merits a separate section - these are not actually different types of breakers, they are just different usage scenarios, and only one is currently implemented. ########## File path: solr/solr-ref-guide/src/circuit-breakers.adoc ########## @@ -0,0 +1,81 @@ += Circuit Breakers +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +Solr's circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The +premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current +resource configuration. + +== When To Use Circuit Breakers +Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers +are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503). + +It is upto the client to handle the same and potentially build a retrial logic as this should ideally be a transient situation. + +== Types Of Circuit Breakers +Circuit breakers can be of two types: + +=== Admission Control Checks + +Circuit breakers that are checked at admission control (request handlers). These circuit breakers are typically attached to a set +of requests that check them before proceeding with the request. Example is JVM heap usage based circuit breaker (described below). + +For these type of circuit breakers, it is a good idea to register them with CircuitBreakerManager +(org.apache.solr.util.circuitbreaker.CircuitBreakerManager) to allow a holistic check at the required admission control point. + +=== Custom Events/Code Paths Checks + +Circuit breakers that are needed only in special events or code paths. + + +== Circuit Breaker Configurations +The following flag controls the global activation/deactivation of circuit breakers. 
If this flag is disabled, all circuit breakers +will be disabled globally. Per circuit breaker configurations are specified in their respective sections later. + +[source,xml] +---- +<useCircuitBreakers>false</useCircuitBreakers> +---- + +== Currently Supported Circuit Breakers + +=== JVM Heap Usage Based Circuit Breaker +This circuit breaker tracks JVM heap memory usage and rejects incoming search requests with a 503 error code if the heap usage +exceeds a configured percentage of maximum heap allocated to the JVM (-XMax). The main configuration for this circuit breaker is Review comment: The JVM arg is `-Xmx` ########## File path: solr/solr-ref-guide/src/circuit-breakers.adoc ########## @@ -0,0 +1,81 @@ += Circuit Breakers +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +Solr's circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The +premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current +resource configuration. + +== When To Use Circuit Breakers +Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. 
If circuit breakers +are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503). + +It is upto the client to handle the same and potentially build a retrial logic as this should ideally be a transient situation. + +== Types Of Circuit Breakers +Circuit breakers can be of two types: + +=== Admission Control Checks + +Circuit breakers that are checked at admission control (request handlers). These circuit breakers are typically attached to a set +of requests that check them before proceeding with the request. Example is JVM heap usage based circuit breaker (described below). + +For these type of circuit breakers, it is a good idea to register them with CircuitBreakerManager +(org.apache.solr.util.circuitbreaker.CircuitBreakerManager) to allow a holistic check at the required admission control point. Review comment: The doc should mention somewhere that this is currently the only scenario supported by default. ########## File path: solr/solr-ref-guide/src/circuit-breakers.adoc ########## @@ -0,0 +1,81 @@ += Circuit Breakers +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +Solr's circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The +premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current +resource configuration. + +== When To Use Circuit Breakers +Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers +are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503). + +It is upto the client to handle the same and potentially build a retrial logic as this should ideally be a transient situation. + +== Types Of Circuit Breakers +Circuit breakers can be of two types: + +=== Admission Control Checks + +Circuit breakers that are checked at admission control (request handlers). These circuit breakers are typically attached to a set +of requests that check them before proceeding with the request. Example is JVM heap usage based circuit breaker (described below). + +For these type of circuit breakers, it is a good idea to register them with CircuitBreakerManager +(org.apache.solr.util.circuitbreaker.CircuitBreakerManager) to allow a holistic check at the required admission control point. + +=== Custom Events/Code Paths Checks + +Circuit breakers that are needed only in special events or code paths. + + +== Circuit Breaker Configurations +The following flag controls the global activation/deactivation of circuit breakers. If this flag is disabled, all circuit breakers +will be disabled globally. Per circuit breaker configurations are specified in their respective sections later. 
+ +[source,xml] +---- +<useCircuitBreakers>false</useCircuitBreakers> +---- + +== Currently Supported Circuit Breakers + +=== JVM Heap Usage Based Circuit Breaker +This circuit breaker tracks JVM heap memory usage and rejects incoming search requests with a 503 error code if the heap usage +exceeds a configured percentage of maximum heap allocated to the JVM (-XMax). The main configuration for this circuit breaker is +controlling the threshold percentage at which the JVM will trip. + +[source,xml] +---- +<memoryCircuitBreakerThreshold>75</memoryCircuitBreakerThreshold> +---- + +Consider the following example: + +JVM has been allocated a maximum heap of 5GB (-XMax) and memoryCircuitBreakerThreshold is set to 75. In this scenario, the heap usage Review comment: -Xmx
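For the 5GB / 75 example quoted in the ref guide, the trip point can be worked out with the same arithmetic as `getCurrentMemoryThreshold()` in the quoted `MemoryCircuitBreaker`. The following is only a minimal sketch under stated assumptions: the class and method names are hypothetical and not part of the PR, "5GB" is taken to mean 5 GiB as passed via `-Xmx5g`, and reading `getUsed()` from the heap `MemoryUsage` is an assumption about how live usage would be measured, since the body of `calculateLiveMemoryUsage()` is not shown in the quoted hunk.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical illustration class; not part of the pull request.
public class HeapThresholdSketch {

    // Same arithmetic as the quoted getCurrentMemoryThreshold():
    // limit = maxHeap * (thresholdPercent / 100).
    public static long thresholdBytes(long maxHeapBytes, int thresholdPercent) {
        if (maxHeapBytes <= 0) {
            throw new IllegalArgumentException("max heap must be positive");
        }
        if (thresholdPercent <= 0 || thresholdPercent > 100) {
            throw new IllegalArgumentException("threshold must be in (0, 100]");
        }
        return (long) (maxHeapBytes * (thresholdPercent / 100.0));
    }

    // Mirrors the quoted comparison: seenMemory >= allowedMemory.
    public static boolean isTripped(long usedHeapBytes, long maxHeapBytes, int thresholdPercent) {
        return usedHeapBytes >= thresholdBytes(maxHeapBytes, thresholdPercent);
    }

    public static void main(String[] args) {
        // Assumption: "5GB" means 5 GiB, i.e. -Xmx5g.
        long fiveGib = 5L * 1024 * 1024 * 1024;
        System.out.println("limit=" + thresholdBytes(fiveGib, 75)); // limit=4026531840 (3.75 GiB)

        // Reading the live heap directly from MemoryMXBean; getMax() may be -1 if undefined.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        if (heap.getMax() > 0) {
            System.out.println("tripped=" + isTripped(heap.getUsed(), heap.getMax(), 75));
        }
    }
}
```

With these assumptions, requests would start being rejected once heap usage reaches 3.75 GiB, and would be accepted again as soon as usage drops below that value.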