kezhenxu94 commented on code in PR #12953:
URL: https://github.com/apache/skywalking/pull/12953#discussion_r1910319129
##########
docs/en/setup/backend/circuit-breaking.md:
##########
@@ -0,0 +1,20 @@
+# Circuit Breaking
+
+Circuit breaking is a mechanism used to detect failures and encapsulate the logic of preventing OAP node crashing. It is
+a key component of SkyWalking's resilience strategy. This approach protects the system from overload and ensures
+stability.
+
+Currently, there are two available strategies for circuit breaking: heap memory usage and direct memory pool size.
+
+```yaml
+# The int value of the max heap memory usage percent. The default value is 85%.
+maxHeapMemoryUsagePercent: ${SW_CORE_MAX_HEAP_MEMORY_USAGE_PERCENT:85}
+# The long value of the max direct memory usage. The default max value is -1, representing no limit.
Review Comment:
Please add the unit here; I believe it is in bytes.
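For context on the unit: the JDK's own memory beans report sizes in bytes, so a byte-valued limit would line up with them. Below is a minimal sketch, assuming the thresholds are compared against these JDK readings. I have not verified how this PR actually measures them, and the class `MemoryReadings` and its method names are hypothetical, not part of the PR.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only; not part of the PR.
class MemoryReadings {
    // Heap usage as a percentage of the max heap, or -1 when the JVM reports no max.
    static long heapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // bytes, or -1 if undefined
        return max <= 0 ? -1 : heap.getUsed() * 100 / max;
    }

    // Memory used by the JDK "direct" buffer pool in bytes, or -1 if no such pool is exposed.
    static long directMemoryUsedBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed(); // bytes
            }
        }
        return -1;
    }
}
```

If the configured value is indeed compared against a byte count like this, the comment could say so explicitly, for example "in bytes".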
##########
docs/en/setup/backend/backend-telemetry.md:
##########
@@ -1,11 +1,11 @@
# Telemetry for backend
The OAP backend cluster itself is a distributed streaming process system. To assist the Ops team, we provide the telemetry for the OAP backend itself, also known as self-observability (so11y)
-By default, the telemetry is disabled by setting `selector` to `none`, like this:
+By default, the telemetry is disabled by setting `selector` to `prometheus`, like this, which activated the Prometheus telemetry.
Review Comment:
This reads as if setting the selector to `prometheus` disables telemetry.
Perhaps split it into two sentences:
```suggestion
By default, the telemetry is disabled. By setting `selector` to
`prometheus`, like this, you can activate the Prometheus telemetry.
```
##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/watermark/WatermarkListener.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.skywalking.oap.server.core.watermark;
+
+import java.util.List;
+import lombok.Getter;
+
+/**
+ * WatermarkListener is the listener for receiving WatermarkEvent and react to it.
+ * The implementations of this listener has two ways to interact with the WatermarkEvent:
+ * 1. use {@link #isWatermarkExceeded()} to check if the watermark is exceeded.
+ * 2. override {@link #beAwareOf(WatermarkEvent.Type)} to react to the event.
+ *
+ * When the oap recovered from the limiting state, the listener has two ways to be aware of it:
+ * 1. use {@link #isHealthy()} to check if the watermark is recovered.
+ * 2. Be notified by calling {@link #beAwareOfRecovery()}.
+ */
+public abstract class WatermarkListener {
+    @Getter
+    private String name;
+    private List<WatermarkEvent.Type> acceptedTypes;
+    private volatile boolean isWatermarkExceeded = false;
+
+    /**
+     * Create a listener that accepts all types of WatermarkEvent.
+     * This should be the default way to create a listener.
+     */
+    public WatermarkListener(String name) {
+        this(name, WatermarkEvent.Type.values());
+    }
+
+    public WatermarkListener(String name, WatermarkEvent.Type... types) {
+        this.acceptedTypes = List.of(types);
+    }
+
+    boolean notify(WatermarkEvent.Type event) {
+        if (acceptedTypes.contains(event)) {
+            isWatermarkExceeded = true;
+            beAwareOf(event);
+            return true;
+        }
+        return false;
+    }
+
+    void isHealthy() {
+        isWatermarkExceeded = false;
+    }
+
+    public boolean isWatermarkExceeded() {
+        return isWatermarkExceeded;
+    }
+
+    protected void beAwareOf(WatermarkEvent.Type event) {
+    }
+
+    ;
+
Review Comment:
```suggestion
```
##########
oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/watermark/WatermarkGRPCInterceptor.java:
##########
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.skywalking.oap.server.core.watermark;
+
+import io.grpc.ForwardingServerCallListener;
+import io.grpc.Metadata;
+import io.grpc.ServerCall;
+import io.grpc.ServerCallHandler;
+import io.grpc.ServerInterceptor;
+import io.grpc.Status;
+
+/**
+ * gRPCWatermarkInterceptor is a gRPC interceptor that checks if the watermark is exceeded before processing the request.
+ */
+public class WatermarkGRPCInterceptor extends WatermarkListener implements ServerInterceptor {
+    public static WatermarkGRPCInterceptor INSTANCE;
+
+    private WatermarkGRPCInterceptor() {
+        super("gRPC-Watermark-Interceptor");
+    }
+
+    public static WatermarkGRPCInterceptor create() {
+        INSTANCE = new WatermarkGRPCInterceptor();
+        return INSTANCE;
+    }
+
+    @Override
+    public <REQ, RESP> ServerCall.Listener<REQ> interceptCall(final ServerCall<REQ, RESP> call,
+                                                              final Metadata headers,
+                                                              final ServerCallHandler<REQ, RESP> next) {
+        if (isWatermarkExceeded()) {
+            call.close(Status.RESOURCE_EXHAUSTED.withDescription("Watermark exceeded"), new Metadata());
+            return new ServerCall.Listener<REQ>() {
+            };
+        }
Review Comment:
I'd suggest adding a warning log here, because when this happens the metrics
are no longer accurate. We can spot this in other ways, such as
self-observability, but logs are much more lightweight and make it easier to
find the root cause of the inaccurate metrics.
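Something along these lines might work. It is only a sketch: `RejectionLogThrottle` is a hypothetical helper, not something in the PR, and it uses `java.util.logging` purely to stay self-contained (the project's own logger would replace it). The idea is to log at most one warning per interval, so that a flood of rejected calls does not turn into a flood of log lines.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not part of the PR.
// Lets at most one warning through per interval and counts the rest.
class RejectionLogThrottle {
    private static final Logger LOGGER = Logger.getLogger(RejectionLogThrottle.class.getName());

    private final long intervalMillis;
    // Earliest timestamp at which the next warning may be logged.
    private final AtomicLong nextAllowedAt = new AtomicLong(Long.MIN_VALUE);
    // Rejections seen since the last logged warning.
    private final AtomicLong suppressed = new AtomicLong();

    RejectionLogThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Returns true if a warning was logged for this rejection.
    boolean onRejected(String method, long nowMillis) {
        long next = nextAllowedAt.get();
        // Only the thread that wins the CAS logs; concurrent callers are counted as suppressed.
        if (nowMillis >= next && nextAllowedAt.compareAndSet(next, nowMillis + intervalMillis)) {
            LOGGER.warning("Watermark exceeded, rejecting gRPC call " + method
                + " (" + suppressed.getAndSet(0) + " earlier rejections not logged)");
            return true;
        }
        suppressed.incrementAndGet();
        return false;
    }
}
```

In `interceptCall`, this could be invoked just before `call.close(...)`, for example with `call.getMethodDescriptor().getFullMethodName()` and `System.currentTimeMillis()` as arguments.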