tokers commented on code in PR #7906:
URL: https://github.com/apache/apisix/pull/7906#discussion_r969400166


##########
docs/en/latest/FAQ.md:
##########
@@ -626,6 +626,59 @@ This method only detects whether the APISIX data plane is alive or not. It does
 
 :::
 
+## What are the scenarios with high APISIX latency related to Etcd and how to fix them?
+
+Etcd is a component of APISIX service discovery and data storage, and its stability is related to the stability of APISIX.
+
+In actual scenarios, if APISIX uses a certificate to connect to Etcd through HTTPS, the following two problems of high latency for data query or writing may occur:
+
+1. Query or write data through APISIX Admin API.
+2. In the monitoring scenario, Prometheus crawls the APISIX data plane Metrics API timeout.
+
+These problems related to higher latency seriously affect the service stability of APISIX, and the reason why such problems occur is mainly because Etcd provides two modes of operation: HTTP (HTTPS) and gRPC. And APISIX uses the HTTP (HTTPS) protocol to operate Etcd.
+In this scenario, Etcd has a bug about HTTP2: if Etcd is operated over HTTPS (HTTP is not affected), the upper limit of HTTP2 connections is the default 250 in Golang. Therefore, when the number of APISIX data plane nodes is large, once the number of connections between all APISIX nodes and Etcd exceeds this upper limit, the response of APISIX API interface will be very slow.

Review Comment:
   ```suggestion
   In this scenario, ETCD has a bug about HTTP/2: if ETCD is operated over HTTPS (HTTP is not affected), the upper limit of HTTP/2 connections is the default `250` in Golang. Therefore, when the number of APISIX data plane nodes is large, once the number of connections between all APISIX nodes and ETCD exceeds this upper limit, the response of APISIX API interface will be very slow.
   ```



##########
docs/en/latest/FAQ.md:
##########
@@ -626,6 +626,59 @@ This method only detects whether the APISIX data plane is alive or not. It does
 
 :::
 
+## What are the scenarios with high APISIX latency related to Etcd and how to fix them?

Review Comment:
   ```suggestion
   ## What are the scenarios with high APISIX latency related to ETCD and how to fix them?
   ```



##########
docs/en/latest/FAQ.md:
##########
@@ -626,6 +626,59 @@ This method only detects whether the APISIX data plane is alive or not. It does
 
 :::
 
+## What are the scenarios with high APISIX latency related to Etcd and how to fix them?
+
+Etcd is a component of APISIX service discovery and data storage, and its stability is related to the stability of APISIX.
+
+In actual scenarios, if APISIX uses a certificate to connect to Etcd through HTTPS, the following two problems of high latency for data query or writing may occur:
+
+1. Query or write data through APISIX Admin API.
+2. In the monitoring scenario, Prometheus crawls the APISIX data plane Metrics API timeout.
+
+These problems related to higher latency seriously affect the service stability of APISIX, and the reason why such problems occur is mainly because Etcd provides two modes of operation: HTTP (HTTPS) and gRPC. And APISIX uses the HTTP (HTTPS) protocol to operate Etcd.
+In this scenario, Etcd has a bug about HTTP2: if Etcd is operated over HTTPS (HTTP is not affected), the upper limit of HTTP2 connections is the default 250 in Golang. Therefore, when the number of APISIX data plane nodes is large, once the number of connections between all APISIX nodes and Etcd exceeds this upper limit, the response of APISIX API interface will be very slow.
+
+In Golang, the default upper limit of HTTP2 connections is 250, the code is as follows:
+
+```go
+package http2
+
+import ...
+
+const (
+       prefaceTimeout         = 10 * time.Second
+       firstSettingsTimeout   = 2 * time.Second // should be in-flight with preface anyway
+       handlerChunkWriteSize  = 4 << 10
+       defaultMaxStreams      = 250 // TODO: make this 100 as the GFE seems to?
+       maxQueuedControlFrames = 10000
+)
+
+```
+
+At present, Etcd officially maintains two main branches, 3.4 and 3.5.

Review Comment:
   ```suggestion
   At present, Etcd officially maintains two main branches, `3.4` and `3.5`.
   ```



##########
docs/en/latest/FAQ.md:
##########
@@ -626,6 +626,59 @@ This method only detects whether the APISIX data plane is alive or not. It does
 
 :::
 
+## What are the scenarios with high APISIX latency related to Etcd and how to fix them?
+
+Etcd is a component of APISIX service discovery and data storage, and its stability is related to the stability of APISIX.

Review Comment:
   Currently, APISIX doesn't have an ETCD service discovery module.



##########
docs/en/latest/FAQ.md:
##########
@@ -626,6 +626,59 @@ This method only detects whether the APISIX data plane is alive or not. It does
 
 :::
 
+## What are the scenarios with high APISIX latency related to Etcd and how to fix them?
+
+Etcd is a component of APISIX service discovery and data storage, and its stability is related to the stability of APISIX.
+
+In actual scenarios, if APISIX uses a certificate to connect to Etcd through HTTPS, the following two problems of high latency for data query or writing may occur:
+
+1. Query or write data through APISIX Admin API.
+2. In the monitoring scenario, Prometheus crawls the APISIX data plane Metrics API timeout.
+
+These problems related to higher latency seriously affect the service stability of APISIX, and the reason why such problems occur is mainly because Etcd provides two modes of operation: HTTP (HTTPS) and gRPC. And APISIX uses the HTTP (HTTPS) protocol to operate Etcd.
+In this scenario, Etcd has a bug about HTTP2: if Etcd is operated over HTTPS (HTTP is not affected), the upper limit of HTTP2 connections is the default 250 in Golang. Therefore, when the number of APISIX data plane nodes is large, once the number of connections between all APISIX nodes and Etcd exceeds this upper limit, the response of APISIX API interface will be very slow.
+
+In Golang, the default upper limit of HTTP2 connections is 250, the code is as follows:
+
+```go
+package http2
+
+import ...
+
+const (
+       prefaceTimeout         = 10 * time.Second
+       firstSettingsTimeout   = 2 * time.Second // should be in-flight with preface anyway
+       handlerChunkWriteSize  = 4 << 10
+       defaultMaxStreams      = 250 // TODO: make this 100 as the GFE seems to?
+       maxQueuedControlFrames = 10000
+)
+
+```
+
+At present, Etcd officially maintains two main branches, 3.4 and 3.5.
+The 3.4 branch has the recently released 3.4.20 which fixes this issue.
+As for the 3.5 branch, in fact, the official is preparing to release the 3.5.5 version a long time ago, but it has not been released so far. So, if you are using a version of Etcd less than 3.5.5, there are several ways to solve this problem:
+
+1. Change the communication method between APISIX and Etcd from HTTPS to HTTP (not recommended).
+2. Fallback version to 3.4.20 (not recommended).
+3. Clone the Etcd source code and compile the release-3.5 branch directly (this branch has fixed the problem of HTTP2 connections, but the new version has not been released yet). This method is recommended.

Review Comment:
   Compiling ETCD from source code is also not a recommended approach, since users may deploy ETCD via an image.
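
If the FAQ keeps the version guidance, a small check could make it concrete for users who only know which release they run. This is a rough sketch rather than anything APISIX or ETCD ships: `needsWorkaround` and `fixedPatch` are hypothetical names, the thresholds come only from the quoted text (`3.4.20` is described as fixed, `3.5.5` as the pending fix release), and branches other than `3.4` and `3.5` are deliberately left unanswered.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fixedPatch maps a release branch to the first patch release that contains
// the fix according to the FAQ text under review. The 3.5 entry is an
// assumption taken from that text, which says 3.5.5 was not yet released.
var fixedPatch = map[string]int{
	"3.4": 20,
	"3.5": 5,
}

// needsWorkaround reports whether a version such as "3.5.4" or "v3.4.19"
// predates the fixed release on its branch. Branches the text does not
// mention return an error instead of a guess.
func needsWorkaround(version string) (bool, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) != 3 {
		return false, fmt.Errorf("unexpected version format: %q", version)
	}
	patch, err := strconv.Atoi(parts[2])
	if err != nil {
		return false, fmt.Errorf("invalid patch number in %q: %w", version, err)
	}
	branch := parts[0] + "." + parts[1]
	fixed, ok := fixedPatch[branch]
	if !ok {
		return false, fmt.Errorf("branch %s is not covered by this check", branch)
	}
	return patch < fixed, nil
}

func main() {
	for _, v := range []string{"3.4.19", "3.4.20", "3.5.4", "3.5.5", "3.3.27"} {
		affected, err := needsWorkaround(v)
		if err != nil {
			fmt.Println(v, "->", err)
			continue
		}
		fmt.Println(v, "-> needs workaround:", affected)
	}
}
```

A check like this only tells users whether they are on an affected release; it does not change which of the listed workarounds is appropriate for image-based deployments.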



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

