jinjianming opened a new issue, #9946: URL: https://github.com/apache/apisix/issues/9946
### Description

I am looking for the recommended way to tune etcd, together with a Prometheus alerting strategy, to keep APISIX stable.

Just yesterday our production APISIX crashed, and I believe the following two points should be taken seriously:

1. etcd reached its default 2 GB storage limit, so upstream data could no longer be changed and the gateway became unavailable.
   - I think the [limit](https://etcd.io/docs/v3.3/dev-guide/limit/) and [auto-compaction](https://etcd.io/docs/v3.5/op-guide/maintenance/#auto-compaction) settings should be tuned, but I don't know what the best values are.
   - For the chart, please refer to https://github.com/bitnami/charts/issues/8516 and add the configuration that handles this first issue.
   - If the limit has already been exceeded, the parameters have to be added and the alarm then has to be cleared manually (https://github.com/bitnami/charts/issues/18073).
2. For monitoring, I used `apisix_etcd_reachable{job="Produce-ApiSix"} == 0`, but the alert never fired: if it hangs, this value is null rather than 0, so the comparison never matches. A best practice for alerting on this indicator is missing.
   - This statement solves the second problem: `absent(apisix_etcd_reachable{job="Produce-ApiSix"}) == 1`

In summary, I have worked around both problems with temporary solutions, and I hope a long-term, stable solution can be provided.

@moonming @tao12345666333
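
For reference, the two monitoring expressions above could be combined into a Prometheus rule file along the following lines. This is only a sketch, not an official APISIX recommendation: the group name, alert names, `for` durations, severities and the 80% threshold are assumptions chosen for illustration. The third rule additionally assumes that etcd's own `/metrics` endpoint is scraped and that the etcd version in use exposes `etcd_mvcc_db_total_size_in_bytes` and `etcd_server_quota_backend_bytes`.

```yaml
groups:
  - name: apisix-etcd            # hypothetical group name
    rules:
      # Fires when APISIX is scraped and reports that etcd is unreachable.
      - alert: ApisixEtcdUnreachable
        expr: apisix_etcd_reachable{job="Produce-ApiSix"} == 0
        for: 1m                  # assumed duration
        labels:
          severity: critical
        annotations:
          summary: APISIX reports that etcd is unreachable

      # Fires when the metric is missing entirely (for example, the instance
      # hangs and the series disappears), which the rule above cannot catch.
      - alert: ApisixEtcdMetricAbsent
        expr: absent(apisix_etcd_reachable{job="Produce-ApiSix"}) == 1
        for: 1m                  # assumed duration
        labels:
          severity: critical
        annotations:
          summary: apisix_etcd_reachable is not being reported

      # Warns before the etcd backend quota is exhausted.
      # Assumes etcd itself is scraped; the 0.8 threshold is illustrative.
      - alert: EtcdDatabaseNearQuota
        expr: etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes > 0.8
        for: 10m                 # assumed duration
        labels:
          severity: warning
        annotations:
          summary: etcd database size is above 80% of its backend quota
```

Keeping the `== 0` and `absent()` checks as separate rules makes it visible in the alert itself whether etcd was reported unreachable or the metric disappeared altogether.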