natarajaya opened a new issue #2120: Missing metrics to monitor internal 
replication status
   This is more of a general design question than a bug report.
   ## Description
   We are running our CouchDB clusters on GKE.
   The setup is very simple: each cluster has 3 nodes with default settings.
   To make sure that our clusters are healthy, we monitor:
   * `/_membership` data on every node, to verify that every node has 
connectivity to other nodes.
   * `couchdb_httpd_request_time`, which contains the "length of a request 
inside CouchDB without MochiWeb", to verify that every node responds in a 
reasonable time.
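
A minimal sketch of the `/_membership` check described above, assuming the endpoint's usual response shape with `all_nodes` (nodes this node currently knows about) and `cluster_nodes` (nodes configured as cluster members). The function names and the base URL are hypothetical, and authentication is omitted:

```python
import json
from urllib.request import urlopen


def missing_nodes(membership: dict) -> set:
    """Return configured cluster members that this node does not currently see."""
    configured = set(membership.get("cluster_nodes", []))
    connected = set(membership.get("all_nodes", []))
    return configured - connected


def fetch_membership(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch /_membership from one node (base_url is a hypothetical node address)."""
    with urlopen(f"{base_url}/_membership", timeout=timeout) as resp:
        return json.load(resp)


# Example with a hand-written response: node "c" is configured but not connected.
sample = {
    "all_nodes": ["couchdb@a", "couchdb@b"],
    "cluster_nodes": ["couchdb@a", "couchdb@b", "couchdb@c"],
}
unreachable = missing_nodes(sample)
```

A monitor could call `fetch_membership` against every node and alert whenever `missing_nodes` returns a non-empty set. This only detects lost connectivity; it says nothing about how far behind a reconnected node is.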
   We are looking to improve our monitoring solution to cover more failure 
modes, and have a couple of questions:
   * Is there a way to determine that a node is lagging in processing writes?
   * After a split-brain situation (a node lost connectivity to the other 
nodes, but connectivity is now restored and the node is syncing data), is 
there a way to determine the "replication lag", in other words the number of 
documents that are left to update?
   ## Your Environment
   GKE; CouchDB is installed with Helm using the semi-official chart:
   * CouchDB Version used: 2.3.1
   * Browser name and version: None
   * Operating System and version: None, GKE

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
