epugh commented on code in PR #4171:
URL: https://github.com/apache/solr/pull/4171#discussion_r2962172468


##########
solr/solr-ref-guide/modules/deployment-guide/pages/user-managed-index-replication.adoc:
##########
@@ -575,6 +575,63 @@ A snapshot with the name `snapshot._name_` must exist or 
an error will be return
 `location`::: The location where the snapshot is created.
 
 
+[[monitoring-follower-replication-lag]]
+== Monitoring Follower Replication Lag
+
+In a leader-follower deployment it is important to know whether followers are 
keeping pace with the leader.
+Solr's health-check endpoint supports a `maxGenerationLag` request parameter 
that lets you assert that each follower core is within a specified number of 
Lucene commit generations of its leader.
+When the follower is lagging more than the allowed number of generations the 
endpoint returns HTTP 503 (Service Unavailable), making it straightforward to 
integrate into load-balancer health probes or monitoring systems.
+
+The `maxGenerationLag` parameter is an integer representing the maximum 
acceptable number of commit generations by which a follower is allowed to trail 
its leader.
+A value of `0` requires the follower to be fully up to date.
+If the parameter is omitted, the health check returns `OK` regardless of 
replication lag.
+
+[WARNING]
+====
+Because a follower's generation can only increase when a replication from the 
leader actually completes, `maxGenerationLag=0` may return `FAILURE` 
immediately after a follower starts or after a period of network instability 
even though the follower will catch up on the next poll cycle.

Review Comment:
   Yeah, it was generated, and yes, you should DEFINITELY ask about these 
statements. The behavior is documented in the javadocs in `HealthCheckHandler`. 
I'll be honest, I can't really vouch for this either, since I've never 
personally run leader/follower and then used this. I wonder if we are 
better off not having a WARNING on something that we only know from reading 
javadocs and haven't personally tested? I.e., err on the side of documenting 
just "what we know", not "what we believe".
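
For anyone who wants to check the documented behavior against a running leader/follower pair rather than rely on the javadocs, a minimal probe could look like the sketch below. It is only a sketch under stated assumptions: the `/solr/admin/info/health` path, the helper names, and the treatment of any non-200 status as "not within lag" are illustrative choices, not something confirmed in this thread. Only the `maxGenerationLag` parameter and the HTTP 503 response come from the text under review.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: node-level health-check path; confirm it for your Solr version.
HEALTH_PATH = "/solr/admin/info/health"


def build_health_url(base_url, max_generation_lag=None):
    """Build the health-check URL, adding maxGenerationLag only when given."""
    url = base_url.rstrip("/") + HEALTH_PATH
    if max_generation_lag is not None:
        if max_generation_lag < 0:
            raise ValueError("max_generation_lag must be >= 0")
        url += "?" + urlencode({"maxGenerationLag": max_generation_lag})
    return url


def is_within_lag(status_code):
    """Treat HTTP 200 as healthy; 503 (or anything else) as not healthy."""
    return status_code == 200


def probe(base_url, max_generation_lag, timeout=5.0):
    """Query a follower node once and report whether it passed the check."""
    url = build_health_url(base_url, max_generation_lag)
    try:
        with urlopen(url, timeout=timeout) as resp:
            return is_within_lag(resp.status)
    except HTTPError as err:
        # urllib raises on 4xx/5xx, so the 503 case arrives here.
        return is_within_lag(err.code)
```

Calling `probe("http://localhost:8983", 0)` right after a follower restart, and again after the next poll cycle, would show whether the situation described in the WARNING actually occurs; the host and port here are placeholders.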



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

