lhotari commented on code in PR #1138:
URL: https://github.com/apache/pulsar-site/pull/1138#discussion_r3180358414


##########
docs/administration-autorecovery.md:
##########
@@ -0,0 +1,143 @@
+---
+id: administration-autorecovery
+title: BookKeeper AutoRecovery
+sidebar_label: "BookKeeper AutoRecovery"
+description: Learn how BookKeeper AutoRecovery works in Pulsar, and how to configure, enable, disable, and monitor it.
+---
+
+When a bookie in a Pulsar cluster becomes unavailable, the ledgers it stored become under-replicated, meaning the data no longer meets the configured replication factor. BookKeeper provides **AutoRecovery** to detect this situation automatically and rereplicate affected ledgers to healthy bookies — without manual intervention.
+
+## How AutoRecovery works
+
+AutoRecovery runs as two concurrent components within the `autorecovery` process:
+
+### Auditor
+
+The **Auditor** is a singleton node, elected via ZooKeeper leader election from all nodes participating in AutoRecovery. When a bookie fails or is reported unavailable, the Auditor:
+
+1. Scans all ledgers to identify those that stored data on the failed bookie.
+2. Publishes rereplication tasks to the `/underreplicated/` ZooKeeper znode.
+
+Only one Auditor is active in a cluster at any time. If the current Auditor node fails, a new leader election takes place automatically.
+
+### Replication Worker
+
+A **Replication Worker** runs on every node participating in AutoRecovery. Each worker:
+
+1. Monitors the `/underreplicated/` ZooKeeper znode for tasks published by the Auditor.
+2. Acquires a ZooKeeper ephemeral lock on an available task (ensuring no two workers replicate the same fragment simultaneously).
+3. Copies the under-replicated ledger fragments to its local bookie to restore the configured replication factor.
+
+## Deploy AutoRecovery
+
+AutoRecovery can be deployed in two ways:
+
+- **Alongside bookies** — each bookie also runs an AutoRecovery thread (default behavior when `autoRecoveryDaemonEnabled=true`).
+- **On dedicated nodes** — separate machines run only AutoRecovery. This is recommended for large clusters where you want to isolate recovery I/O from normal bookie traffic. Set `autoRecoveryDaemonEnabled=false` on the bookies to disable the embedded AutoRecovery thread.
+
+## Start AutoRecovery
+
+To start AutoRecovery as a standalone process in the foreground:
+
+```shell
+bin/bookkeeper autorecovery
+```
+
+To start AutoRecovery as a background daemon:
+
+```shell
+bin/pulsar-daemon start autorecovery
+```
+
+Ensure [`zkServers`](reference-configuration.md#bookkeeper-zkServers) in `conf/bookkeeper.conf` points to your ZooKeeper cluster before starting.

Review Comment:
   `zkServers` is a deprecated setting.
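
For context, a minimal sketch of what the non-deprecated form might look like. It assumes the replacement is BookKeeper's `metadataServiceUri` setting, which combines the ZooKeeper hosts and the ledgers root path into one URI. The host names, the `/ledgers` root, and the variable names below are placeholders, and the `;` host separator is how the BookKeeper documentation shows multi-host URIs as far as I recall; verify both against the BookKeeper version in use.

```shell
#!/bin/sh
# Hypothetical values: substitute your own ZooKeeper hosts and ledgers root path.
zk_servers="zk1:2181,zk2:2181,zk3:2181"   # value previously given to zkServers
ledgers_root="/ledgers"                   # value previously given to zkLedgersRootPath

# Assumption: hosts inside metadataServiceUri are separated by ';' rather than ','.
zk_hosts=$(printf '%s' "$zk_servers" | tr ',' ';')
metadata_uri="zk+hierarchical://${zk_hosts}${ledgers_root}"

# Print the line that would go into conf/bookkeeper.conf.
echo "metadataServiceUri=${metadata_uri}"
```

The doc sentence could then point readers at `metadataServiceUri` in `conf/bookkeeper.conf` instead of `zkServers`.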


