horizonzy commented on PR #3359: URL: https://github.com/apache/bookkeeper/pull/3359#issuecomment-1167058078
For this case, we already support detecting ledgers whose ensemble does not adhere to the placement policy. In the Auditor, if the user configures `auditorPeriodicPlacementPolicyCheckInterval`, a scheduled task is started that triggers `placementPolicyCheck`. `placementPolicyCheck` records the number of ledger fragments that do not adhere to the placement policy.

https://github.com/apache/bookkeeper/blob/677ccec3eb84f5be1b3556537871e14eb5e8359c/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/Auditor.java#L1378

However, it only records this count as a stat; it does not move data so that the ensemble adheres to the placement policy.

So we can add a config, `repairedPlacementPolicyNotAdheringBookieEnabled`, to control whether to repair the data so that it adheres to the placement policy.

**In Auditor**
It marks the ledger ID in the under-replication manager if the ensemble does not adhere to the placement policy.

**In ReplicationWorker**
It moves data from the old bookie to a new bookie in a different network location so that the ensemble adheres to the placement policy. If there is no bookie with a different network location, it does nothing.

_Attention_
The ReplicationWorker just polls an under-replicated ledger and then processes it. So when it gets an under-replicated ledger, we should check two cases:
1) Whether the ledger fragment has lost data.
2) Whether the ledger fragment does not adhere to the placement policy.

A single fragment may meet both cases at the same time. If so, we ignore case 2 and only repair the data loss. If the repaired result does not adhere to the placement policy, the Auditor will mark it again.
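
To make the two-case check in the ReplicationWorker concrete, here is a minimal sketch of the decision order described above. It is a standalone model, not BookKeeper code: the class `PlacementRepairSketch`, the `Action` enum, and the four boolean inputs are all hypothetical names chosen for illustration, and the real worker would derive them from ledger metadata and the placement policy.

```java
// Hypothetical, self-contained model of the proposed decision order.
// None of these names exist in BookKeeper; they are assumptions for illustration.
class PlacementRepairSketch {

    enum Action {
        REPAIR_DATA_LOSS,          // case 1: fragment has lost data
        MOVE_TO_ADHERE_PLACEMENT,  // case 2: fragment violates the placement policy
        DO_NOTHING
    }

    /**
     * @param hasDataLoss            case 1: the fragment is missing data
     * @param adheresToPolicy        false means case 2 applies
     * @param repairEnabled          models repairedPlacementPolicyNotAdheringBookieEnabled
     * @param otherLocationAvailable a bookie with a different network location exists
     */
    static Action decide(boolean hasDataLoss,
                         boolean adheresToPolicy,
                         boolean repairEnabled,
                         boolean otherLocationAvailable) {
        // Data loss takes priority; case 2 is ignored when both apply.
        // If the repaired result still violates the policy, the Auditor
        // is expected to mark the ledger again on a later check.
        if (hasDataLoss) {
            return Action.REPAIR_DATA_LOSS;
        }
        // Placement repair only runs when the new config flag is on.
        if (!adheresToPolicy && repairEnabled) {
            // Without a bookie in a different network location there is
            // nowhere better to move the data, so do nothing.
            return otherLocationAvailable
                    ? Action.MOVE_TO_ADHERE_PLACEMENT
                    : Action.DO_NOTHING;
        }
        return Action.DO_NOTHING;
    }

    public static void main(String[] args) {
        // Both cases apply at once: only the data loss is repaired.
        System.out.println(decide(true, false, true, true));
        // Only the placement policy is violated and a target bookie exists.
        System.out.println(decide(false, false, true, true));
    }
}
```

Keeping data loss first means a worker never delays recovery of missing data in order to fix placement; the placement violation is picked up again by the next Auditor check if it persists.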
