dlg99 commented on a change in pull request #2813:
URL: https://github.com/apache/bookkeeper/pull/2813#discussion_r725129706



##########
File path: 
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/ScanAndCompareGarbageCollector.java
##########
@@ -231,11 +231,19 @@ public void gc(GarbageCleaner garbageCleaner) {
 
     private Set<Long> removeOverReplicatedledgers(Set<Long> bkActiveledgers, 
final GarbageCleaner garbageCleaner)
             throws InterruptedException, KeeperException {
+        // Check ledger ensembles before creating lock nodes.
+        // This is to reduce the number of lock node creations and deletions 
in ZK.
+        // The ensemble check is done again after the lock node is created.
+        final Set<Long> candidateOverReplicatedLedgers = 
preCheckOverReplicatedLedgers(bkActiveledgers);

Review comment:
       what if there are million+ of active ledgers? 
   In this case it will:
   1. introduce a long delay
   2. by the time it is done, the list of under/over replicated ledgers can 
change anyway because of the bookies coming back up/going down

##########
File path: 
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/ScanAndCompareGarbageCollector.java
##########
@@ -231,11 +231,19 @@ public void gc(GarbageCleaner garbageCleaner) {
 
     private Set<Long> removeOverReplicatedledgers(Set<Long> bkActiveledgers, 
final GarbageCleaner garbageCleaner)
             throws InterruptedException, KeeperException {
+        // Check ledger ensembles before creating lock nodes.
+        // This is to reduce the number of lock node creations and deletions 
in ZK.
+        // The ensemble check is done again after the lock node is created.
+        final Set<Long> candidateOverReplicatedLedgers = 
preCheckOverReplicatedLedgers(bkActiveledgers);
+        if (candidateOverReplicatedLedgers.isEmpty()) {
+            return candidateOverReplicatedLedgers;
+        }
+
         final List<ACL> zkAcls = ZkUtils.getACLs(conf);
         final Set<Long> overReplicatedLedgers = Sets.newHashSet();
         final Semaphore semaphore = new Semaphore(MAX_CONCURRENT_ZK_REQUESTS);
-        final CountDownLatch latch = new 
CountDownLatch(bkActiveledgers.size());
-        for (final Long ledgerId : bkActiveledgers) {
+        final CountDownLatch latch = new 
CountDownLatch(candidateOverReplicatedLedgers.size());
+        for (final Long ledgerId : candidateOverReplicatedLedgers) {
             try {
                 // check if the ledger is being replicated already by the 
replication worker
                 if 
(ZkLedgerUnderreplicationManager.isLedgerBeingReplicated(zk, zkLedgersRootPath, 
ledgerId)) {

Review comment:
       can do something like 
   ```
   if (ZkLedgerUnderreplicationManager.isLedgerBeingReplicated(zk, 
zkLedgersRootPath, ledgerId)
      || !isCandidateOverreplicatedLedger(ledgerId)) {
   ```
   like double-checked locking and avoid initial loop, at the same time reduce 
potential locks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to