heesung-sn commented on code in PR #18858:
URL: https://github.com/apache/pulsar/pull/18858#discussion_r1050377776


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/extensions/channel/ServiceUnitStateChannelImpl.java:
##########
@@ -510,25 +524,120 @@ private CompletableFuture<Integer> closeServiceUnit(String serviceUnit) {
                 });
     }
 
-    private CompletableFuture<Void> splitServiceUnit(String serviceUnit) {
-        // TODO: after the split we need to write the child ownerships to BSC instead of ZK.
+    private CompletableFuture<Void> splitServiceUnit(String serviceUnit, ServiceUnitStateData data) {
+        // Write the child ownerships to BSC.
         long startTime = System.nanoTime();
-        return pulsar.getNamespaceService()
-                .splitAndOwnBundle(getNamespaceBundle(serviceUnit),
-                        false,
-                        NamespaceBundleSplitAlgorithm.of(pulsar.getConfig().getDefaultNamespaceBundleSplitAlgorithm()),
-                        null)
-                .whenComplete((__, ex) -> {
-                    double splitBundleTime = TimeUnit.NANOSECONDS
-                            .toMillis((System.nanoTime() - startTime));
-                    if (ex == null) {
-                        log.info("Successfully split {} namespace-bundle in {} ms",
-                                serviceUnit, splitBundleTime);
-                    } else {
-                        log.error("Failed to split {} namespace-bundle in {} ms",
-                                serviceUnit, splitBundleTime, ex);
-                    }
-                });
+        NamespaceService namespaceService = pulsar.getNamespaceService();
+        NamespaceBundleFactory bundleFactory = namespaceService.getNamespaceBundleFactory();
+        NamespaceBundle bundle = getNamespaceBundle(serviceUnit);
+        CompletableFuture<Void> completionFuture = new CompletableFuture<>();
+        final AtomicInteger counter = new AtomicInteger(0);
+        this.splitServiceUnitOnceAndRetry(namespaceService, bundleFactory, bundle, serviceUnit, data,
+                counter, startTime, completionFuture);
+        return completionFuture;
+    }
+
+    @VisibleForTesting
+    protected void splitServiceUnitOnceAndRetry(NamespaceService namespaceService,
+                                                NamespaceBundleFactory bundleFactory,
+                                                NamespaceBundle bundle,
+                                                String serviceUnit,
+                                                ServiceUnitStateData data,
+                                                AtomicInteger counter,
+                                                long startTime,
+                                                CompletableFuture<Void> completionFuture) {
+        CompletableFuture<List<NamespaceBundle>> updateFuture = new CompletableFuture<>();
+
+        getSplitBoundary(bundle).thenAccept(splitBundles -> {
+            // Split and updateNamespaceBundles. Update may fail because of concurrent write to Zookeeper.
+            if (splitBundles == null) {
+                String msg = format("Bundle %s not found under namespace", serviceUnit);
+                updateFuture.completeExceptionally(new BrokerServiceException.ServiceUnitNotReadyException(msg));
+                return;
+            }
+            List<CompletableFuture<Void>> futures = new ArrayList<>();
+            ServiceUnitStateData next = new ServiceUnitStateData(Owned, data.broker());
+            for (NamespaceBundle sBundle : splitBundles.getRight()) {
+                futures.add(pubAsync(sBundle.toString(), next).thenAccept(__ -> {}));
+            }
+            NamespaceName nsname = bundle.getNamespaceObject();
+            FutureUtil.waitForAll(futures).thenRun(() ->

Review Comment:
   > I am unsure why we need to wait for `getOwner(children bundles).join(timeout).get() == this.broker`, because the wait can only confirm the current broker's state. If the client does a lookup against another broker, there is still a chance that the ownership gets assigned to another broker, since that broker has not received the `Owned` message yet. If I understand it correctly.
   
   Your understanding is correct here. `getOwner(children bundles).join(timeout).get() == this.broker` cannot fully prevent the global race condition.
   
   Even with `new ServiceUnitStateData(Assigned, data.broker(), data.broker())`, some of the brokers might not see this `Assigned` state upon child bundle lookup requests.
   
   We should rely on the conflict resolution here to fully resolve this race 
condition.
   
   So I am fine with keeping the original `next` below, without waiting on `getOwner(children bundles).join(timeout).get() == this.broker`:
   
   `ServiceUnitStateData next = new ServiceUnitStateData(Owned, data.broker());`
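   
   To illustrate what I mean by relying on conflict resolution, here is a rough, self-contained sketch. It is not the actual channel code: `OwnershipConflictSketch`, `Message`, `resolveOwners`, the simplified `StateData`, and the "first claim wins" rule are all made up for illustration. The point is only that every broker replays the same ordered log and applies the same rule, so a late conflicting claim from a racing lookup gets dropped everywhere.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OwnershipConflictSketch {
    enum State { Assigned, Owned }

    // Hypothetical, simplified stand-in for ServiceUnitStateData.
    static final class StateData {
        final State state;
        final String broker;
        StateData(State state, String broker) {
            this.state = state;
            this.broker = broker;
        }
    }

    // Hypothetical channel message: a service unit plus its proposed state.
    static final class Message {
        final String serviceUnit;
        final StateData data;
        Message(String serviceUnit, StateData data) {
            this.serviceUnit = serviceUnit;
            this.data = data;
        }
    }

    // Replays the log in order. Assumed rule: once a broker holds a service unit,
    // a later message naming a different broker is treated as a conflict and ignored.
    static Map<String, String> resolveOwners(List<Message> log) {
        Map<String, StateData> view = new LinkedHashMap<>();
        for (Message m : log) {
            StateData current = view.get(m.serviceUnit);
            if (current != null && !current.broker.equals(m.data.broker)) {
                continue; // conflicting claim: keep the earlier one
            }
            view.put(m.serviceUnit, m.data);
        }
        Map<String, String> owners = new LinkedHashMap<>();
        for (Map.Entry<String, StateData> e : view.entrySet()) {
            owners.put(e.getKey(), e.getValue().broker);
        }
        return owners;
    }

    public static void main(String[] args) {
        // broker-1 publishes Owned for both children right after the split;
        // broker-2 races with an Assigned for one child from a concurrent lookup.
        List<Message> log = List.of(
                new Message("ns/0x0_0x8", new StateData(State.Owned, "broker-1")),
                new Message("ns/0x0_0x8", new StateData(State.Assigned, "broker-2")),
                new Message("ns/0x8_0xf", new StateData(State.Owned, "broker-1")));
        System.out.println(resolveOwners(log));
    }
}
```

   In this sketch both child bundles end up with `broker-1`, even though `broker-2` published a competing claim before it had seen the `Owned` message.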
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.