Demogorgon314 commented on code in PR #21332:
URL: https://github.com/apache/pulsar/pull/21332#discussion_r1355972402


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/extensions/channel/ServiceUnitStateChannelImpl.java:
##########
@@ -503,13 +503,22 @@ public CompletableFuture<Optional<String>> 
getOwnerAsync(String serviceUnit) {
             case Splitting -> {
                 return 
CompletableFuture.completedFuture(Optional.of(data.sourceBroker()));
             }
-            case Assigning, Releasing -> {
+            case Assigning -> {
                 return deferGetOwnerRequest(serviceUnit).whenComplete((__, e) 
-> {
                     if (e != null) {
                         
ownerLookUpCounters.get(state).getFailure().incrementAndGet();
                     }
-                }).thenApply(
-                        broker -> broker == null ? Optional.empty() : 
Optional.of(broker));
+                }).thenApply(broker -> broker == null ? Optional.empty() : 
Optional.of(broker));
+            }
+            case Releasing -> {
+                if (isTransferCommand(data)) {
+                    return deferGetOwnerRequest(serviceUnit).whenComplete((__, 
e) -> {
+                        if (e != null) {
+                            
ownerLookUpCounters.get(state).getFailure().incrementAndGet();
+                        }
+                    }).thenApply(broker -> broker == null ? Optional.empty() : 
Optional.of(broker));
+                }
+                return CompletableFuture.completedFuture(Optional.empty());

Review Comment:
   > But still, doesn't the monitor fix the state and complete the getOwner 
future soon if we make the monitor delay 1sec in this test env?
   
   We need to change the `inFlightStateWaitingTimeInMillis` and 
`ownershipMonitorDelayTimeInSecs` to 1sec, I wonder if they will cause the same 
normal in-flight state to get overridden. Maybe we can delay 5sec in the test. 
   
   Another issue is it will override the `Releasing` -> `Own`, the unload 
result is not expected.
   
   > I am worried that returning the ownership early in this in-flight state 
could cause racing conditions.
   
   I see the ownership cache(old logic) will mark the current state as 
inactive, does it have the same issue?
   
   > Also, it appears that the same issue could happen from transfers too.
   
   Yes, the transfer has the same issue.
   
   > Or, can we skip the ownership check when the ML update fails(if extensible 
lb)? I think it can just get the ledger info again no matter what. When the 
monitor fixes the stuck state, it should clean the ledger again.
   
   Yes, it can work for tests, but still, I'm not sure if that is this unload 
result is correct(`Releasing` -> `Own`).
   
   > Or, could we expose a new internal getOwner api without deferring 
future(this should not be called by lookups)?
   
   The `getOwner` will use the table view state directly?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to