BewareMyPower commented on code in PR #23433:
URL: https://github.com/apache/pulsar/pull/23433#discussion_r1796384654


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/extensions/channel/ServiceUnitStateChannelImpl.java:
##########
@@ -1462,6 +1478,10 @@ private CompletableFuture<Void> 
healthCheckBrokerAsync(String brokerId) {
     }
 
     private void doHealthCheckBrokerAsyncWithRetries(String brokerId, int 
retry, CompletableFuture<Void> future) {
+        if (channelDisabled()) {

Review Comment:
   I added more logs and checked it again. The root cause is that this method 
is still scheduled when the web service is closed.
   
   ```java
                               pulsar.getExecutor()
                                       .schedule(() -> 
doHealthCheckBrokerAsyncWithRetries(brokerId, retry + 1, future),
                                               
Math.min(MAX_BROKER_HEALTH_CHECK_DELAY_IN_MILLIS, retry * retry * 50),
                                               MILLISECONDS);
   ```
   
   Logs:
   
   ```
   2024-10-11T11:28:58,966 - INFO  - 
[pulsar-load-manager-1209-1:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58117 0
   2024-10-11T11:28:58,970 - WARN  - 
[AsyncHttpClient-1250-1:ServiceUnitStateChannelImpl] - XYZ localhost:58117 
Failed to do health check
   2024-10-11T11:28:58,971 - INFO  - 
[pulsar-1210-9:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58117 1
   2024-10-11T11:28:58,972 - WARN  - 
[AsyncHttpClient-1250-1:ServiceUnitStateChannelImpl] - XYZ localhost:58117 
Failed to do health check
   2024-10-11T11:28:59,015 - INFO  - [main:PulsarService] - XYZ start closing 
web service
   2024-10-11T11:28:59,019 - INFO  - [main:PulsarService] - XYZ finished 
closing web service
   2024-10-11T11:28:59,022 - INFO  - 
[pulsar-1210-9:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58117 2
   2024-10-11T11:28:59,040 - WARN  - 
[AsyncHttpClient-1334-2:ServiceUnitStateChannelImpl] - XYZ localhost:58117 
Failed to do health check
   2024-10-11T11:28:59,150 - INFO  - 
[pulsar-load-manager-1251-1:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58158 0
   2024-10-11T11:28:59,157 - WARN  - 
[AsyncHttpClient-1292-1:ServiceUnitStateChannelImpl] - XYZ localhost:58158 
Failed to do health check
   2024-10-11T11:28:59,157 - INFO  - 
[pulsar-1252-9:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58158 1
   2024-10-11T11:28:59,159 - WARN  - 
[AsyncHttpClient-1292-1:ServiceUnitStateChannelImpl] - XYZ localhost:58158 
Failed to do health check
   2024-10-11T11:28:59,213 - INFO  - 
[pulsar-1252-4:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58158 2
   2024-10-11T11:28:59,215 - WARN  - 
[AsyncHttpClient-1292-1:ServiceUnitStateChannelImpl] - XYZ localhost:58158 
Failed to do health check
   2024-10-11T11:28:59,420 - INFO  - 
[pulsar-1252-1:ServiceUnitStateChannelImpl] - XYZ 
doHealthCheckBrokerAsyncWithRetries localhost:58158 3
   ```
   
   So the `doCleanup` should be skipped.
   
   Actually, the key question that since `doCleanup(..., true)` is called 
during the close. Should we skip `doCleanup(..., false)` after that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to