michaeljmarshall commented on pull request #9393:
URL: https://github.com/apache/pulsar/pull/9393#issuecomment-771394148


   @lhotari - thank you for your detailed explanation. I had not considered 
running this in a special environment that simulates the testing environment. 
That is a great point, and something I'll keep in mind in the future.
   
   Based on the example logging output you provided, it looks like the broker 
is still being overridden because it is considered overloaded. See the 
following:
   
   ```
   07:04:53.511 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] INFO  
org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - 2 brokers 
being considered for assignment of 
tenant-c8478edb-886e-43b7-8d43-12cc159c0eb9/use/ns1/0x00000000_0xffffffff
   07:04:53.511 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] WARN  
org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker 
http://localhost:33199 is overloaded: max usage=1.2440309524536133
   07:04:53.511 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] WARN  
org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker 
localhost:33199 is overloaded: CPU: 124.40309%, MEMORY: 19.42325%, DIRECT 
MEMORY: 2.4414062%, BANDWIDTH IN: 0.0%, BANDWIDTH OUT: 0.0%
   07:04:53.516 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] INFO  
org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - 1 brokers 
being considered for assignment of 
tenant-c8478edb-886e-43b7-8d43-12cc159c0eb9/use/ns2/0x00000000_0xffffffff
   07:04:53.516 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] WARN  
org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker 
http://localhost:33199 is overloaded: max usage=1.2440309524536133
   07:04:53.516 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] WARN  
org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker 
localhost:33199 is overloaded: CPU: 124.40309%, MEMORY: 19.42325%, DIRECT 
MEMORY: 2.4414062%, BANDWIDTH IN: 0.0%, BANDWIDTH OUT: 0.0%
   07:04:53.517 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] WARN  
org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker 
http://localhost:33199 is overloaded: max usage=1.2440309524536133
   07:04:53.517 [TestNG-method=testBrokerSelectionForAntiAffinityGroup-1] WARN  
org.apache.pulsar.broker.loadbalance.impl.LeastLongTermMessageRate - Broker 
localhost:33199 is overloaded: CPU: 124.40309%, MEMORY: 19.42325%, DIRECT 
MEMORY: 2.4414062%, BANDWIDTH IN: 0.0%, BANDWIDTH OUT: 0.0%
   ```
   
   Given the first log line, it does look like two brokers are considered for 
placement. However, CPU usage is reported as 124%, which is above 100% and 
leads to an override. I mentioned a concern about this in my initial PR message:
   
   > Note that I am assuming the following method will never return a value 
greater than 1, which could lead to test failure.
   
   Perhaps it is worth bumping the limit to something like 300 in this one 
case? If we only have 2 cores, we won't exceed that. That does, however, leave 
us with the potential to see this flakiness again if we ever give the test more 
cores.
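
   To make the arithmetic concrete, here is a rough standalone sketch. It is 
not the Pulsar code: the class and method names are hypothetical, and it 
assumes the overload check simply compares the largest resource usage (as a 
fraction, where 1.0 means 100%) against a percentage limit. The usage values 
are taken from the log output above.

```java
import java.util.Arrays;

/** Hypothetical illustration only; names and logic are assumptions, not Pulsar's API. */
final class OverloadSketch {

    /** Returns the largest of the given usages, each a fraction where 1.0 == 100%. */
    static double maxUsage(double... usages) {
        return Arrays.stream(usages).max().orElse(0.0);
    }

    /** Assumed rule: overloaded when max usage is strictly above the percentage limit. */
    static boolean isOverloaded(double maxUsage, int limitPercent) {
        return maxUsage > limitPercent / 100.0;
    }

    public static void main(String[] args) {
        // CPU, memory, direct memory, bandwidth in, bandwidth out (from the log).
        double usage = maxUsage(1.2440309, 0.1942325, 0.024414062, 0.0, 0.0);
        System.out.println(isOverloaded(usage, 100)); // true: 1.244 > 1.0
        System.out.println(isOverloaded(usage, 300)); // false: 1.244 <= 3.0
    }
}
```

   With a limit of 100 the reported 124% trips the check, while a limit of 300 
would not.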
   
   Do you think the brokers' CPU utilization is high because they are still in 
the process of starting up? If so, perhaps your suggested `await` command could 
help by giving the brokers time to stabilize.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

