pdolif opened a new pull request, #24854:
URL: https://github.com/apache/pulsar/pull/24854

   Fixes #24837
   
   ### Motivation
   
   The test is flaky. It verifies that lookup requests use the lookup 
properties configured at the time when the request is triggered (and not at the 
time it is executed).
   Therefore, it triggers 10 broker lookups in sequence and changes the lookup 
properties before each. The property "broker.id" is set to "key-0", 
"key-1",..., "key-9" respectively. To verify that the correct properties are 
used, the test uses the `BrokerIdAwareLoadManager`, which extends 
`ExtensibleLoadManagerImpl`. It stores a `CLIENT_ID_LIST`. When `assign()` is 
called, the "broker.id" property is added to the `CLIENT_ID_LIST`. 
   The test waits until all lookup requests finish, and then asserts that 
`CLIENT_ID_LIST` is equal to "key-0", "key-1",..., "key-9". The assertion fails 
if the list contains the values in a different order.
   
   When running the tests, I noticed that the order is either "key-0", 
"key-1",..., "key-9" (ascending) or the reverse order. The reason seems to be 
related to CompletableFuture "callbacks" like `thenCompose()` and their 
execution order. For example, for each of the 10 lookup requests, 
`NamespaceService.getBrokerServiceUrlAsync()` is called (in sequence), which 
contains the following code:
   
   
https://github.com/apache/pulsar/blob/461ffd84d641b855fcc0d09d586ff4b580c21974/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java#L225-L226
   
   When `thenCompose()` is called, it can either be that the 
`getBundleAsync(topic)` CompletableFuture is completed or not. If it is 
completed, `thenCompose()` can be executed immediately for each of the 10 
lookup requests. The test passes. If it is not completed, `thenCompose()` is 
called 10 times but not executed immediately. It seems like the "callbacks" are 
stacked. Once the CompletableFuture completes, the 10 "callbacks" are executed 
in LIFO order, i.e., the one for "key-9" is executed first. Therefore, the 
order in `CLIENT_ID_LIST` will be reversed, and the test fails.
   
   ### Modifications
   
   Make the assertion less strict and ignore the value order.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If the box was checked, please highlight the changes*
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   
   ### Matching PR in forked repository
   
   PR in forked repository: https://github.com/pdolif/pulsar/pull/17


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to