sodonnel commented on PR #5335:
URL: https://github.com/apache/ozone/pull/5335#issuecomment-1729941570

   Based on the charts in the Jira, this seems like a good change. I am 
surprised it is so much faster.
   
   You said you have about 200 containers per pipeline. Assuming the total 
number of containers for an owner in the system is much larger than that, let's 
say 2000, you need to form a new set of 2000 elements and then do 200 lookups 
into that set to run the `retainAll()` method.
   
   The current code iterates the 200 and then:
   
   1. Looks up the containerInfo in a map, which I'd guess costs about the 
same as the 200 probes into the set for `retainAll()`.
   2. Then for each containerInfo it performs the equals comparison against 
owner.
   
   That would suggest it's the `equals()` calls that are expensive, and much 
more so than the lookups into the map.
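
   To make the comparison concrete, here is a rough sketch of the two 
approaches as I understand them. The class, field, and method names are made 
up for illustration and are not the actual SCM code; the owner is assumed to 
be a plain `String`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class OwnerFilterSketch {

    /** Hypothetical stand-in for the real container metadata. */
    static final class ContainerInfo {
        final long id;
        final String owner;

        ContainerInfo(long id, String owner) {
            this.id = id;
            this.owner = owner;
        }
    }

    /** Current approach: one map lookup plus one equals() per pipeline container. */
    static Set<Long> filterByLookup(Set<Long> pipelineContainers,
                                    Map<Long, ContainerInfo> containers,
                                    String owner) {
        Set<Long> result = new TreeSet<>();
        for (Long id : pipelineContainers) {
            ContainerInfo info = containers.get(id);
            if (info != null && owner.equals(info.owner)) {
                result.add(id);
            }
        }
        return result;
    }

    /** Proposed approach: copy the owner's full set, then intersect with retainAll(). */
    static Set<Long> filterByRetainAll(Set<Long> pipelineContainers,
                                       Set<Long> ownerContainers) {
        // The copy costs one insert per container owned by this owner.
        Set<Long> result = new TreeSet<>(ownerContainers);
        // Keeps only the ids that are also in the pipeline's set.
        result.retainAll(pipelineContainers);
        return result;
    }

    public static void main(String[] args) {
        Map<Long, ContainerInfo> containers = new HashMap<>();
        Set<Long> ownedByA = new TreeSet<>();
        for (long id = 1; id <= 10; id++) {
            String owner = (id % 2 == 0) ? "a" : "b";
            containers.put(id, new ContainerInfo(id, owner));
            if (owner.equals("a")) {
                ownedByA.add(id);
            }
        }
        Set<Long> pipeline = new TreeSet<>(Set.of(1L, 2L, 3L, 4L));

        System.out.println(filterByLookup(pipeline, containers, "a"));  // [2, 4]
        System.out.println(filterByRetainAll(pipeline, ownedByA));      // [2, 4]
    }
}
```

   Both methods return the same ids, so any difference in cost comes from the 
per-element work: the map lookup plus `equals()` in the first, and the set 
copy plus `retainAll()` in the second.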
   
   In the tests you were running, there are 200 containers per pipeline:
   
   1. How many owners are there?
   2. How many open pipelines are there in the cluster?
   3. Do you know the total distribution of pipelines per owner?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

