sodonnel commented on PR #5335: URL: https://github.com/apache/ozone/pull/5335#issuecomment-1729941570
Based on the charts in the Jira, this seems like a good change, though I am surprised it is so much faster.

You said you have about 200 containers per pipeline. Assume the total number of containers for an owner in the system is much larger than that, let's say 2000. The new code then has to form a new set of 2000 elements and do 200 lookups into that set to run the `retainAll()` method.

The current code iterates the 200 containers and then:

1. Looks up the containerInfo in a map. I'd guess this costs about the same as the 200 probes into the set for `retainAll()`.
2. For each containerInfo, performs the equals comparison against the owner.

That would suggest it's the `equals()` calls that are expensive, and much more so than the lookups into the map.

In the tests you were running, there are 200 containers per pipeline:

1. How many owners are there?
2. How many open pipelines are there in the cluster?
3. Do you know the total distribution of pipelines per owner?
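
For readers following along, the two approaches being compared can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the PR: `OwnerFilterSketch`, `Container`, `filterByLookup` and `filterByRetainAll` are hypothetical names, and the container counts are borrowed from the example numbers in the comment above. One detail that may matter for the cost estimate: in the JDK's `AbstractCollection.retainAll(c)`, which `TreeSet` and `HashSet` inherit, the loop walks the receiving collection and calls `c.contains(...)` on the argument, so the number of probes follows the size of the copied set rather than the size of the pipeline's set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Ozone code base.
class OwnerFilterSketch {

    /** Simplified stand-in for a container record: just an id and an owner. */
    static final class Container {
        final long id;
        final String owner;

        Container(long id, String owner) {
            this.id = id;
            this.owner = owner;
        }
    }

    /** Iterate the pipeline's ids, look each one up, and compare the owner. */
    static Set<Long> filterByLookup(Set<Long> pipelineIds,
                                    Map<Long, Container> allContainers,
                                    String owner) {
        Set<Long> result = new TreeSet<>();
        for (Long id : pipelineIds) {
            Container c = allContainers.get(id);      // one map probe per id
            if (c != null && owner.equals(c.owner)) { // one equals() per id
                result.add(id);
            }
        }
        return result;
    }

    /** Copy the owner's id set, then intersect it with the pipeline's ids. */
    static Set<Long> filterByRetainAll(Set<Long> pipelineIds, Set<Long> ownerIds) {
        // The copy keeps the owner's set unmodified; retainAll() walks the
        // copy and probes pipelineIds for each element.
        Set<Long> result = new TreeSet<>(ownerIds);
        result.retainAll(pipelineIds);
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Container> all = new HashMap<>();
        Set<Long> ownerIds = new TreeSet<>();
        // 4000 containers split evenly, so "ownerA" holds 2000 of them.
        for (long id = 0; id < 4000; id++) {
            String owner = (id % 2 == 0) ? "ownerA" : "ownerB";
            all.put(id, new Container(id, owner));
            if (owner.equals("ownerA")) {
                ownerIds.add(id);
            }
        }
        // A pipeline with 200 containers (ids 0..199).
        Set<Long> pipelineIds = new HashSet<>();
        for (long id = 0; id < 200; id++) {
            pipelineIds.add(id);
        }

        Set<Long> viaLookup = filterByLookup(pipelineIds, all, "ownerA");
        Set<Long> viaRetainAll = filterByRetainAll(pipelineIds, ownerIds);
        System.out.println("same result: " + viaLookup.equals(viaRetainAll));
        System.out.println("matching containers: " + viaLookup.size());
    }
}
```

Both methods return the same set, so any timing difference comes from the cost of the per-element operations (map probe plus `equals()` versus set copy plus `contains()`), which is why the owner and pipeline distribution questions above are relevant.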
