sodonnel commented on PR #5335:
URL: https://github.com/apache/ozone/pull/5335#issuecomment-1764262318

   Looking at where this code is called, we call it from 
WriteableRatisContainerProvider, where it calls:
   
   ```
       containerInfo = containerManager.getMatchingContainer(size, owner,
           pipeline, excludeList.getContainerIds());
   ```
   Where it may call getContainersForOwner twice under a synchronised block:
   
   ```
     @Override
     public ContainerInfo getMatchingContainer(final long size, final String 
owner,
         final Pipeline pipeline, final Set<ContainerID> excludedContainerIDs) {
       NavigableSet<ContainerID> containerIDs;
       ContainerInfo containerInfo;
       try {
         synchronized (pipeline.getId()) {
           containerIDs = getContainersForOwner(pipeline, owner);
           if (containerIDs.size() < 
getOpenContainerCountPerPipeline(pipeline)) {
             allocateContainer(pipeline, owner);
             containerIDs = getContainersForOwner(pipeline, owner);
           }
           containerIDs.removeAll(excludedContainerIDs);
           containerInfo = containerStateManager.getMatchingContainer(
               size, owner, pipeline.getId(), containerIDs);
           if (containerInfo == null) {
             containerInfo = allocateContainer(pipeline, owner);
           }
           return containerInfo;
         }
       } catch (Exception e) {
         LOG.warn("Container allocation failed on pipeline={}", pipeline, e);
         return null;
       }
     }
   ```
   
   In the "normal" case, as pipeline will have relatively few containers and 
most cluster I have seen only have a single owner. The cluster described here 
has about 200 containers per pipeline and dozens of owners, so it is very far 
away from usual.
   
   Consider a large cluster with several 1000 open containers, and with many 
open pipelines, so that there are only a small number of containers pipeline, 
lets guess 10.
   
   In the old code path, we would have to iterate 10 times and check the 10 
containers, but in this new code path, we will form a map of all containers in 
the cluster (they all have the same owner), to filter nothing and return.
   
   So my concern is that this change will make the normal code path slower to 
make this unusual path faster. As this code is part of the container selection 
process on write, anything that slows it down is going to be bad.
   
   In fact, when you look at this "usual case" where a cluster only has a 
single owner for everything, even the existing code path is doing a lot of work 
it does not need to do.
   
   It would be worth trying the suggestion from @adoroszlai to probe the 
existing containersByOwner map to see if it performs better than looking up the 
containerInfo and doing a compare on the owner as it does currently.
   
   I was never aware of how the "owner" concept was supposed to be used, but in 
Cloudera we have not had any use cases to use it, and it looks like we pay a 
performance price anyway. Perhaps we need to have a way to turn this off to be 
more efficient if multiple owners are not in use on a cluster.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to