gnodet edited a comment on pull request #530:
URL: https://github.com/apache/maven/pull/530#issuecomment-919434406


   > I will try to review and understand your change this week.
   
   @michael-o  Let me give you a few details.
   
   I ran a timing analysis on a fairly large project (1300 modules or so) 
with a `mvnd help:evaluate -Dexpression=project.version` run.  `mvnd` 
computes the whole graph at the beginning.
   This leads to a call to `getDownstreamProjects()` for each project in the 
build. 
   
   Currently, each of those calls goes through all projects and calls 
`projectIds.contains( ProjectSorter.getId( mavenProject ) )` on every one.  The 
new version adds two caches: the order of the projects, so that the dependent 
projects can be sorted without iterating through the whole list of projects, 
and a mapping of `ProjectSorter.getId(x) -> x` to avoid recomputing the ids. In 
addition to those two caches, the loop is changed so that we retrieve the 
projects using a lookup and sort them (instead of iterating through the whole 
list of sorted projects and adding the matching ones).
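   
   To make the idea concrete, here is a minimal sketch of the two caches and the lookup-then-sort loop. It is not the code from the PR: `DownstreamLookup`, `Project`, `sortedByBuildOrder()` and the local `getId()` are hypothetical stand-ins for the Maven classes, and the id format is an assumption.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names are hypothetical, not the Maven API.
public class DownstreamLookup {

    /** Hypothetical stand-in for MavenProject. */
    public static final class Project {
        final String groupId;
        final String artifactId;
        final String version;

        public Project(String groupId, String artifactId, String version) {
            this.groupId = groupId;
            this.artifactId = artifactId;
            this.version = version;
        }
    }

    /** Stand-in for ProjectSorter.getId(); the id format is assumed. */
    static String getId(Project p) {
        return p.groupId + ":" + p.artifactId + ":" + p.version;
    }

    // Cache 1: id -> project, so ids are computed once per project.
    private final Map<String, Project> projectsById = new HashMap<>();
    // Cache 2: project -> position in the sorted build order.
    private final Map<Project, Integer> order = new HashMap<>();

    public DownstreamLookup(List<Project> sortedProjects) {
        int index = 0;
        for (Project p : sortedProjects) {
            projectsById.put(getId(p), p); // getId() is called N times in total
            order.put(p, index++);
        }
    }

    /** Resolves the given ids by lookup, then sorts the result by build order. */
    public List<Project> sortedByBuildOrder(Collection<String> ids) {
        List<Project> result = new ArrayList<>(ids.size());
        for (String id : ids) {
            Project p = projectsById.get(id);
            if (p != null) {
                result.add(p);
            }
        }
        // Only the matching projects are sorted, not the whole project list.
        result.sort((a, b) -> Integer.compare(order.get(a), order.get(b)));
        return result;
    }
}
```

   With this shape, each call only touches the projects that actually match, instead of scanning the whole sorted list.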
   
   So, if `N` is the number of projects, this brings down the number of 
iterations from `N * N` to `a * N`, where `a` is the mean number of downstream 
projects. And `a` is usually very small (especially in my case where we only 
get the first level of dependencies between modules and not the transitive 
ones).  In addition, the number of `getId()` calls is down to `N`, and those 
calls were the critical spot.
   
   Hope this helps.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

