[
https://issues.apache.org/jira/browse/MRESOLVER-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458359#comment-17458359
]
Gregory Ducharme commented on MRESOLVER-133:
--------------------------------------------
Hello folks:
I would like to clarify the use case for the performance aspect of this issue:
Define a component with a large number of dependencies (try 100) with version
ranges like [a.b.c,) or [a.b.c].
Give each of them, in turn, a large number of dependencies of the same form.
Just for fun, do the same for the transitive dependencies.
And to finish the scenario, give each artifact 100 versions.
Think in terms of trunk based development. You only want to deploy the
latest/greatest components to your application.
So the tree of all possible combinations will be huge. Maven will download all
the poms, and we see resolution times of 30 minutes. The calculation time in the
resolver code is not the issue; downloading all those poms is what consumes the
time.
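To give a rough sense of scale for that scenario, here is a back-of-the-envelope sketch. It is only a worst-case model: it assumes no artifact is shared between branches, nothing is cached, and every version matched by every range has its pom fetched. The function name and parameters are made up for illustration and are not how the resolver counts anything.

```python
def poms_to_fetch(deps_per_artifact: int, versions_per_artifact: int, depth: int) -> int:
    """Worst-case count of pom downloads when every version in every range is explored.

    Assumption: each artifact declares deps_per_artifact ranges, each range matches
    versions_per_artifact versions, and no pom is ever reused between branches.
    """
    total = 0
    nodes_at_level = 1  # the root component itself
    for _ in range(depth):
        # every node fans out to (dependencies x versions) candidate poms
        nodes_at_level *= deps_per_artifact * versions_per_artifact
        total += nodes_at_level
    return total


if __name__ == "__main__":
    # direct dependencies only: 100 ranges x 100 versions
    print(poms_to_fetch(100, 100, 1))
    # adding one transitive level
    print(poms_to_fetch(100, 100, 2))
```

In practice many artifacts are shared and poms are cached, so real numbers are lower, but the growth per level is the point.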
I would expect that a breadth-first solution that terminates upon finding the
first tree path that resolves to a solution would do so long before all poms
were downloaded and all alternatives explored. But maybe I am wrong.
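The expectation above can be illustrated with a toy model. This is a minimal sketch, not the resolver's algorithm: the repository is an in-memory dict, every range is open-ended like [min,), and the names (Repo, collect_all, collect_highest_first, make_poms) are hypothetical. It compares how many poms get fetched when every version in every range is visited versus a breadth-first walk that takes only the highest version in each range.

```python
from collections import deque


class Repo:
    """Toy repository that counts which poms have been 'downloaded'."""

    def __init__(self, poms):
        self.poms = poms      # (artifact, version) -> [(dependency, min_version), ...]
        self.fetched = set()  # poms fetched so far

    def versions(self, artifact, min_version):
        # Versions satisfying [min_version,), highest first (a metadata lookup, not a pom fetch).
        return sorted((v for (a, v) in self.poms if a == artifact and v >= min_version),
                      reverse=True)

    def fetch(self, artifact, version):
        self.fetched.add((artifact, version))
        return self.poms[(artifact, version)]


def collect_all(repo, root):
    """Visit every version inside every range; returns the number of poms fetched."""
    queue, seen = deque([root]), {root}
    while queue:
        artifact, version = queue.popleft()
        for dep, min_version in repo.fetch(artifact, version):
            for v in repo.versions(dep, min_version):
                if (dep, v) not in seen:
                    seen.add((dep, v))
                    queue.append((dep, v))
    return len(repo.fetched)


def collect_highest_first(repo, root):
    """Breadth-first walk that follows only the highest version in each range."""
    queue, seen = deque([root]), {root}
    while queue:
        artifact, version = queue.popleft()
        for dep, min_version in repo.fetch(artifact, version):
            candidates = repo.versions(dep, min_version)
            if candidates and (dep, candidates[0]) not in seen:
                seen.add((dep, candidates[0]))
                queue.append((dep, candidates[0]))
    return len(repo.fetched)


def make_poms(versions):
    """app -> a, b; a -> c; b -> c; each of a, b, c has `versions` versions."""
    poms = {("app", 1): [("a", 1), ("b", 1)]}
    for v in range(1, versions + 1):
        poms[("a", v)] = [("c", 1)]
        poms[("b", v)] = [("c", 1)]
        poms[("c", v)] = []
    return poms


if __name__ == "__main__":
    print(collect_all(Repo(make_poms(100)), ("app", 1)))
    print(collect_highest_first(Repo(make_poms(100)), ("app", 1)))
```

With only open-ended lower bounds, the highest version satisfies every constraint, which is why the second walk never needs to backtrack here. Ranges with upper bounds or exact pins would need fallback and conflict handling that this sketch leaves out.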
We do have some workarounds:
1) In some cases we don't use Maven at all; we write our own dependency
resolution mechanism using Groovy and the Maven version comparison jar.
2) Refactor the code to reduce inter-dependencies between components.
3) Delete old content from our repositories, keeping only the latest 10 versions.
4) Create bill of materials poms that we update daily to limit the
version ranges so that the lower bound of each range is a version built last week.
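As a rough illustration of workarounds 3 and 4, the sketch below picks which versions to keep and derives a narrowed range from the oldest kept version. It is an assumption-laden simplification: version_key only understands dot-separated numeric versions, whereas real Maven version ordering also handles qualifiers, and the function names are hypothetical.

```python
def version_key(version: str):
    """Simplified ordering: compare dot-separated numeric parts only.

    Assumption: no qualifiers such as -SNAPSHOT or -rc1; Maven's own ordering is richer.
    """
    return tuple(int(part) for part in version.split("."))


def split_for_pruning(versions, keep=10):
    """Return (kept, to_delete), where kept holds the `keep` highest versions in ascending order."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(versions, key=version_key)
    return ordered[-keep:], ordered[:-keep]


def narrowed_range(kept):
    """Open-ended range whose lower bound is the oldest version that was kept."""
    return f"[{kept[0]},)"


if __name__ == "__main__":
    all_versions = [f"1.0.{i}" for i in range(1, 13)]
    kept, to_delete = split_for_pruning(all_versions, keep=10)
    print(to_delete)
    print(narrowed_range(kept))
```

Sorting by the numeric tuple rather than the raw string matters: a plain string sort would place 1.0.10 before 1.0.2.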
Workaround 3 is problematic in that, in some cases, we end up with a mismatch
between the number of versions of a child module and the number of versions of
its parent module. This leads to the Maven failure described in the original use
case and demonstrated in the code supplied as the test case to this issue.
Workaround 4 is problematic as it turns a number of independent components
into a single monolithic 'build', breaking workaround 2.
Thank you for your efforts on this issue.
> Improve resolver performance by using breadth-first search
> ----------------------------------------------------------
>
> Key: MRESOLVER-133
> URL: https://issues.apache.org/jira/browse/MRESOLVER-133
> Project: Maven Resolver
> Issue Type: Improvement
> Components: Resolver
> Affects Versions: 1.4.2
> Reporter: Gregory Ducharme
> Priority: Major
> Attachments: mvnbaddeps.zip
>
>
>
> I believe the Maven resolver is unnecessarily inefficient because it performs
> a depth-first search of components to resolve dependencies. Consider the case
> where dependencies use version ranges: the user intent is for Maven to resolve
> with the highest versions of dependencies that satisfy all constraints. If
> Maven were to use a breadth-first search (and terminate searching as soon as
> a solution is found), then in many cases a valid set of dependencies could be
> resolved (at the top of the version ranges) without requiring that all
> historical versions are resolvable. One should get the same answer with both
> depth-first and breadth-first strategies, but the breadth-first approach
> would not be vulnerable to a missing parent POM somewhere in history making it
> impossible to build the head of code. Additionally, I suspect that
> breadth-first would be faster and use less memory than depth-first.
>
> Additionally, the depth-first approach has a weakness: when any version of
> a parent POM of a component referenced in the dependency tree of another
> component is missing, Maven fails to resolve dependencies. One gets a message
> of the form:
> Failed to execute goal on project module: Could not resolve dependencies for
> project baddepdemo.project2:module:jar:1: Failed to collect dependencies ...
>
> Currently this can only be resolved in one of three ways:
> 1) restore the missing parent POM into the repository history, or
> 2) delete all modules associated with the missing parent POM from the
> repository, or
> 3) manually adjust version ranges in consumer dependencies to exclude the
> bad versions of dependencies that refer to the missing parent POM.
>
> What I would like is a configuration switch that would allow one to select
> between the two search strategies, so that the manual interventions described
> above are not required.
>
> I have included a zip file that includes the minimal projects needed to
> demonstrate the dependency resolution problem:
> Project 1 has a module and a parent POM.
> Project 2 is a single POM that has a dependency on the module in project 1.
> Project 2 uses a dependency range [1,) that indicates that the latest version
> of project 1's module is to be used.
> If one builds two versions (1 and 2) of project 1, project 2 will resolve to
> use version 2 as expected. However, if you delete the parent POM of project 1
> from the repository, Maven cannot resolve dependencies and fails. If the
> version range in project 2 is changed to [2,), then the expected behavior is
> observed.
> The zip file contains a shell script (demo.sh) that can be run without
> parameters to demonstrate the behavior when all versions are present in the
> repository. Run it with 1 as a parameter (the lower end of the version range
> used in project 2) and the script will delete the parent POM from project 1
> and the error condition will be demonstrated. Run it with 2 and Maven will
> resolve dependencies, as version 1 of project 1 is explicitly excluded from the
> dependency resolution process.
>
> I am also willing to look at the source and propose a patch, but I would need
> guidance on which modules/source I should look at.
>
>
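The failure described in the quoted issue text can be mimicked with a small toy model. This is only a sketch under stated assumptions: it assumes the deleted parent POM is the one belonging to version 1 of project 1, it models the repository as plain Python data, and none of the names below (MissingPom, load_module, resolve_eager, resolve_highest_first) come from Maven or from the attached zip.

```python
class MissingPom(Exception):
    """Raised when a module's parent POM is not in the toy repository."""


# Hypothetical repository state after the version 1 parent POM has been deleted.
MODULE_PARENT = {1: ("parent", 1), 2: ("parent", 2)}  # module version -> its parent POM
AVAILABLE_PARENTS = {("parent", 2)}                   # ("parent", 1) is gone


def load_module(version):
    """Load one module version; fails if its parent POM is missing."""
    parent = MODULE_PARENT[version]
    if parent not in AVAILABLE_PARENTS:
        raise MissingPom(f"parent POM {parent} for module version {version} is missing")
    return version


def resolve_eager(lower_bound):
    """Load every version in [lower_bound,) before choosing the highest.

    A single missing parent POM anywhere in the range fails the whole resolution.
    """
    loaded = [load_module(v) for v in sorted(MODULE_PARENT) if v >= lower_bound]
    return max(loaded)


def resolve_highest_first(lower_bound):
    """Try versions from highest to lowest and stop at the first one that loads."""
    for v in sorted(MODULE_PARENT, reverse=True):
        if v < lower_bound:
            break
        try:
            return load_module(v)
        except MissingPom:
            continue
    raise MissingPom("no version in range could be loaded")


if __name__ == "__main__":
    print(resolve_eager(2))          # range [2,) never touches version 1
    print(resolve_highest_first(1))  # range [1,) still succeeds, version 1 is never loaded
    try:
        resolve_eager(1)             # range [1,) trips over the missing parent POM
    except MissingPom as err:
        print("failed:", err)
```

The eager variant corresponds to the reported behavior with range [1,), and narrowing the range to [2,) avoids it, matching the demo described in the issue. The highest-first variant shows the behavior the issue asks for.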