On 11/05/2009, at 11:12 AM, Brian Fox wrote:

It's time to start looking at the problems with the current 2.x resolution scheme as it specifically relates to repository declaration and discovery.

Sorry for the delay in responding to this, I'm still catching up on May.

I think the first few sections are accurate and complete.

For requirements:

1. maintain the ability for a user to check out your code and run mvn install and have it work with no prior setup on their part.


+1

2. be able to depend on some jar and not worry about any repositories required for transitive resolution (i.e. discover the repositories transitively as dependencies are processed). This is controversial and may be eliminated. First, it contributes to Problem #4 above in that SAT can no longer be done on a bounded list of repositories. It also doesn't normally work behind a repository manager, because the list of repos is usually controlled in the repo manager and thus autodiscovery is intentionally blocked, usually via a mirrorOf * to circumvent the repos maven finds in the poms.


I think we can achieve this in a way that is compatible with repo managers, depending on the solution (see below)

If we have this though, we need to add a new requirement:
5. builds should be able to add their own alternative versions for artifacts (e.g. see xwiki's build, which provides a lot of custom versions of standard things) without affecting other builds. In this case, they would use a custom version so that within their build it can override others and contribute to ranges, but its existence in a local repository shouldn't affect other builds.

3. be able to separate the dependencies needed by maven plugins from those needed by the build. This means not only where they are resolved from, but also how they are stored locally to prevent cross-contamination.

I think I would reword this. I can understand wanting to locate plugins separately, and for their repos/deps not to affect the rest of the build, but I'm not sure why local storage matters. A dependency junit:junit:3.8.1 used in a plugin should be the same as that used in a project. Perhaps an alternate/additional requirement is "3. a given artifact coordinate must always use an identical artifact across a build".
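As a rough illustration of that reworded requirement, a build could record a checksum the first time it sees a coordinate and refuse any later artifact that differs. This is only a sketch: the BuildArtifactRegistry class, its record method and the checksum strings are hypothetical and not part of Maven.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-build registry; not an existing Maven class. */
final class BuildArtifactRegistry {
    // coordinate (e.g. "junit:junit:3.8.1") -> checksum first seen in this build
    private final Map<String, String> checksumByCoordinate = new HashMap<>();

    /**
     * Records the artifact used for a coordinate. Plugins and project
     * dependencies would share one registry, so both must see the same artifact.
     */
    void record(String coordinate, String checksum) {
        String previous = checksumByCoordinate.putIfAbsent(coordinate, checksum);
        if (previous != null && !previous.equals(checksum)) {
            throw new IllegalStateException(
                "Two different artifacts used for " + coordinate + " in one build");
        }
    }
}
```

With something like this, where an artifact is stored locally matters less than whether the plugin and the project end up with the same bytes for the same coordinate.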

4. Repository identification: at this point we are pretty much in agreement that the URL should be the unique identifier for a repository. People who care about what they are publishing either need to use canonical repositories like Maven central or need to guarantee the existence of the repositories or have decent pointers. In a fully distributed system, the relocation mechanism we have does not work without a master to manage relocations.


This is a solution, not a requirement :) I think it's clear we need a unique identifier. A URI is a good way to do that, but we need to accommodate the fact that repositories will move too (this was a problem listed earlier). Depending on how we solve the above, it may become less of an issue. So perhaps reword as "repositories must be uniquely identifiable and able to be relocated to a new location over time without affecting existing builds".

I'd then break out artifact relocation as a separate requirement:
6. relocating an artifact to a different coordinate must be possible even if that is on a different repository

Stemming from the location I'd add:
7. repositories must be able to be mirrored to different locations, and the user must be able to select a closer, identical repository of their choice.

Also, probably implied but worth stating:
8. all discovery must be possible without a repository manager installed (though using one can improve the ability to route requests differently)

And finally, maybe implied but worth being explicit about:
9. must work for locating parent projects (this will start giving us better ways to deal with the chicken/egg problem and auto-versioning)

Turning to solutions since it has been a while now... here are some starting points.

I'm tossing around two alternatives in my head:
1) using the repository as the start of the namespace (i.e. http://repo1.maven.org/maven2/junit/junit/3.8.1/junit-3.8.1.jar is different to http://repo.otherproject.com/junit/junit/3.8.1/junit-3.8.1.jar), where the repository contributes to the "version" of the artifact, but is considered the same group/artifact ID for the purpose of resolution. Note that this is just for identification; location needs to be separate.
2) considering group/artifact ID to be globally unique, with the repository derived from that

I'm leaning towards (2) as it's a shorter notation and easier to understand. Under (1), we'd probably need to be able to add the repository to a dependency element (perhaps with a shorthand notation defined in the pom or its parent).

Either way, the resolution mechanism should not be affected by the repositories used. A given set of artifacts should always resolve the same way. The versions available to a range calculation will alter depending on the available repositories, but these should all be known up front in the build. I don't think we need to deal with how version ranges are calculated / made reproducible here (that's being separately dealt with), as long as the above requirements are met with respect to the repositories used for it.

To accommodate this, I think the repositories in the POM should become constrained to locating metadata for a certain set of artifacts, so they can be used to expand reach through resolution, but do not affect anything already encountered, and do not affect resolution outside the current project. As long as the revised (3) above holds, this will be reproducible.

Given 1), 2), 3), and 5), I think a delegating structure for locating an artifact is the way to go. That is, specifying *only* the <dependency> element is enough for a build to locate an artifact, and always get the same one. The advantages are significant: less configuration/easier set up for new repositories, simpler resolution logic, and faster resolution as it never needs to search multiple repositories. The delegation needs to go right down to the version level (snapshots in one repo, releases in another). The downside is loss of control (if we point javax to the download.java.net repos automatically, we have to live with that doing dodgy stuff in that namespace like bad POMs or changing released artifacts, or just being down).
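To make the delegation idea a bit more concrete, here is a minimal sketch of a locator that maps a group ID and version to exactly one repository, choosing the most specific group prefix and then splitting snapshots from releases. The DelegatingLocator class, its method names and the example.org URLs used with it are assumptions for illustration only; none of this is an existing Maven API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: routes a coordinate to exactly one repository. */
final class DelegatingLocator {
    // groupId prefix -> { release repository URL, snapshot repository URL }
    private final Map<String, String[]> routes = new LinkedHashMap<>();

    /** An empty prefix acts as the catch-all route. */
    void addRoute(String groupPrefix, String releaseRepo, String snapshotRepo) {
        routes.put(groupPrefix, new String[] {releaseRepo, snapshotRepo});
    }

    /** Returns the single repository responsible for the coordinate, if any. */
    Optional<String> locate(String groupId, String version) {
        String best = null;
        for (String prefix : routes.keySet()) {
            // Match whole group segments only, so "javax" does not match "javaxx".
            boolean matches = prefix.isEmpty()
                    || groupId.equals(prefix)
                    || groupId.startsWith(prefix + ".");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix; // the longest (most specific) prefix wins
            }
        }
        if (best == null) {
            return Optional.empty();
        }
        String[] repos = routes.get(best);
        // Delegation goes down to the version level: snapshots live elsewhere.
        return Optional.of(version.endsWith("-SNAPSHOT") ? repos[1] : repos[0]);
    }
}
```

Because each coordinate maps to at most one repository, nothing ever has to be searched for in several places, which is where the simpler and faster resolution would come from.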

I think this can be overcome by layers of routing rules. So, if central becomes the source of pointers to artifacts, then a project can add a repository to locate *missing* ones (not override existing) as described above, then a user can *alter* routes from their settings.xml. A common one for this will be * -> repository manager, but you could have others whether you are using a repo manager or not.
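The layering could look roughly like the sketch below: central pointers are consulted first, project repositories only fill in what central does not know about, and user alterations rewrite whichever repository was chosen. The LayeredRoutes class and its method names are hypothetical, as is the rule that a specific user alteration beats the "*" wildcard.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of layered routing rules; not an existing Maven class. */
final class LayeredRoutes {
    private final Map<String, String> central = new HashMap<>();  // groupId -> repository URL
    private final Map<String, String> project = new HashMap<>();  // groupId -> repository URL
    private final Map<String, String> userAlterations = new HashMap<>(); // repository URL or "*" -> replacement

    void centralPointer(String groupId, String repo) { central.put(groupId, repo); }

    void projectRepository(String groupId, String repo) { project.put(groupId, repo); }

    void userAlter(String fromRepoOrStar, String toRepo) { userAlterations.put(fromRepoOrStar, toRepo); }

    Optional<String> resolve(String groupId) {
        // Layer 1: central is the source of pointers.
        String repo = central.get(groupId);
        // Layer 2: a project may only locate *missing* artifacts, never override.
        if (repo == null) {
            repo = project.get(groupId);
        }
        if (repo == null) {
            return Optional.empty();
        }
        // Layer 3: the user may alter routes (assumed: specific rule before "*").
        String altered = userAlterations.get(repo);
        if (altered == null) {
            altered = userAlterations.get("*");
        }
        return Optional.of(altered != null ? altered : repo);
    }
}
```

A "*" alteration pointing at a repository manager then behaves much like today's mirrorOf *, while the first two layers still decide which artifacts are known at all.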

As for local storage, which was mentioned in the requirements, I'm still in favour of this or similar: http://docs.codehaus.org/display/MAVEN/Local+repository+separation . The important part here is that metadata is separated from artifacts and local installations are only used when you intended them to be.

Anyway, just a starting point for discussion. If we can agree on some of the fundamentals, I'm sure we can build up a more complete solution.

Cheers,
Brett


