This has been a long time coming, and I apologize for the delay. I've
been trying to clear enough time to work on some last touches to the
implementation before writing this, to make sure the description agrees
with the code.
The code for this discussion is in the mirror-group-routing branch of
maven3:
https://svn.apache.org/repos/asf/maven/maven-3/branches/mirror-group-routing
The goal of this branch is to create a mechanism by which the Maven
repository can be fragmented, allowing artifact resolution to be routed
to appropriate repository URLs based on groupId and canonical repository
URL.
The cornerstone of this mapping is the routes file, which will have a
default copy hosted on Maven project infrastructure. The file format
(for now) is JSON. Users will be free to create their own routes file
and point Maven to that instead if they prefer. Alternatively, the
routes file can be generated by a repository manager or really any
application capable of hosting or generating the required JSON file.
So, what's in this routes file? It has two sections: groupId ->
canonical-URL mappings, and canonical-URL -> mirror-URL mappings. These
sections are separate to allow generators to pass through one set of
mappings while augmenting or replacing the other. For example, a
generator may not want to alter the groupId -> canonical-URL mappings,
but will probably want to generate a custom canonical-URL -> mirror-URL
map.
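To make the pass-through idea concrete, here is a minimal Java sketch of
how a generator might keep one section and replace the other. The class,
field, and method names are hypothetical and are not taken from the
branch; the real routes model may look quite different.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a routes file; names are illustrative only.
final class RoutesFile {
    // Section 1: groupId -> canonical repository URL
    final Map<String, String> canonicalByGroupId;
    // Section 2: canonical repository URL -> mirror URLs
    final Map<String, List<String>> mirrorsByCanonical;

    RoutesFile(Map<String, String> canonicalByGroupId,
               Map<String, List<String>> mirrorsByCanonical) {
        // Defensive copies so the two sections stay independent of the caller's maps.
        this.canonicalByGroupId = new LinkedHashMap<>(canonicalByGroupId);
        this.mirrorsByCanonical = new LinkedHashMap<>(mirrorsByCanonical);
    }

    // What a generator (for example a repository manager) might do:
    // pass the groupId section through unchanged and substitute its own mirrors.
    RoutesFile withMirrors(Map<String, List<String>> customMirrors) {
        return new RoutesFile(canonicalByGroupId, customMirrors);
    }
}
```

Keeping the sections as two separate maps is what makes this kind of
substitution a one-liner for the generator.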
To begin with, each groupId can have a canonical repository URL attached
to it in the routing table. This is meant to be the main Maven
repository that hosts artifacts for that groupId. Many open-source
projects will probably list something like:
http://oss.sonatype.org/content/groups/public
as the canonical URL for their groupId, since many open-source projects
use Sonatype OSS to deploy their artifacts.
Given a canonical repository URL for an artifact groupId, the routing
table then matches this canonical URL up to one or more mirror URLs. In
the example above, the default routing table JSON would probably specify
something like (using very rough pseudo-JSON):
"http://oss.sonatype.org/content/groups/public": [
  { "id": "central", "url": "http://repo1.maven.org/maven2/" },
  { "id": "ibiblio", "url": "http://mirrors.ibiblio.org/pub/mirrors/maven2/" }
]
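The two-step lookup described above (groupId to canonical URL, then
canonical URL to mirrors) can be sketched roughly as follows. This is
only an illustration: RouteLookup and its methods are hypothetical names,
the sketch assumes exact groupId matches, and it assumes that falling
back to the canonical URL is the right behavior when no mirror is
registered. None of that is confirmed by the branch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical lookup helper; not the router class from the branch.
final class RouteLookup {
    private final Map<String, String> canonicalByGroupId = new HashMap<>();
    private final Map<String, List<String>> mirrorsByCanonical = new HashMap<>();

    void addGroup(String groupId, String canonicalUrl) {
        canonicalByGroupId.put(groupId, canonicalUrl);
    }

    void addMirrors(String canonicalUrl, List<String> mirrorUrls) {
        mirrorsByCanonical.put(canonicalUrl, new ArrayList<>(mirrorUrls));
    }

    // Returns the URLs to try for a groupId, in order.
    // Unknown groupId: empty list. Known groupId without mirrors: the
    // canonical URL itself (an assumption made for this sketch).
    List<String> candidateUrls(String groupId) {
        String canonical = canonicalByGroupId.get(groupId);
        if (canonical == null) {
            return Collections.emptyList();
        }
        List<String> mirrors = mirrorsByCanonical.get(canonical);
        if (mirrors == null || mirrors.isEmpty()) {
            return Collections.singletonList(canonical);
        }
        return Collections.unmodifiableList(mirrors);
    }
}
```

With the example table above, a groupId registered against the Sonatype
OSS URL would resolve to the central and ibiblio mirror URLs, in that
order.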
Ideally, I'd like to see the ASF Maven infrastructure host an
application that allows any OSS project to register its groupId and
canonical URL, and allows mirror operators to register which canonical
URLs they mirror. This would allow us to delegate administration to
those who have a vested interest in ensuring the correctness of that
metadata. This application could be something very lightweight, like a
CouchDB instance that hosts an HTML- and JS-based application...or, it
could be something more in line with the interests of our developers, if
needed. The point is it's just the routes file we need to host, and it
doesn't really matter how.
To support the demands of groupId and mirror routing, I've had to hack
into the aether-connector-wagon and wrap the WagonRepositoryConnector in
a class that can query the router for the correct URL before delegating
to the main WagonRepositoryConnector. Additionally, I've introduced
modifications and new logic to the MirrorSelector classes used by Maven
(yes, there are two: one in the old legacy code, and another in Aether)
in order to make them routing-aware.
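For anyone who hasn't looked at the branch yet, the wrapping approach
has roughly the following shape. The Connector and Router interfaces
here are simplified stand-ins invented for this sketch; the real Aether
connector API is considerably richer, and the class in the branch does
not necessarily look like this.

```java
// Simplified stand-in for a repository connector (hypothetical, not Aether's API).
interface Connector {
    String get(String repositoryUrl, String groupId, String artifactPath);
}

// Simplified stand-in for the router (hypothetical).
// Returns the URL to use instead of repositoryUrl, or null for "no route".
interface Router {
    String route(String repositoryUrl, String groupId);
}

// Wrapper that asks the router for the correct URL, then delegates
// the actual transfer to the wrapped connector.
final class RoutingConnector implements Connector {
    private final Router router;
    private final Connector delegate;

    RoutingConnector(Router router, Connector delegate) {
        this.router = router;
        this.delegate = delegate;
    }

    @Override
    public String get(String repositoryUrl, String groupId, String artifactPath) {
        String routed = router.route(repositoryUrl, groupId);
        // Fall back to the original URL when the router has no opinion.
        String effectiveUrl = (routed != null) ? routed : repositoryUrl;
        return delegate.get(effectiveUrl, groupId, artifactPath);
    }
}
```

The point of the wrapper is that the delegate stays untouched: it only
ever sees the URL that has already been routed.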
I'm also including a DNS TXT-record-based discovery strategy to
demonstrate the auto-discovery logic. DNS TXT is something I've been
playing around with for my home network, to automatically select my
local Archiva instance when I'm at home, and use the unmirrored
locations outside the house.
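As a rough illustration of DNS TXT discovery in general, the lookup can
be done with the JDK's built-in JNDI DNS provider. The record name
prefix ("_maven-routes.") and the "url=" key below are made up for this
sketch, as is the idea that the record carries a single URL; the format
used by the strategy in the branch is not described in this message.

```java
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical discovery helper; record name and key format are assumptions.
final class DnsTxtDiscovery {
    static final String RECORD_PREFIX = "_maven-routes.";
    static final String KEY = "url=";

    // Extracts the URL from one TXT value, or returns null if the value
    // does not use the (assumed) "url=..." format.
    static String parseTxt(String txt) {
        String value = txt.trim();
        // TXT values may come back wrapped in double quotes.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value.startsWith(KEY) ? value.substring(KEY.length()) : null;
    }

    // Queries the TXT records for RECORD_PREFIX + domain using the default
    // resolvers. A missing record surfaces as a NamingException.
    static String discover(String domain) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attributes attrs = ctx.getAttributes(RECORD_PREFIX + domain, new String[] {"TXT"});
            Attribute txt = attrs.get("TXT");
            if (txt == null) {
                return null;
            }
            for (int i = 0; i < txt.size(); i++) {
                String url = parseTxt(String.valueOf(txt.get(i)));
                if (url != null) {
                    return url;
                }
            }
            return null;
        } finally {
            ctx.close();
        }
    }
}
```

Because the record is looked up against whatever resolver the machine is
currently using, the same configuration can yield a local URL at home and
nothing at all elsewhere, which is the behavior described above.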
I think the implementation is getting pretty close. I'm currently having
trouble with some sort of race condition that's showing up in the ITs.
I'm still going through it, but pinning it down is likely to be slow
going for a while, especially since concurrent programming isn't something
I'm an expert at (yet). If anyone has an inclination to take a look and see
if they can figure out why ITs are failing intermittently, I'd welcome
the help.
If there's anything in here that doesn't make sense, please let me know.
--
John Casey
Developer, PMC Member - Apache Maven (http://maven.apache.org)
Blog: http://www.johnofalltrades.name/