This has been a long time coming, and I apologize for the delay. I've been trying to clear enough time to work on some last touches to the implementation before writing this, to make sure the description agrees with the code.

The code for this discussion is in the mirror-group-routing branch of maven3:

https://svn.apache.org/repos/asf/maven/maven-3/branches/mirror-group-routing

The goal of this branch is to create a mechanism by which the Maven repository can be fragmented, allowing artifact resolution to be routed to appropriate repository URLs based on groupId and canonical repository URL.

The cornerstone of this mapping is the routes file, which will have a default copy hosted on Maven project infrastructure. The file format (for now) is JSON. Users will be free to create their own routes file and point Maven to that instead if they prefer. Alternatively, the routes file can be generated by a repository manager or really any application capable of hosting or generating the required JSON file.

So, what's in this routes file? It has two sections: groupId -> canonical-URL mappings, and canonical-URL -> mirror-URL mappings. These sections are separate to allow generators to pass through one set of mappings while augmenting or replacing the other. For example, a generator may not want to alter the groupId -> canonical-URL mappings, but will probably want to generate a custom canonical-URL -> mirror-URL* map.
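To make the pass-through case concrete, here is a minimal Python sketch of a generator that leaves the groupId mappings untouched and swaps in its own mirror map. The key names "group_routes" and "mirrors", the groupId, and the internal host are assumptions for illustration only; the actual file layout in the branch may differ.

```python
import json


def regenerate_routes(upstream_json, local_mirrors):
    """Pass the groupId section through unchanged; replace the mirror section.

    The key names "group_routes" and "mirrors" are hypothetical, not the
    format used in the branch.
    """
    upstream = json.loads(upstream_json)
    return {
        "group_routes": upstream.get("group_routes", {}),
        "mirrors": local_mirrors,
    }


# Stand-in for the default routes file hosted upstream.
upstream_json = json.dumps({
    "group_routes": {
        "org.example": "http://oss.sonatype.org/content/groups/public",
    },
    "mirrors": {
        "http://oss.sonatype.org/content/groups/public": [
            {"id": "central", "url": "http://repo1.maven.org/maven2/"},
        ],
    },
})

# A repository manager on a hypothetical internal host mirrors the same canonical URL.
local_mirrors = {
    "http://oss.sonatype.org/content/groups/public": [
        {"id": "internal", "url": "http://repo.example.internal/public/"},
    ],
}

routes = regenerate_routes(upstream_json, local_mirrors)
print(json.dumps(routes, indent=2))
```

Keeping the two sections as separate top-level keys is what lets the generator replace one wholesale without parsing or rewriting the other.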

To begin with, each groupId can have a canonical repository URL attached to it in the routing table. This is meant to be the main Maven repository that hosts artifacts for that groupId. Many open-source projects will probably list something like:

http://oss.sonatype.org/content/groups/public

as the canonical URL for their groupId, since many open-source projects use Sonatype OSS to deploy their artifacts.

Given a canonical repository URL for an artifact groupId, the routing table then matches this canonical URL up to one or more mirror URLs. In the example above, the default routing table JSON would probably specify something like (using very rough pseudo-JSON):

"http://oss.sonatype.org/content/groups/public": [
    { "id": "central", "url": "http://repo1.maven.org/maven2/" },
    { "id": "ibiblio", "url": "http://mirrors.ibiblio.org/pub/mirrors/maven2/" }
]
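As a rough illustration of the two-step lookup (groupId to canonical URL, then canonical URL to mirrors), here is a small Python sketch. It assumes exact groupId matching and assumes that mirrors are preferred over the canonical URL when any are listed; neither choice is confirmed by the branch, and the function name and the "org.example" groupId are made up for the example.

```python
def resolve_urls(group_id, group_routes, mirrors):
    """Return candidate repository URLs for a groupId.

    Assumptions (not taken from the branch): groupIds match exactly, and
    mirror URLs are used in place of the canonical URL when any exist.
    """
    canonical = group_routes.get(group_id)
    if canonical is None:
        return []  # no route known; the caller would use its normal repositories
    mirror_entries = mirrors.get(canonical, [])
    if not mirror_entries:
        return [canonical]
    return [entry["url"] for entry in mirror_entries]


group_routes = {
    "org.example": "http://oss.sonatype.org/content/groups/public",
}
mirrors = {
    "http://oss.sonatype.org/content/groups/public": [
        {"id": "central", "url": "http://repo1.maven.org/maven2/"},
        {"id": "ibiblio", "url": "http://mirrors.ibiblio.org/pub/mirrors/maven2/"},
    ],
}

print(resolve_urls("org.example", group_routes, mirrors))
print(resolve_urls("com.unknown", group_routes, mirrors))
```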

Ideally, I'd like to see the ASF Maven infrastructure host an application that allows any OSS project to register their groupId and canonical URL, and allows mirror operators to register which canonical URLs they mirror. This would allow us to delegate administration to those who have a vested interest in ensuring the correctness of that metadata. This application could be something very lightweight, like a CouchDB instance that hosts an HTML- and JS-based application, or it could be something more in line with the interests of our developers, if needed. The point is that the routes file is the only thing we need to host, and it doesn't really matter how.

To support the demands of groupId and mirror routing, I've had to hack into the aether-connector-wagon and wrap the WagonRepositoryConnector in a class that can query the router for the correct URL before delegating to the main WagonRepositoryConnector. Additionally, I've introduced modifications and new logic to the MirrorSelector classes used by Maven (yes, there are two: one in the old legacy code, and another in Aether) in order to make them routing-aware.
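The wrapping approach is essentially a decorator: ask the router for a URL first, then hand off to the normal connector. The Python sketch below shows only that shape. It is not the Aether API; RoutingConnector, StaticRouter, FakeConnector, and their methods are all hypothetical stand-ins, and the real WagonRepositoryConnector has a different interface.

```python
class FakeConnector:
    """Stand-in for the real connector: returns the URL it would fetch."""

    def __init__(self, base_url):
        self.base_url = base_url

    def get(self, path):
        return self.base_url.rstrip("/") + "/" + path.lstrip("/")


class StaticRouter:
    """Hypothetical router backed by a groupId -> URL dictionary."""

    def __init__(self, routes):
        self._routes = routes

    def select_url(self, group_id, default_url):
        return self._routes.get(group_id, default_url)


class RoutingConnector:
    """Queries the router for a URL, then delegates to a plain connector."""

    def __init__(self, router, default_url, connector_factory):
        self._router = router
        self._default_url = default_url
        self._connector_factory = connector_factory

    def get(self, group_id, path):
        url = self._router.select_url(group_id, self._default_url)
        delegate = self._connector_factory(url)
        return delegate.get(path)


router = StaticRouter({"org.example": "http://repo1.maven.org/maven2/"})
connector = RoutingConnector(
    router,
    default_url="http://oss.sonatype.org/content/groups/public",
    connector_factory=FakeConnector,
)

routed = connector.get("org.example", "org/example/demo/1.0/demo-1.0.pom")
unrouted = connector.get("com.unknown", "com/unknown/lib/2.0/lib-2.0.pom")
print(routed)
print(unrouted)
```

Because the wrapper only chooses the URL, the delegate connector needs no knowledge of routing at all.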

I'm also including a DNS TXT-record-based discovery strategy, to demonstrate the auto-discovery logic. DNS TXT is something I've been playing around with for my home network, to automatically select my local Archiva instance when I'm at home, and use the unmirrored locations outside the house.
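For anyone unfamiliar with the idea, discovery via DNS TXT could look roughly like the Python sketch below. The record name "_maven-routes", the "routes=" value prefix, and the example hosts are invented for illustration; the format the branch actually uses may be quite different. The DNS query itself is injected as a function so the sketch does not depend on any particular resolver library.

```python
def discover_routes_url(domain, txt_lookup, default_url):
    """Pick a routes-file URL from DNS TXT records, or fall back to a default.

    txt_lookup is any callable that takes a record name and returns a list
    of TXT strings. The "_maven-routes" label and "routes=" prefix are
    hypothetical conventions, not something defined by the branch.
    """
    record_name = "_maven-routes." + domain
    for value in txt_lookup(record_name):
        if value.startswith("routes="):
            return value[len("routes="):]
    return default_url


# Fake resolver: at home a TXT record points at a local repository manager.
home_records = {
    "_maven-routes.home.example": ["routes=http://archiva.home.example/routes.json"],
}


def fake_lookup(name):
    return home_records.get(name, [])


default = "http://maven.example.org/routes.json"
at_home = discover_routes_url("home.example", fake_lookup, default)
elsewhere = discover_routes_url("cafe.example", fake_lookup, default)
print(at_home)
print(elsewhere)
```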

I think the implementation is getting pretty close. I'm currently having trouble with some sort of race condition that's showing up in the ITs. I'm still going through it, but it's likely to be slow going for a while to pin it down, especially since concurrent programming isn't something I'm an expert at (yet). If anyone has an inclination to take a look and see if they can figure out why the ITs are failing intermittently, I'd welcome the help.

If there's anything in here that doesn't make sense

--
John Casey
Developer, PMC Member - Apache Maven (http://maven.apache.org)
Blog: http://www.johnofalltrades.name/
