Perhaps this already exists somewhere. If so, please point me in the right direction.

I want to know what the most popular dependencies are, based not on downloads but on how many other projects depend on them. I want to explore the full dependency graph and see how it evolves over time (for instance, how fast new versions of artifacts are adopted). I want to create a visual representation of all the dependencies, just because it would look cool. In general, I want total access to all the metadata (essentially the pom files) in the Maven Central repo, so I can see how the world's software fits together on a 'global' scale. Eventually I would like to explore the jar artifacts as well, to get deeper insight into which methods and classes are being referenced, but that is phase 2. :)

From googling around, it appears that, understandably, it is improper to simply wget the entire repo. However, there don't seem to be any publicly available torrents or other resources that would give me access to this data.

http://search.maven.org/#stats

457GB is a lot of data, but it isn't an unimaginable amount, and most of that is no doubt the artifacts, not the metadata (pom files).

So I really have two questions:

1. What is the easiest path to getting rsync-type access to the full repo? (I'd quite understand if I needed to pay a fee for this level of access.)
2. Failing that, what would be a legitimate way of getting just the pom files?

Basically I want to be a good guy and not put undue load on the servers, but at the same time I really want the data.

Thanks,
Matt Taylor
http://blog.matthewjosephtaylor.com
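
P.S. To make the "popular by dependents, not downloads" idea concrete, here is a minimal sketch in Python of the counting involved, assuming the pom files are already available locally as XML strings. The function names (direct_dependencies, count_dependents) are made up for illustration. The sketch only looks at the top-level <dependencies> block of each pom, ignores <dependencyManagement> and plugin dependencies, and does not resolve parent poms or ${...} properties, so treat it as a rough starting point rather than a faithful model of Maven's resolution.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def _local(tag):
    """Strip an XML namespace prefix such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def direct_dependencies(pom_xml):
    """Return the set of (groupId, artifactId) pairs declared directly in one pom.

    Hypothetical helper: only the top-level <dependencies> element is read;
    <dependencyManagement>, plugins, parents and properties are ignored.
    """
    root = ET.fromstring(pom_xml)
    deps = set()
    for child in root:
        if _local(child.tag) != "dependencies":
            continue
        for dep in child:
            fields = {_local(e.tag): (e.text or "").strip() for e in dep}
            if fields.get("groupId") and fields.get("artifactId"):
                deps.add((fields["groupId"], fields["artifactId"]))
    return deps


def count_dependents(poms):
    """Count, for each dependency, how many of the given poms declare it.

    Each pom contributes at most one count per dependency, so the result is
    the number of dependents (in-degree in the graph), not a download count.
    """
    counts = Counter()
    for pom_xml in poms:
        counts.update(direct_dependencies(pom_xml))
    return counts
```

Calling count_dependents(list_of_pom_strings).most_common(20) would then list the twenty most depended-upon artifacts in whatever sample of poms was supplied.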