Re: How to get access to ALL the data in maven central?

2012-04-10 Thread Brian Fox
Make a request here and I can attach the poms for you: https://issues.sonatype.org/browse/MVNCENTRAL

On Tue, Apr 10, 2012 at 1:17 PM, Wayne Fay wrote:
> > If you wanted to scrape Maven Central for just the poms then I'd
> > contact Sonatype who manage the central repository.
>
> As Barrie said,

Re: How to get access to ALL the data in maven central?

2012-04-10 Thread Wayne Fay
> If you wanted to scrape Maven Central for just the poms then I'd
> contact Sonatype who manage the central repository.

As Barrie said, you could talk to Sonatype (Brian specifically) since they operate the Maven Central repo and they might be able to make a zip file available that would be the r

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Matt Taylor
Actually, I think the jars are going to tell me quite a bit. By looking into the class files I should be able to link not only which dependencies are used by which projects, but also which methods/classes are used within each dependency. I can then for instance create a

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Barrie Treloar
On Tue, Apr 10, 2012 at 3:42 PM, Matt Taylor wrote:
> Answered my own question to a degree.  For the benefit of the group here is
> how to do it:
>
> rsync -a -v --include '*/' --include '*.pom' --include '*.xml' --exclude '*'
> --bwlimit=1000 mirrors.ibiblio.org::maven2/ maven2
>
> That will retrieve all

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Matt Taylor
Answered my own question to a degree. For the benefit of the group, here is how to do it (the patterns are quoted so the shell does not expand them before rsync sees them):

    rsync -a -v --include '*/' --include '*.pom' --include '*.xml' --exclude '*' --bwlimit=1000 mirrors.ibiblio.org::maven2/ maven2

That will retrieve all of the pom and xml metadata files for the maven central reposito
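
For anyone scripting this, here is a minimal Python sketch of the same rsync invocation. It only builds the argument list; passing a list instead of a shell string means the `*` patterns reach rsync untouched. The function name and its parameters are hypothetical, and the mirror host is simply the one from the message above, which may no longer be reachable.

```python
import shlex


def build_pom_rsync_command(source="mirrors.ibiblio.org::maven2/",
                            dest="maven2",
                            bwlimit=1000):
    """Return an rsync argument list that mirrors only POM and XML files.

    Filter rules are evaluated in order, first match wins:
      --include '*/'    descend into every directory
      --include '*.pom' keep POM files
      --include '*.xml' keep XML metadata files
      --exclude '*'     skip everything else (jars, checksums, ...)
    """
    return [
        "rsync", "-a", "-v",
        "--include", "*/",
        "--include", "*.pom",
        "--include", "*.xml",
        "--exclude", "*",
        f"--bwlimit={bwlimit}",  # throttle transfer, same value as in the message
        source,
        dest,
    ]


def as_shell_string(cmd):
    """Render the argument list as a safely quoted shell command line."""
    return shlex.join(cmd)
```

To run it for real, hand the list to `subprocess.run(build_pom_rsync_command(), check=True)`; `as_shell_string` is only there for printing or logging the command.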

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Matt Taylor
Lol, Ron has valid points but I am indeed going forward (and have only myself to blame). Agreed, I just need the pom files, which are much smaller, but that is still a lot of hits on the web server. If anyone knows of a 'nice' way of getting just the pom files that would be good enough for the moment.

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Matt Taylor
I agree it is definitely going to be imperfect and it will in the end only be a sampling of the real usage, but I think it will still provide interesting information. As for bogus conclusions reached by others: I plan on putting some effort into explaining what the results are, what the

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Barrie Treloar
On Tue, Apr 10, 2012 at 12:31 PM, Ron Wheeler wrote:
> You are going to be missing the key ingredient which is the application POMs
> that tell you what artifacts are actually used.
>
> You might get some interesting information about things like log4j which is
> probably used by lots of things in

Re: How to get access to ALL the data in maven central?

2012-04-09 Thread Ron Wheeler
You are going to be missing the key ingredient which is the application POMs that tell you what artifacts are actually used. You might get some interesting information about things like log4j which is probably used by lots of things inside Maven Central. You will be grossly misled about the use

How to get access to ALL the data in maven central?

2012-04-09 Thread Matt Taylor
Perhaps this already exists somewhere. If so, please point me in the right direction. I want to know what the most popular dependencies are, not based on downloads, but based on dependencies from other projects. I want to explore the full dependency graph and see its evolution over 'time'
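
As a rough illustration of the "popularity by dependents" idea, here is a minimal Python sketch that counts how many POMs directly declare each groupId:artifactId pair. The function names are hypothetical. It assumes the POM contents are already available as strings, and it deliberately ignores parent inheritance, dependencyManagement, property interpolation, scopes and versions, so every version of a project counts as a separate dependent.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def _local(tag):
    """Strip an XML namespace such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def declared_dependencies(pom_xml):
    """Return the set of (groupId, artifactId) pairs declared directly
    under <project><dependencies> in one POM."""
    root = ET.fromstring(pom_xml)
    deps = set()
    for section in root:
        if _local(section.tag) != "dependencies":
            continue
        for dep in section:
            if _local(dep.tag) != "dependency":
                continue
            fields = {_local(e.tag): (e.text or "").strip() for e in dep}
            if fields.get("groupId") and fields.get("artifactId"):
                deps.add((fields["groupId"], fields["artifactId"]))
    return deps


def count_dependents(pom_xmls):
    """Count, for each (groupId, artifactId), how many POMs declare it."""
    counts = Counter()
    for pom_xml in pom_xmls:
        counts.update(declared_dependencies(pom_xml))
    return counts
```

Feeding it the text of each downloaded POM and calling `count_dependents(...).most_common(20)` would give a first ranking; building the full graph over time would additionally need the version and the declaring project's own coordinates.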