On Sun, Apr 05, 2009 at 12:07:03AM -0400, David Golden wrote:

> * The Metabase for CT 2.0 is organized around
> AUTHOR/DISTNAME-VERSION.ARCHIVE (i.e. author + tarball) as the only
> real unique ID on CPAN.
>
> That's easy enough to fix going forward, but it makes importing
> history difficult -- and it even makes testing the Metabase difficult
> as I have to shave yaks in CPAN::Reporter and Test::Reporter to pass
> the full author/tarball path
>
> My thought: get a full list of all tarballs on backpan create a
> mapping table -- hopefully, there are few cases of duplicate
> distname-version.

In fact I don't believe there are any. I certainly didn't notice mysql
scream about duplicate primary keys as I imported them into my database
for the CPobsoleteAN. And don't forget zip files.

> Q1: does that exist or could it be produced easily?
> Q2: any thoughts on how that could be either kept up to date or
> web-queryable for ongoing mapping of "version 1" reports as they are
> produced?

Importing the metadata for the current backpan is time-consuming but
simple. Keeping it up to date is simply a case of running the same
script over the backpan every $time_period and ignoring anything that's
already in the database.

Not sure how much of my code will be relevant, but ...

  http://www.cantrell.org.uk/cgit/cgit.cgi/cpxxxan/

-- 
David Cantrell | http://www.cantrell.org.uk/david

  Erudite is when you make a classical allusion to a feather.
  Kinky is when you use the whole chicken.
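
A minimal sketch of the "run the same script again and ignore anything
that's already in the database" approach described in the reply. It is
written in Python with sqlite3 standing in for the MySQL database that
was mentioned. The table name dist_map, the helpers split_path and
import_paths, the list of archive suffixes, and the sample paths are all
hypothetical and are not taken from the cpxxxan code. The only idea
illustrated is that a primary key on distname-version, combined with an
insert that skips existing keys, makes the import safe to rerun.

```python
import sqlite3

# Archive suffixes to strip (an assumed list); .zip is included because
# not every upload is a tarball.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def split_path(path):
    """Split '.../AUTHOR/Dist-1.0.tar.gz' into (distname-version, author, archive).

    Returns None for files that are not archives or have no author directory.
    """
    parts = path.split("/")
    if len(parts) < 2:
        return None
    filename = parts[-1]
    for suffix in ARCHIVE_SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)], parts[-2], filename
    return None


def import_paths(conn, paths):
    """Insert rows for unseen distname-version keys; return how many were new."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dist_map ("
        " distvname TEXT PRIMARY KEY,"
        " author TEXT NOT NULL,"
        " archive TEXT NOT NULL)"
    )
    before = conn.total_changes
    for path in paths:
        row = split_path(path)
        if row is None:
            continue
        # Rows whose primary key already exists are silently skipped.
        conn.execute("INSERT OR IGNORE INTO dist_map VALUES (?, ?, ?)", row)
    conn.commit()
    return conn.total_changes - before


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Made-up paths in the usual authors/id/... layout.
    listing = [
        "authors/id/A/AU/AUTHORA/Foo-Bar-1.23.tar.gz",
        "authors/id/A/AU/AUTHORB/Baz-0.01.zip",
    ]
    print(import_paths(conn, listing))  # first run inserts both rows
    print(import_paths(conn, listing))  # second run finds nothing new
```

With a real database the same effect could come from whatever
"insert unless the key exists" form that database offers; a genuine
duplicate distname-version from two authors would be skipped silently
here, so a production script would probably want to log those cases
instead.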
