On Sat, Dec 15, 2012 at 11:59 PM, Michael G Schwern <schw...@pobox.com> wrote:
> We have a lot of serious problems because we lack a database of installed
> distributions, releases and files. There are serious problems with
> implementing one given A) the limitations of the standard Perl install and B)
> wedging it into existing systems. But I think I have a solution. It's similar
> to how meta data was slipped into the ecosystem without requiring authors to
> rewrite their releases or install a bunch of extra modules. It just happens
> as part of the normal CPAN module upgrade process.
>
> I've been thinking that a minimal package database could be created by putting
> some hooks into ExtUtils::Install::install(), which every Perl build system
> ultimately uses, to record what gets installed. That way when
> ExtUtils::Install is upgraded, the user gets a build database without
> upgrading everything else.
>
> This would be a fairly straightforward process at install time...
>
> 1) Copy everything to a temp directory
> 2) Record everything in that temp directory
> 3) Copy everything from temp into the real location
>
> You could probably optimize this by skipping the copy to temp and just have
> install() record stuff as it goes by, but this is the dumb, simple, robust way
> to do it.
>
> Storage is a problem. The only reliable "database" Perl ships with is DBM, an
> on-disk hash, so we can't get too fancy. It might take several DBM files, but
> this is enough to record information and do simple queries. What are those
> queries?
>
> * What version of the database is this?
> * What distributions are installed?
> * What release of a distribution is installed?
> * What files are in that release?
> * What version is that release?
> * What location was a release installed into? (core, vendor, site, custom)
> * What are the checksums of those files?
>
> And the basic operations we need to support.
>
> * Add a release (i.e. install).
> * Delete a release (and its files).
> * Delete an older version of a release (as part of install).
> * Delete an older version of a release, only if it's in the same release
>   location. This is so CPAN installs don't delete vendor-installed modules.
> * Verify the files of a release.
> * List distributions/releases installed.
>
> It would also store the MYMETA data which gives us a lot of information (such
> as dependencies) for free.
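
To make the queries and operations listed above a bit more concrete, here is a
rough sketch of how small such a database could be. It is written in Python
purely for illustration; a real implementation would be Perl code inside
ExtUtils::Install. Everything named here (`InstallDB`, `add_release`,
`verify`, the `release:<location>:<dist>` key layout, the `schema_version`
key) is hypothetical, and keeping each record as JSON text in a single DBM
file is an assumption of the sketch, not something the proposal specifies.

```python
import dbm
import hashlib
import json
import os
import shutil

SCHEMA_VERSION = "1"  # hypothetical layout version, for later upgrades


def checksum(path):
    """Hex SHA-256 of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


class InstallDB:
    """Hypothetical install database: a single DBM file holding JSON values."""

    def __init__(self, path):
        self.db = dbm.open(path, "c")  # "c": create the file if it is missing
        if "schema_version" not in self.db:
            self.db["schema_version"] = SCHEMA_VERSION

    @staticmethod
    def _key(location, dist):
        # One record per (location, distribution), so a site install never
        # touches the record of a vendor install of the same distribution.
        return f"release:{location}:{dist}"

    def get(self, location, dist):
        key = self._key(location, dist)
        return json.loads(self.db[key]) if key in self.db else None

    def add_release(self, dist, version, location, staged, dest, meta=None):
        """Record every file under `staged`, then copy the tree into `dest`."""
        files = {}
        for root, _dirs, names in os.walk(staged):
            for name in names:
                src = os.path.join(root, name)
                files[os.path.relpath(src, staged)] = checksum(src)
        old = self.get(location, dist)
        if old:  # drop files the older release owned and the new one lacks
            for rel in set(old["files"]) - set(files):
                stale = os.path.join(old["dest"], rel)
                if os.path.exists(stale):
                    os.remove(stale)
        for rel in files:
            target = os.path.join(dest, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(os.path.join(staged, rel), target)
        record = {"version": version, "dest": dest, "files": files,
                  "meta": meta or {}}  # `meta` stands in for MYMETA data
        self.db[self._key(location, dist)] = json.dumps(record)

    def delete_release(self, location, dist):
        """Remove a release's files and its record; False if not recorded."""
        record = self.get(location, dist)
        if record is None:
            return False
        for rel in record["files"]:
            path = os.path.join(record["dest"], rel)
            if os.path.exists(path):
                os.remove(path)
        del self.db[self._key(location, dist)]
        return True

    def verify(self, location, dist):
        """Return the relative paths that are missing or have changed."""
        record = self.get(location, dist)
        bad = []
        for rel, digest in record["files"].items():
            path = os.path.join(record["dest"], rel)
            if not os.path.exists(path) or checksum(path) != digest:
                bad.append(rel)
        return sorted(bad)

    def releases(self):
        """List (location, distribution, version) for every recorded release."""
        out = []
        for raw in self.db.keys():
            key = raw.decode("utf-8")
            if key.startswith("release:"):
                _, location, dist = key.split(":", 2)
                out.append((location, dist, json.loads(self.db[raw])["version"]))
        return sorted(out)
```

The `staged` directory plays the role of the temp directory in the three-step
install: files are recorded there first and only then copied to the real
location. Keying records by location is one possible way to get the "only
delete an older version in the same location" behaviour; it is a design guess,
not a requirement.
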

I can agree with all of that. Actually, starting a discussion about this
was on my to-do list for the last QA hackathon, but I didn't get around to
it. Ideally, it should replace not only packlists but also perllocal.pod.

> This is all totally doable, and efficient enough, with a small pile of DBM
> files and Storable. Where to put the database is a bit more complicated, see
> the list of open problems below.

Given that Storable's format isn't forward-compatible, something more
stable such as JSON would be more appropriate.

> There's lots and lots and lots of additional information which could be stored
> and queries and operations to allow, but if we can get the basics working
> it'll allow a heap of new solutions. And I think this is a SMOP.
>
>
> Future possibilities include...
>
> * Auto-upgrade to SQLite if ExtUtils::Install::DB::SQLite is installed.
>
> If a special module is installed we can offer SQLite support (or whatever) for
> a more advanced database. At install time it would copy the existing DBM
> system into its own database.
>
> In general, more functionality can be added as more optional (or bundled)
> dependencies are available to the system. Through it all the basic DBM
> database would continue to be redundantly maintained to provide a fallback
> should those optional modules break or go away.

Having a proper database would be really nice, but I'm not sure it's going
to be worth the hassle if we have a robust system already.

> * Upgrading the database.
>
> I'd like to put some thought into how things are laid out initially to avoid a
> lot of major revisions, and thought into what information should be recorded
> so it's available later, but eventually we're going to want to change the
> "schema", such as it is with DBM files.
>
> I figure this can happen as part of upgrading ExtUtils::Install. It checks
> what version of the database you have and performs the necessary transforms to
> bring it up to the current version.
> We know how to do this, just have to keep
> it in mind and remember to implement it.
>
> * Where to put the database? What about non-standard install locations?
>
> $Config{archlib} would seem the obvious location, but it presents a
> permissions problem. If a non-root user installs into their home
> directory, you don't want them needing root to write to the installation
> database. There are several ways to deal with this.
>
> One is to simply not record non-standard install locations, but this loses
> data and punishes all those local::lib users out there.
>
> Another is to have a separate install database for non-standard install
> locations. This makes sense to me, but it brings in the sticky problem
> of having to merge install databases. Sticky, but still a SMOP. Once you
> have to implement merging anyway, it now makes sense to have an install
> database for each install location. One for core. One for vendor. One for
> perl. And one for each custom location. This has a lot of advantages to
> better fit how Perl layers module installs.
>
> * allows separation of permissions
> * allows queries of what's installed based on what's in @INC
>
> That second one is important. When a normal user queries the database, they
> want to get what's installed in the standard library location. When a
> local::lib user queries the database, they want to get what's installed in the
> standard library locations AND their own local lib.

The combination of these is problematic. You might upgrade EU::Install in
your local module path, but not have write permissions on the system paths.
In practice, we might have to support all our older versions :-|

> Not perfect, but gets us off the ground. It's not a great database, but it
> does the important job of recording the critical install-time data for later
> use. It's implementable within the current system. It doesn't require a bunch
> of dependencies, just one upgrade. It works with most existing module
> releases.
> It solves a major design problem with the Perl module system.
>
> I think it's a Simple(?!) Matter Of Programming in ExtUtils::Install to get it
> off the ground. IMO the most important bit of coordination is putting some
> thought into what the basic database should look like so we don't have to
> worry about complicated upgrades later.

I'm not sure it's as simple as you make it sound, but it is a good idea
nonetheless.

Leon