Hi!

On Thu, 2015-09-03 at 13:26:12 -0700, Josh Triplett wrote:
> Jonas Smedegaard wrote:
> > Seems Osamu Aoki is working on at least part of the puzzle:
> > https://bugs.debian.org/797045
>
> Merging multiple sources *really* shouldn't be necessary. And the
> metadata for those sources will vary, so that likely won't save that
> much space.

Well, there seem to be different kinds of overhead when it comes to
extremely tiny packages (those with dozens or hundreds of lines of
code). Metadata is one; the number of packages in the distribution and
on installed systems, and the number of files on the mirrors, is
another.

All of the above involve, in one way or another, some overhead in at
least the number or size of source packages, binary packages, Sources
and Packages indices and package manager databases, and possibly
increased dependency complexity, disk usage after installation, inodes
used on mirrors or installed systems, the number of source VCS
repositories, etc.

This can have a cost on the mirror network and the buildds; on any team
doing distribution-wide work, such as the ftp-masters, release, porter,
QA or reproducible builds teams; on tools like lintian, autopkgtest,
DUCK, VCS or watch checkers, britney, botch, etc.; and on maintainers
having to maintain hundreds of similar tiny packages.

Doing package collections in Debian might reduce part of the above
overhead, but *if* this needs fixing, ideally it should be fixed
upstream. Having to package 100 new upstream releases instead of one is
significant work, and that cannot be easily skipped if upstreams do not
do the conglomeration themselves.

> Perhaps we should add a few more things to common-licenses, or figure
> out if our packaging metadata could be further reduced or de-duplicated.
> It should be possible to package a 1kB library without several kB of
> overhead.

While there are certain things we could do to reduce overhead in some
places, I don't think we can easily reduce most of it. For example,
each source and binary package contains a changelog, and that's usually
what takes the most space. Even if we went with my proposal to store
that and the copyright files in the dpkg database, that might only
reduce some overhead on installed systems.

> But even if we have to pay that overhead, so be it; we have
> tens of thousands of packages already, what's a few hundred more tiny
> JavaScript packages as long as they're actually used?

If we were talking about a few hundred packages, I don't think anyone
would have much of an issue. I guess what people are worried about is
this setting a precedent and opening the floodgates. That's probably
one of the reasons people have not tried to inject much of CPAN or CRAN
or similar upstream archives into Debian, even though I don't think
those are as tiny as the ones proposed here, and most of it could be
automated, for example.

Thanks,
Guillem