On Wed, 2012-02-08 at 11:56:06 -0800, Russ Allbery wrote:
> Riku Voipio <riku.voi...@iki.fi> writes:
> > That is a major waste of space of having multiple copies of identical
> > files with different arch-qualified names. Is that really better
> > architecture to have multiple copies of identical files on user systems?
>
> Is it really, though? The files we're talking about are not generally
> large. I have a hard time seeing a case where the files would be large
> enough to cause any noticable issue and you wouldn't want to move them
> into a separate -common or -doc package anyway.

Exactly, in addition this is already an “issue” with lots of packages
(regardless of multi-arch) which do not use a common symlinked doc dir.

These are some numbers I'm getting on my system (w/ the attached quickly
hacked up script), all wild approximations, just to get a feel of it
(sizes are in bytes):

  Approx. installed m-a:same lib waste (w/o -dev,-doc): 20051501
  Approx. installed m-a:same lib waste (w/ -dev,-doc): 23310229
  Approx. installed m-a:same lib waste per package (23310229 / 293): 79557.09
  Approx. predicted lib waste per arch (779 * 79557.09): 61974973.11
  Approx. total lib waste per arch (4003 * 79557.09): 318467031.27

So, supposedly, if all the libs I have installed (779 packages) were to
be multiarchified, I'd waste about 60 MiB for each architecture I enable,
and only in case I wanted to have all of them installed for that
architecture. Which is not going to be the case. But if it was, and
60 MiB were such a problem, I could just as well use the
«dpkg --path-exclude» support.

Also I think there's probably some room for improvement which would
benefit non-multiarch installations too. For example TODO, USAGE and
lots of similar files should be moved to the -dev packages. AUTHORS,
THANKS and CREDITS files should probably be already represented in
copyright, etc. Probably a lintian warning could be introduced for this.

regards,
guillem

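For anyone who wants to redo the arithmetic behind the per-package and per-arch figures, here is a minimal sketch. It only re-derives the numbers quoted above and does not query dpkg. The helper `fmt` and the variable names ending in `_c` and `_t` are my own, purely for illustration, and it uses integer shell arithmetic in place of bc (64-bit arithmetic in the shell is assumed), working in hundredths so that results truncate the same way `scale=2` does.

```shell
#!/bin/sh
# Figures copied from the quoted output (bytes and package counts).
waste_same=23310229   # /usr/share bytes in installed m-a:same packages
inst_same=293         # installed m-a:same packages
inst_libs=779         # installed "Section: libs" packages

# Work in hundredths of a byte so that integer division truncates to
# two decimals, like "scale=2" in bc.
per_pkg_c=$(( waste_same * 100 / inst_same ))
predicted_c=$(( inst_libs * per_pkg_c ))

# Hypothetical helper: format a value in hundredths as a decimal number.
fmt() { printf '%d.%02d' $(( $1 / 100 )) $(( $1 % 100 )); }

per_pkg=$(fmt "$per_pkg_c")
predicted=$(fmt "$predicted_c")

# Tenths of a MiB (1 MiB = 1048576 bytes), truncated.
mib_t=$(( predicted_c / 100 * 10 / 1048576 ))
mib="$(( mib_t / 10 )).$(( mib_t % 10 ))"

echo "waste per package:  $per_pkg bytes"
echo "predicted per arch: $predicted bytes (about $mib MiB)"
```

The "60 MiB" in the text is this predicted per-arch value rounded up; the exact conversion comes out slightly lower.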
#!/bin/sh
# Rough estimate of the /usr/share space duplicated by Multi-Arch: same
# packages. Needs grep-status and grep-aptavail (dctrl-tools), GNU du, bc.

echo "List of files that might be candidates to be split out"
grep-status -n -sPackage -FMulti-Arch same | \
  grep -E -v -e '-(dev|doc)' | xargs dpkg -L | grep '/usr/share/' | \
  grep -E -v '(copyright|changelog|NEWS|README)' | \
  while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
  du -bsch --files0-from -

# Bytes under /usr/share in installed m-a:same packages, minus -dev and -doc.
waste_libs=$(grep-status -n -sPackage -FMulti-Arch same | \
  grep -E -v -e '-(dev|doc)' | xargs dpkg -L | grep '/usr/share/' | \
  while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
  du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/o -dev,-doc): $waste_libs"

# Same, but including the -dev and -doc packages.
waste_same=$(grep-status -n -sPackage -FMulti-Arch same | \
  xargs dpkg -L | grep '/usr/share/' | \
  while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
  du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/ -dev,-doc): $waste_same"

# Average per installed m-a:same package.
inst_same=$(grep-status -n -sPackage -FMulti-Arch same | wc -l)
waste_per_lib=$(echo "scale=2; $waste_same / $inst_same" | bc -l)
echo "Approx. installed m-a:same lib waste per package ($waste_same / $inst_same): $waste_per_lib"

# Extrapolate to every library package in the dpkg status file.
inst_libs=$(grep-status -n -r -sPackage -FSection libs | \
  grep -E -v '(common|data|-bin)' | wc -l)
waste_inst=$(echo "scale=2; $inst_libs * $waste_per_lib" | bc -l)
echo "Approx. predicted lib waste per arch ($inst_libs * $waste_per_lib): $waste_inst"

# Extrapolate to every library package available from apt.
total_libs=$(grep-aptavail -n -r -sPackage -FSection libs | \
  grep -E -v '(common|data|-bin)' | wc -l)
waste_total=$(echo "scale=2; $total_libs * $waste_per_lib" | bc -l)
echo "Approx. total lib waste per arch ($total_libs * $waste_per_lib): $waste_total"
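
The step the script relies on throughout is handing du a NUL-separated list of regular files on stdin. Here is that step on its own, as a small sketch against a throwaway directory instead of the dpkg database. It assumes GNU du (for `-b` and `--files0-from`); the temporary directory and file names are made up for the demonstration.

```shell
#!/bin/sh
# Build a throwaway tree with two files of known size (5 and 7 bytes).
tmp=$(mktemp -d)
printf 'hello' > "$tmp/a file"     # name contains a space on purpose
printf 'goodbye' > "$tmp/b%s"      # name contains a printf directive

# Emit each regular file NUL-terminated, then let du sum the apparent
# sizes; the last output line is the grand total.
total=$(for f in "$tmp"/*; do
          test -f "$f" && printf '%s\0' "$f"
        done | du -bc --files0-from=- | tail -n 1 | cut -f1)

echo "total bytes: $total"
rm -rf "$tmp"
```

Passing the name as an argument to a fixed `'%s\0'` format, rather than using the name itself as the format string, keeps names containing `%` or backslashes intact.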