Re: Please test gzip -9n - related to dpkg with multiarch support
Russ Allbery <r...@debian.org> writes:
> Goswin von Brederlow <goswin-...@web.de> writes:
>> Changing the name in the package would break tools that rely on the
>> name (like packages.debian.org extracting the Changelog). Also ugly.
>
> We control the tools; we can change the tools.  Multiarch is a big
> deal.  We weren't going to get through this without changing some
> tools.  (One should not read that as my support of this specific
> alternative, as I've not decided there yet, but in general I think it's
> fair game to change our tools to support multiarch.)
>
> One should not read that as my rejection of this specific alternative,
> as I've not decided there yet, but in general I think it's fair game to
> change our tools to support multiarch.

The problem I have is with the timeframe of such a change.  While we can
change packages.d.o quite quickly, given somebody willing to write the
patch, we cannot quickly change the tools in stable.  And I really,
really, really do not want to have to wait yet another release cycle.

MfG
        Goswin

-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/87lintzq9k.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek writes ("Re: Please test gzip -9n - related to dpkg with multiarch support"):
> And what about adding 700 packages vs. adding no packages at all, in
> the case of systems which aren't going to have multiarch enabled?

One thing that seems to have been overlooked in this discussion of
splitting is that splitting packages is not a completely neutral
operation, semantically.  The package is the unit of installation and
upgrade; dependencies do not prevent a package from being on the system
for a considerable time with its dependencies violated.

Or to put it another way: if currently

  libfoo1 (1.1)  contains and needs  /usr/share/libfoo1/foo-data-1.1
  libfoo1 (1.2)  contains and needs  /usr/share/libfoo2/foo-data-1.2

then splitting the foo-data out into

  libfoo1-data (1.1)  -depends-  libfoo1 (1.1)
  libfoo1-data (1.2)  -depends-  libfoo1 (1.2)

means that when the libfoo packages are upgraded, there will be a
substantial period when we have /usr/lib/libfoo1.so.1.2 installed and
the symlink libfoo1.so.1 points to it, but
/usr/share/libfoo2/foo-data-1.2 is missing.  If the packages are not
split, dpkg will unpack /usr/share/libfoo2/foo-data-1.2 before
overwriting the old libfoo1.so.1 symlink.

So normally, our current arrangements mean that shared libraries
continue to work throughout an upgrade.  Splitting shared data out like
this may make this no longer true (for some unknown set of packages).

This issue is very important for essential packages, but in general it's
not a good idea to introduce additional sources of skew.  For an
essential package the problem can be solved with a Pre-Depends, but the
result has to look like this:

  libfoo1-data1.1 (1.1)  -depends-  libfoo1 (1.1)
  libfoo1-data1.2 (1.2)  -depends-  libfoo1 (1.2)

So I think the refcounting in dpkg is the best option for these files.

Ian.

Archive: http://lists.debian.org/20280.65534.54525.328...@chiark.greenend.org.uk
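Ian's Pre-Depends variant could look roughly like the following
debian/control sketch.  This is only one reading of his notation, with
invented package names, not actual packaging: the data package carries
the version in its name so the 1.1 and 1.2 data can coexist during the
upgrade, and the library Pre-Depends on its matching data.

```
# Hypothetical sketch of the versioned-data split (names invented)

Package: libfoo1
Architecture: any
# Pre-Depends ensures the matching data is fully unpacked and configured
# before libfoo1 itself is unpacked, closing the skew window described
# above:
Pre-Depends: libfoo1-data1.2 (= ${source:Version})

Package: libfoo1-data1.2
# version-qualified name, so the 1.1 and 1.2 data packages can be
# installed side by side mid-upgrade
Architecture: all
```

The cost, as Ian implies, is a new binary package name for every
upstream version that changes the data, which is why he prefers the
refcounting approach.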
Re: Please test gzip -9n - related to dpkg with multiarch support
I wrote:
> Or to put it another way: if currently
>
>   libfoo1 (1.1)  contains and needs  /usr/share/libfoo1/foo-data-1.1
>   libfoo1 (1.2)  contains and needs  /usr/share/libfoo2/foo-data-1.2
>
> then splitting the foo-data out into
>
>   libfoo1-data (1.1)  -depends-  libfoo1 (1.1)
>   libfoo1-data (1.2)  -depends-  libfoo1 (1.2)
>
> means that when the libfoo packages are upgraded, there will be a
> substantial period when we have /usr/lib/libfoo1.so.1.2 installed and
> the symlink libfoo1.so.1 points to it, but
> /usr/share/libfoo2/foo-data-1.2 is missing.

... or vice versa, of course.  How long this situation persists will
vary.

Ian.

Archive: http://lists.debian.org/20281.91.957041.390...@chiark.greenend.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
Ian Jackson <ijack...@chiark.greenend.org.uk> writes:
> I wrote:
>> Or to put it another way: if currently
>>
>>   libfoo1 (1.1)  contains and needs  /usr/share/libfoo1/foo-data-1.1
>>   libfoo1 (1.2)  contains and needs  /usr/share/libfoo2/foo-data-1.2
>>
>> then splitting the foo-data out into
>>
>>   libfoo1-data (1.1)  -depends-  libfoo1 (1.1)
>>   libfoo1-data (1.2)  -depends-  libfoo1 (1.2)
>>
>> means that when the libfoo packages are upgraded, there will be a
>> substantial period when we have /usr/lib/libfoo1.so.1.2 installed and
>> the symlink libfoo1.so.1 points to it, but
>> /usr/share/libfoo2/foo-data-1.2 is missing.
>
> ... or vice versa, of course.  How long this situation persists will
> vary.

I would say that if it is critical for the lib, especially essential
ones, that the data exists, then it should be arch-qualified and kept in
the library package, even if that means duplicating it in the archive
and on the user's system.

Don't forget that splitting isn't the only option.  We can combine
multiple ways to deal with this.  I don't believe any single way will
work for all packages, so let's find a good combination of things so
that everybody is happy.

MfG
        Goswin

Archive: http://lists.debian.org/87r4xyspgm.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek <vor...@debian.org> writes:
> On Thu, Feb 09, 2012 at 01:40:41PM +0100, Goswin von Brederlow wrote:
>> Steve Langasek <vor...@debian.org> writes:
>>> - For many of these files, it would be actively harmful to use
>>>   architecture-qualified filenames.  Manpages included in -dev
>>>   packages should not change names based on the architecture; having
>>>   /usr/share/pam-config contain multiple files for the same profile,
>>>   one for each architecture of the package that's installed, would
>>>   not work correctly; etc.
>
>> Apropos pam config: shouldn't that be arch-qualified (which includes
>> /etc)?
>
> No, it should not.  These files are input for a central, shared, common
> PAM configuration meant to be usable by all services on the system.  If
> you have a foreign-arch PAM-using service installed, but you don't have
> the foreign-arch versions of the PAM modules that are referenced by
> /etc/pam.d/common-*, that's a bug: the module packages should be
> installed for all archs, not just a subset[1].  The system-level
> authentication configuration should not vary based on the architecture
> of the binary!
>
> And if you happen to have a foreign-arch service for which you don't
> want to use the same set of modules, well, your service's config file
> doesn't have to include /etc/pam.d/common-* - but then that's the
> special case of a service that you don't want to use the common config;
> it's not something we should assume by default in multiarch.

Ok, that is acceptable.  We just lack any technical means to ensure this
so far.  It's the same problem as for input method plugins, for example.

>> Say I have pam modules for ldap installed for amd64 but not for armel?
>
> Why would you do that, except by accident?

MfG
        Goswin

Archive: http://lists.debian.org/87aa4pe546.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Cyril Brulebois dixit:
> For those not subscribed to that bug, how to reproduce[1] and a
> possible fix[2] are available now.  There might be other places where
> buffers are reused; I only spent a few minutes on this during my lunch
> break.

Your lunch breaks are amazing.  Doesn't this look like "uses
uninitialised memory" somehow?  Has anyone run valgrind over GNU gzip?
Possibly this is "zeroes memory once, then assumes it's zeroed"?  I fear
your patch may be only hiding/masking the bug...

bye,
//mirabilos
-- 
<ch> you introduced a merge commit        │<mika> % g rebase -i HEAD^^
<mika> sorry, no idea and rebasing just fscked │<mika> Segmentation
<ch> should have cloned into a clean repo │       fault (core dumped)
<ch> if I rebase that now, it's really ugh│<mika:#grml> wuahh

Archive: http://lists.debian.org/pine.bsm.4.64l.1202112151560.27...@herc.mirbsd.org
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek <vor...@debian.org> writes:
> On Thu, Feb 09, 2012 at 10:29:53PM +0100, Guillem Jover wrote:
>>> But the more interesting slowdown is that the number of packages in
>>> general slows down apt operations at a rate that is around
>>> O(dependencies^2) (pure guess, perhaps someone has better
>>> knowledge?).  We do remember apt-get slowing to a crawl on maemo
>>> platforms with much smaller repositories.
>
>> Well, if we take the number of new packages Steve quoted (even w/o
>> taking into account the stuff I mentioned that could be reduced), and
>> round it to 200 new packages, that's really insignificant compared to
>> the amount of packages one will inject into apt per new foreign arch
>> configured.  I really fail to see the issue here.
>
> That's based on a sample of 1200 packages currently tagged Multi-Arch:
> same in the Ubuntu precise archive.  If we have all packages in
> sections libs and libdevel converted for multiarch (which I suppose we
> eventually will), this number will be closer to 7000.  Does 700 more of
> these support packages approach the level where it starts to be a
> problem?

Currently we have 36706 packages in main/contrib/non-free for amd64 sid.
Adding 700 would be an increase of less than 2%.  If the speed decrease
is linear then that isn't a problem.  If it is quadratic or even
exponential that could be different.  But then we have bigger problems,
as the number of packages will probably increase by more than 700 anyway
as packages are added to Debian before wheezy+1.  And doubling
(tripling, quadrupling) the number of packages (as multiarch systems do)
would totally kill the performance.  Since apt still works even with 4
archs, I strongly doubt the O(dependencies^2) guess.

Adding 700 packages, compared to adding 36706 (73412, 110118) for
multiarch, still seems minor.

MfG
        Goswin

Archive: http://lists.debian.org/87pqdmhpvb.fsf@frosties.localnet
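For what it's worth, the "less than 2%" figure checks out.  A quick awk
sketch, with the numbers taken from the mail; the quadratic line is only
illustrative of the guessed O(n^2) scaling, not a measurement of apt:

```shell
# Relative growth of the package list if 700 support packages are added
# to the 36706 currently in main/contrib/non-free on amd64 sid.
awk 'BEGIN {
    base = 36706; added = 700
    # linear cost model: work grows with the package count
    printf "linear:    +%.2f%%\n", 100 * added / base   # prints: linear:    +1.91%
    # quadratic cost model (the O(dependencies^2) guess): work grows
    # with the square of the count, so the factor is ((base+added)/base)^2
    printf "quadratic: x%.3f\n", ((base + added) / base) ^ 2
}'
```

Even under the quadratic model the 700 extra packages add only a few
percent of work, which supports Goswin's point that doubling the package
count per foreign arch is the far bigger term.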
Re: Please test gzip -9n - related to dpkg with multiarch support
On Fri, Feb 10, 2012 at 12:47:20PM +0100, Goswin von Brederlow wrote:
>> That's based on a sample of 1200 packages currently tagged Multi-Arch:
>> same in the Ubuntu precise archive.  If we have all packages in
>> sections libs and libdevel converted for multiarch (which I suppose we
>> eventually will), this number will be closer to 7000.  Does 700 more
>> of these support packages approach the level where it starts to be a
>> problem?
>
> Adding 700 packages, compared to adding 36706 (73412, 110118) for
> multiarch, still seems minor.

And what about adding 700 packages vs. adding no packages at all, in the
case of systems which aren't going to have multiarch enabled?  This
would impact systems of all archs, not just those for which multiarch is
a significant use case.

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                 to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slanga...@ubuntu.com                                     vor...@debian.org
Re: Please test gzip -9n - related to dpkg with multiarch support
* Guillem Jover <guil...@debian.org>, 2012-02-09, 03:45:
>> But anyway, I believe that in the long run we should simply deprecate
>> compressing stuff in /usr/share/doc/.
> So the main reason people are arguing for shared files boils down to
> used size, either in installed files, or Packages files, etc,

I don't know what the main reason is for other people.  (But I doubt
it's about saving space, as you're trying to imply.)

> yet you want to fix the compression issue by not compressing them and
> using even more space?

Is it surprising to you that different people can advocate the very same
thing for different reasons?

I think that compressing documentation is an anachronism, which we
should get rid of regardless of whether it helps multi-arch or not.

-- 
Jakub Wilk

Archive: http://lists.debian.org/20120210163034.ga3...@jwilk.net
Re: Please test gzip -9n - related to dpkg with multiarch support
Bastian Blank writes ("Re: Please test gzip -9n - related to dpkg with multiarch support"):
> On Thu, Feb 09, 2012 at 11:45:52AM -0400, Joey Hess wrote:
>> And then if I have a multiarch system, and want to locally download
>> the source of some library, build it and install it, dpkg will
>> complain if I didn't use the same gzip that was used to build other
>> arch versions I have installed.
>
> dpkg would complain anyway, because the versions are different.

(a) You could make them not be.  E.g. just download the existing source
code, tweak it, rebuild, and install.

(b) There will surely be some --force- option you can use to override
this, and it should not be necessary to also override problems involving
different versions of the same files.

Ian.

Archive: http://lists.debian.org/20277.20318.397042.776...@chiark.greenend.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek <vor...@debian.org> writes:
> And what about adding 700 packages vs. adding no packages at all, in
> the case of systems which aren't going to have multiarch enabled?  This
> would impact systems of all archs, not just those for which multiarch
> is a significant use case.

I'm having a really hard time getting excited about 700 packages.  I
haven't checked the numbers, but I would be surprised if we don't add
more than twice that number during normal development between stable
releases.

-- 
Russ Allbery (r...@debian.org)               http://www.eyrie.org/~eagle/

Archive: http://lists.debian.org/87y5saef5l@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 01:40:41PM +0100, Goswin von Brederlow wrote:
> Steve Langasek <vor...@debian.org> writes:
>> - For many of these files, it would be actively harmful to use
>>   architecture-qualified filenames.  Manpages included in -dev
>>   packages should not change names based on the architecture; having
>>   /usr/share/pam-config contain multiple files for the same profile,
>>   one for each architecture of the package that's installed, would not
>>   work correctly; etc.
>
> Apropos pam config: shouldn't that be arch-qualified (which includes
> /etc)?

No, it should not.  These files are input for a central, shared, common
PAM configuration meant to be usable by all services on the system.  If
you have a foreign-arch PAM-using service installed, but you don't have
the foreign-arch versions of the PAM modules that are referenced by
/etc/pam.d/common-*, that's a bug: the module packages should be
installed for all archs, not just a subset[1].  The system-level
authentication configuration should not vary based on the architecture
of the binary!

And if you happen to have a foreign-arch service for which you don't
want to use the same set of modules, well, your service's config file
doesn't have to include /etc/pam.d/common-* - but then that's the
special case of a service that you don't want to use the common config;
it's not something we should assume by default in multiarch.

> Say I have pam modules for ldap installed for amd64 but not for armel?

Why would you do that, except by accident?

[1] As discussed previously on this list, we don't currently have a
    mechanism to ensure that these modules are installed for all archs,
    so this is not an unlikely bug.  But in the worst case, it's also a
    very straightforward bug to fix by just installing the missing
    package.

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                 to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slanga...@ubuntu.com                                     vor...@debian.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 03:45:50AM +0100, Guillem Jover wrote:
> While this could benefit the multiarch installations (for which they
> can easily use --path-exclude), it would use lots more space on single
> arch installations.

Does it really?  A quick test tells me that uncompressing every file
under /usr/share/doc does indeed increase the size of that directory on
my laptop by a factor of approximately two: after running

  sudo find /usr/share/doc -name '*.gz' -exec gunzip {} \;

the size of that directory as reported by 'du -s' is 1263220 kibibytes,
while it was 757280 before, a difference of 505940.  This is on a system
with 2524 packages installed, for a grand total of...

  dpkg-query -W -f '${Installed-Size}\n' | awk '{TOT+=$0} END{print TOT}'
  8830371

... approximately 8.5GiB of installed software.  While I agree that
adding around 500MiB to that installed size is significant, I wouldn't
define it as 'lots more space'.

Additionally, it should be possible for dpkg to support compressing at
install time for those users who request it, based on a configuration
parameter.

-- 
The volume of a pizza of thickness a and radius z can be described by
the following formula:

pi zz a
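The per-file effect behind Wouter's system-wide numbers can be seen in a
self-contained sketch.  The file content and repetition count below are
invented for the demo; on a real system one would instead compare sizes
across /usr/share/doc before and after the gunzip run, as he did:

```shell
# How much does gzip -9n save on changelog-like, highly repetitive text?
tmp=$(mktemp -d)
# synthetic stand-in for a Debian changelog: 500 near-identical lines
printf 'demo changelog entry, repeated for compressibility\n%.0s' $(seq 1 500) \
    > "$tmp/changelog"
orig=$(wc -c < "$tmp/changelog")
gzip -9n "$tmp/changelog"                 # replaces the file with changelog.gz
packed=$(wc -c < "$tmp/changelog.gz")
echo "uncompressed: $orig bytes, gzip -9n: $packed bytes"
rm -rf "$tmp"
```

Real documentation compresses less dramatically than this synthetic
worst case, which is consistent with Wouter's observed factor of roughly
two rather than ten.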
Re: Please test gzip -9n - related to dpkg with multiarch support
Josh Triplett <j...@joshtriplett.org> writes:
> The only downside that I can see: packages couldn't refer to a
> particular file under /usr/share/doc/$package/ by path, because those
> packages wouldn't know how the administrator might choose to compress
> their files.  Given the policy of not depending on files under
> /usr/share/doc/ to function, at most this will result in manpages and
> similar referencing paths that then need a .gz or .xz appended, and
> that doesn't seem like a big deal; people will cope and tools can learn
> to check for compressed variants.

We already have this situation with dh_compress, which compresses files
if it saves space.  I had cases where, between releases, a file would
end up sometimes compressed and sometimes not.  And that file was
mentioned in the README.  I decided to just refer to the file without
the .gz extension and figured users would be smart enough to find it.

MfG
        Goswin

Archive: http://lists.debian.org/87ehu4s2xc.fsf@frosties.localnet
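The "tools can learn to check for compressed variants" idea is simple to
sketch.  find_doc below is a hypothetical helper, not an existing tool:
given the plain name a README would use, it reports whichever variant is
actually installed.

```shell
# Resolve a doc file that may or may not have been compressed.
find_doc() {
    for f in "$1" "$1.gz" "$1.xz"; do
        # print the first variant that exists and stop
        [ -e "$f" ] && { printf '%s\n' "$f"; return 0; }
    done
    return 1
}

# demo with a temporary file standing in for /usr/share/doc/foo/NEWS
tmp=$(mktemp -d)
echo demo > "$tmp/NEWS"
gzip -9n "$tmp/NEWS"
find_doc "$tmp/NEWS"      # finds NEWS.gz even though we asked for NEWS
rm -rf "$tmp"
```

A pager wrapper could combine this with zless so that references without
a .gz extension, like the one in Goswin's README, keep working.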
Re: Please test gzip -9n - related to dpkg with multiarch support
Adam Borowski <kilob...@angband.pl> writes:
> On Wed, Feb 08, 2012 at 02:14:22PM +0100, Cyril Brulebois wrote:
>> For those not subscribed to that bug, how to reproduce[1] and a
>> possible fix[2] are available now.  There might be other places where
>> buffers are reused; I only spent a few minutes on this during my lunch
>> break.
>> 2. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=647522
>
> Even if you ensure a particular build behaves exactly the same on a
> given architecture, you're merely introducing future problems.  gzip's
> output is likely to change:
> * on a new version

Yes, but not a big problem (other than a small race condition) since all
buildds should have the same version.

> * after a bugfix (including security ones)

Yes, but not a problem (other than a small race condition) since all
buildds should have the same version.

> * on a different architecture

No.  I consider that a bug.

> * with different optimizations

Not a problem.

> * with a different implementation (like those parallel ones)

Not a problem (yet).  We only have one gzip; pigz doesn't replace gzip.

> * possibly with a different moon phase

No.  I consider that a bug.

> Especially the first is pretty much guaranteed to bite: whenever
> upstream does a small improvement, binaries in the archive get
> invalidated until rebuilt with the new gzip.

Not true.  Packages only break if they are built with one gzip on one
arch and another gzip on the other archs.  On gzip uploads there is a
window where archs will have different gzip versions, so this is of some
concern.  But it is not as bad as you make it look.

> Breaking the ideas for diverting /bin/gzip by pigz is not nice, too.

True.  But why should gzip and pigz give different output?  They should
be able to produce the same compressed output.  I think for pigz one
problem is where to split the input.  Making it split at the same points
as gzip --rsyncable does (and using that option in gzip) could be a
solution.

Or files in /usr/share/doc (where we have the collisions) could be
compressed with /usr/bin/gzip.gzip (assuming that would be the name of
the real binary providing the gzip alternative).

MfG
        Goswin

Archive: http://lists.debian.org/87aa4ss20y.fsf@frosties.localnet
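As an aside, the -n in the thread's subject is the part of gzip
determinism that is already solved: without it, gzip embeds the input
file's mtime in the header, so byte-identical content compressed at
different times produces different archives.  A small demonstration
(this only shows the timestamp part; the version and arch skew discussed
above are separate problems):

```shell
# Two files with identical content but different mtimes.
tmp=$(mktemp -d)
echo 'same content' > "$tmp/a"
echo 'same content' > "$tmp/b"
touch -t 200001010000 "$tmp/a"
touch -t 201201010000 "$tmp/b"

# With -n the name/timestamp fields are omitted: outputs are identical.
gzip -9n -c "$tmp/a" > "$tmp/a9n.gz"
gzip -9n -c "$tmp/b" > "$tmp/b9n.gz"
cmp -s "$tmp/a9n.gz" "$tmp/b9n.gz" && echo '-9n: identical'

# Without -n the differing mtimes end up in the gzip header.
gzip -9 -c "$tmp/a" > "$tmp/a9.gz"
gzip -9 -c "$tmp/b" > "$tmp/b9.gz"
cmp -s "$tmp/a9.gz" "$tmp/b9.gz" || echo '-9: differ (mtime in header)'
rm -rf "$tmp"
```

This is exactly why the refcounting check asks maintainers to test with
gzip -9n: it removes the per-build variation that gzip itself controls,
leaving only cross-version and cross-implementation differences.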
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 01:33:58AM +0000, Wookey wrote:
> Some of the issues are already clear I think (moving arch-dependent
> headers into arch-qualified dirs, but leaving the others where they
> are)

And what is considered the best way to share the architecture-independent
headers between M-A: same -dev packages?

Install them in all packages, and let dpkg handle the conflicts?  I
still don't feel very comfortable doing so, and it would cause an
increase in the archive size.

Or ship them in a separate Arch: all -dev-common package, which all the
other -dev packages depend on?  This would ensure a single copy of the
headers is present in the archive, but it would also add yet another
binary package per library to the Packages file...

-- 
Andrea Bolognani <e...@kiyuko.org>
Resistance is futile, you will be garbage collected.
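Andrea's second option could be sketched as the following debian/control
fragment.  Package names and paths are invented for illustration; the
arch-qualified include directory is the multiarch triplet scheme the
thread assumes.

```
# Hypothetical -dev split (names and paths invented)

Package: libfoo-dev-common
Architecture: all
# carries the architecture-independent headers exactly once, e.g.
# /usr/include/foo/foo.h

Package: libfoo-dev
Architecture: any
Multi-Arch: same
Depends: libfoo-dev-common (= ${source:Version})
# carries only the arch-dependent parts, e.g.
# /usr/include/x86_64-linux-gnu/foo/config.h
```

This trades one extra entry per library in the Packages file (Andrea's
objection) for a single copy of the shared headers in the archive.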
Re: Please test gzip -9n - related to dpkg with multiarch support
Guillem Jover <guil...@debian.org> writes:
> On Wed, 2012-02-08 at 22:01:23 +0100, Jakub Wilk wrote:
>> In practice, the only compressor we need to care about is gzip, which
>> is not actively maintained upstream[0].  Chances that a new version of
>> it will break a large number of packages are minute.
>
> That assumes that we will never want to switch to a better/faster
> compressor for any gzip compressed file.  Or that there are no existing
> files compressed with anything other than gzip.
>
>> But anyway, I believe that in the long run we should simply deprecate
>> compressing stuff in /usr/share/doc/.
>
> So the main reason people are arguing for shared files boils down to
> used size, either in installed files, or Packages files, etc, yet you
> want to fix the compression issue by not compressing them and using
> even more space?  While this could benefit the multiarch installations
> (for which they can easily use --path-exclude), it would use lots more
> space on single arch installations.
>
> Also splitting files into new arch:all packages should usually reduce
> archive size usage, for example.
>
> regards,
> guillem

There are two cases to consider here:

1) Lots of data in /usr/share/doc/

   Please do split it into an arch:all package.

2) A tiny amount of data in /usr/share/doc/

   Policy requires that we have a changelog there, and those are usually
   large enough to benefit from compression but not large enough to
   warrant their own -common package.  Adding one or two other small
   files as docs usually doesn't pass the threshold for splitting them
   into -common either.

Now those are our problem cases where we need identical compression.
But if it is such a tiny amount, then keeping it uncompressed should be
reasonably fine.  Systems where those few extra KiB make a difference
probably want to --path-exclude /usr/share/doc anyway.  There could also
be a new option --path-compress <compressor> <path> that would compress
any file below that path with the given compressor.

So my suggestion would be twofold:

1) encourage splitting stuff into -common packages, or
2) leave files uncompressed in MA:same packages

MfG
        Goswin

Archive: http://lists.debian.org/8762fgry94.fsf@frosties.localnet
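Of the two dpkg options Goswin mentions, --path-exclude already exists
(since dpkg 1.15.8, if I remember correctly) and can be made permanent
with a configuration snippet; --path-compress is his hypothetical
proposal.  A sketch of the existing mechanism:

```
# /etc/dpkg/dpkg.cfg.d/excludes  (sketch)
# Never unpack documentation...
path-exclude=/usr/share/doc/*
# ...except copyright files, which Policy requires to remain installed.
path-include=/usr/share/doc/*/copyright
```

The same filter syntax works as dpkg command-line options, which is what
Guillem's "they can easily use --path-exclude" refers to.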
Re: Please test gzip -9n - related to dpkg with multiarch support
Ian Jackson <ijack...@chiark.greenend.org.uk> writes:
> Russ Allbery writes ("Re: Please test gzip -9n - related to dpkg with
> multiarch support"):
>> Another possible solution is to just give any package an implicit
>> Replaces (possibly constrained to /usr/share/doc) on any other package
>> with the same name and version and a different architecture.  This
>> isn't as defensive, in that it doesn't catch legitimate bugs where
>> someone has made a mistake and the packages contain different
>> contents, but it also solves the binNMU issue (well, "solves"; the
>> changelog will randomly swap back and forth between the packages, but
>> I'm having a hard time being convinced this is a huge problem).
>
> Well, it does mean that you might be lacking important information
> because the other changelog wouldn't be present on the system.
>
> One thing which no-one yet seems to have suggested is to have
> multiarch:same packages put the changelog in a filename which is
> distinct for each architecture.  (It wouldn't have to be the triplet;
> the shorter Debian arch would do.)  Perhaps there are obvious reasons
> (which I have missed) why this is a terrible idea, but it seems to me
> that it's something we should consider.
>
> Ian.

Or dpkg could do that for you, at least for files in /usr/share/doc when
there is a collision.

MfG
        Goswin

Archive: http://lists.debian.org/871uq4rxzq.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek <vor...@debian.org> writes:
> - For many of these files, it would be actively harmful to use
>   architecture-qualified filenames.  Manpages included in -dev packages
>   should not change names based on the architecture; having
>   /usr/share/pam-config contain multiple files for the same profile,
>   one for each architecture of the package that's installed, would not
>   work correctly; etc.

Apropos pam config: shouldn't that be arch-qualified (which includes
/etc)?  Say I have pam modules for ldap installed for amd64 but not for
armel?

MfG
        Goswin

Archive: http://lists.debian.org/87ty30qiwm.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Goswin von Brederlow writes ("Re: Please test gzip -9n - related to dpkg with multiarch support"):
> Ian Jackson <ijack...@chiark.greenend.org.uk> writes:
>> One thing which no-one yet seems to have suggested is to have
>> multiarch:same packages put the changelog in a filename which is
>> distinct for each architecture.  (It wouldn't have to be the triplet;
>> the shorter Debian arch would do.)  Perhaps there are obvious reasons
>> (which I have missed) why this is a terrible idea, but it seems to me
>> that it's something we should consider.
>
> Or dpkg could do that for you.  At least for files in /usr/share/doc
> when there is a collision.

Urgh, I think this is a really ugly idea, compared to just having the
packages contain the arch-specific filenames.  After all, a
multiarch:same package knows that it is one, and can DTRT.

Ian.

Archive: http://lists.debian.org/20275.48825.356091.885...@chiark.greenend.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
Wouter Verhelst <wou...@debian.org> writes:
> On Thu, Feb 09, 2012 at 03:45:50AM +0100, Guillem Jover wrote:
>> While this could benefit the multiarch installations (for which they
>> can easily use --path-exclude), it would use lots more space on single
>> arch installations.
>
> Does it really?  A quick test tells me that uncompressing every file
> under /usr/share/doc does indeed increase the size of that directory on
> my laptop by a factor of approximately two: after running sudo find
> /usr/share/doc -name '*.gz' -exec gunzip {} \;, the size of that
> directory as reported by 'du -s' is 1263220 kibibytes, while it was
> 757280 before, a difference of 505940.  This is on a system with 2524
> packages installed, for a grand total of...
>
>   dpkg-query -W -f '${Installed-Size}\n' | awk '{TOT+=$0} END{print TOT}'
>   8830371
>
> ... approximately 8.5GiB of installed software.  While I agree that
> adding around 500MiB to that installed size is significant, I wouldn't
> define it as 'lots more space'.
>
> Additionally, it should be possible for dpkg to support compressing at
> install time for those users who request it, based on a configuration
> parameter.

Note that only a fraction of that would be in MA:same packages.
Everything else can stay compressed.  Another test (see other mails in
the thread) estimated an increase of 60MB.

MfG
        Goswin

Archive: http://lists.debian.org/87y5scp0ws.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Ian Jackson <ijack...@chiark.greenend.org.uk> writes:
> Goswin von Brederlow writes ("Re: Please test gzip -9n - related to
> dpkg with multiarch support"):
>> Ian Jackson <ijack...@chiark.greenend.org.uk> writes:
>>> One thing which no-one yet seems to have suggested is to have
>>> multiarch:same packages put the changelog in a filename which is
>>> distinct for each architecture.  (It wouldn't have to be the triplet;
>>> the shorter Debian arch would do.)  Perhaps there are obvious reasons
>>> (which I have missed) why this is a terrible idea, but it seems to me
>>> that it's something we should consider.
>>
>> Or dpkg could do that for you.  At least for files in /usr/share/doc
>> when there is a collision.
>
> Urgh, I think this is a really ugly idea, compared to just having the
> packages contain the arch-specific filenames.  After all, a
> multiarch:same package knows that it is one, and can DTRT.
>
> Ian.

Changing the name in the package would break tools that rely on the name
(like packages.debian.org extracting the Changelog).  Also ugly.

MfG
        Goswin

Archive: http://lists.debian.org/87ty30p0re.fsf@frosties.localnet
Re: Please test gzip -9n - related to dpkg with multiarch support
Goswin von Brederlow wrote:
>> * after a bugfix (including security ones)
>
> Yes, but not a problem (other than a small race condition) since all
> buildds should have the same version.

And then if I have a multiarch system, and want to locally download the
source of some library, build it and install it, dpkg will complain if I
didn't use the same gzip that was used to build the other arch versions
I have installed.

-- 
see shy jo
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 03:06:46PM +0100, Adam Borowski wrote:
> gzip's output is likely to change:
> * on a different architecture
> * with different optimizations

If either of these is the case (assuming a valid, deterministic,
non-arch-specific implementation), then this violates C's as-if rule.
The compiled version has to act as if it did exactly what the C said.
Optimizations or other transformations that cause the compiled code to
violate this are a bug in the compiler.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 11:45:52AM -0400, Joey Hess wrote: And then if I have a multiarch system, and want to locally download the source of some library, build it and install it, dpkg will complain if I didn't use the same gzip that was used to build other arch versions I have installed. dpkg would complain anyway, because the versions are different. Bastian -- The sight of death frightens them [Earthers]. -- Kras the Klingon, Friday's Child, stardate 3497.2 -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120209162132.ga23...@wavehammer.waldi.eu.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 03:34:28AM +0100, Guillem Jover wrote: Riku mentioned as an argument that this increases the data to download due to slightly bigger Packages files, but pdiffs were introduced exactly to fix that problem. And, as long as the packages do not get updated one should not get pdiff updates. And with the splitting of Description there's even less data to download now. off-topic but often pdiffs don't really speed up apt-get update. Added roundtrip time latency on pulling several small files slows down the download unless you run update nightly. But the more interesting slowdown is that the amount of packages in general slows down apt operations at a rate that is around O(dependencies^2) (pure guess, perhaps someone has better knowledge?). We do remember apt-get slowing down to a crawl on maemo platforms with much smaller repositories. Adding shared file support into dpkg introduces additional unneeded complexity that can never be taken out, and which it seems clear to me should be dealt with at the package level instead. However, if we add the complexity to dpkg, we don't need to add it to all of the 1000+ multiarched packages. It would not be wise to do something 1000 times in packages which could be done once in dpkg. Even when doing it once in dpkg is harder, it is still a lot less total work. Since Debian has a chronic lack of active hands working on packages, solutions that add to the workload of maintainers will just slow down the development of Debian even further. -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120209195017.ga13...@afflict.kos.to
Re: Please test gzip -9n - related to dpkg with multiarch support
Goswin von Brederlow goswin-...@web.de writes: Changing the name in the package would break tools that rely on the name (like packages.debian.org extracting the Changelog). Also ugly. We control the tools; we can change the tools. Multiarch is a big deal. We weren't going to get through this without changing some tools. (One should not read that as my support of this specific alternative, as I've not decided there yet, but in general I think it's fair game to change our tools to support multiarch.) -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/87k43vu389@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, 2012-02-09 at 21:50:17 +0200, Riku Voipio wrote: On Thu, Feb 09, 2012 at 03:34:28AM +0100, Guillem Jover wrote: Riku mentioned as an argument that this increases the data to download due to slightly bigger Packages files, but pdiffs were introduced exactly to fix that problem. And, as long as the packages do not get updated one should not get pdiff updates. And with the splitting of Description there's even less data to download now. off-topic but often pdiffs don't really speed up apt-get update. Added roundtrip time latency on pulling several small files slows down the download unless you run update nightly. One of the reasons for this, I think, is that the current pdiff implementation in apt is really not optimal, see #372712. But the more interesting slowdown is that the amount of packages in general slows down apt operations at a rate that is around O(dependencies^2) (pure guess, perhaps someone has better knowledge?). We do remember apt-get slowing down to a crawl on maemo platforms with much smaller repositories. Well, if we take the number of new packages Steve quoted (even w/o taking into account the stuff I mentioned that could be reduced), and round it to 200 new packages, that's really insignificant compared to the amount of packages one will inject into apt per new foreign arch configured. I really fail to see the issue here. Adding shared file support into dpkg introduces additional unneeded complexity that can never be taken out, and which it seems clear to me should be dealt with at the package level instead. However, if we add the complexity to dpkg, we don't need to add it to all of the 1000+ multiarched packages. It would not be wise to do something 1000 times in packages which could be done once in dpkg. Even when doing it once in dpkg is harder, it is still a lot less total work.
Since Debian has a chronic lack of active hands working on packages, solutions that add to the workload of maintainers will just slow down the development of Debian even further. If this were something that dpkg could do reliably, was future-proof, introduced no issues at all and was technically sound, then I'd agree with you that even if it might be harder to implement (which is not the case) and maintain (maybe), it would be well worth it. But given the amount of problems, inconsistent handling between M-A: same and other packages, corner cases and general fragility it introduces, for the supposed benefit of size reduction (which does not seem to be significant at all) and to avoid a possible one-time package split, it seems clear this is the completely wrong approach. Also, except for the package splits, most of the arch-qualified path changes should be easily handled automatically by something like debhelper or cdbs. In any case, when I was talking about complexity here, it was not code-wise, but about the implications it has on the handling of packages in general. I'll write more about this in a summary mail I'm finishing up. regards, guillem -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120209212953.ga26...@gaara.hadrons.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 10:29:53PM +0100, Guillem Jover wrote: But the more interesting slowdown is that the amount of packages in general slows down apt operations at a rate that is around O(dependencies^2) (pure guess, perhaps someone has better knowledge?). We do remember apt-get slowing down to a crawl on maemo platforms with much smaller repositories. Well, if we take the number of new packages Steve quoted (even w/o taking into account the stuff I mentioned that could be reduced), and round it to 200 new packages, that's really insignificant compared to the amount of packages one will inject into apt per new foreign arch configured. I really fail to see the issue here. That's based on a sample of 1200 packages currently tagged Multi-Arch: same in the Ubuntu precise archive. If we have all packages in sections libs and libdevel converted for multiarch (which I suppose we eventually will), this number will be closer to 7000. Does 700 more of these support packages approach the level at which it starts to be a problem? -- Steve Langasek Give me a lever long enough and a Free OS Debian Developer to set it on, and I can move the world. Ubuntu Developer http://www.debian.org/ slanga...@ubuntu.com vor...@debian.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 9, 2012 at 22:29, Guillem Jover guil...@debian.org wrote: On Thu, 2012-02-09 at 21:50:17 +0200, Riku Voipio wrote: On Thu, Feb 09, 2012 at 03:34:28AM +0100, Guillem Jover wrote: Riku mentioned as an argument that this increases the data to download due to slightly bigger Packages files, but pdiffs were introduced exactly to fix that problem. And, as long as the packages do not get updated one should not get pdiff updates. And with the splitting of Description there's even less data to download now. off-topic but often pdiffs don't really speed up apt-get update. Added roundtrip time latency on pulling several small files slows down the download unless you run update nightly. One of the reasons for this, I think, is that the current pdiff implementation in apt is really not optimal, see #372712. The real slowdown is that APT currently works on one pdiff at a time. The solution for this is two-fold: First, get all the pdiffs needed - for Debian this is easy as it's strictly sequential, but other archives can (and some even do) use different paths, so we need a bit more metadata to support these, too. After we have all these pdiffs we can merge them into one big pdiff and apply that one. As we walk over 25 MB files only once and not for each patch we should be quite a bit faster. The theory and even python code for the merge part can be found at [0], it's just that the APT team has for years been so overcrowded that we haven't yet decided who can pick this one [/irony]. If someone wants to work on that, feel free to drop a line to deity@l.d.o (and to Anthony) and I will try to help if time permits. [0] http://lists.debian.org/deity/2009/08/msg00169.html But the more interesting slowdown is that the amount of packages in general slows down apt operations at a rate that is around O(dependencies^2) (pure guess, perhaps someone has better knowledge?). My question would be why you are guessing O(d^2) for a situation which should be intuitively O(d*2).
My empirical testing seems to support this, given that the runtime roughly doubles (a bit less) (fewer than doubled packages as we have arch:all packages, but a bit more than doubled deps given that we have new implicit ones for multiarch). But as a team member and implementer of multiarch in APT I might be a bit biased here… ;) (note though that numbers/timing are based on experimental; sid currently has a slightly different implementation, but it shouldn't be that bad either) We do remember apt-get slowing down to a crawl on maemo platforms with much smaller repositories.. As an owner of an N810 I am not, but I might be used to pain, given that I managed bootstrapping Debian with a recent (partly working) kernel on it (the gentoo/openwrt have details on that if someone is interested). So if you can go into detail about what you remember exactly we might be able to work on it - until then, my only comment on adding more packages: What could possibly go wrong? ;) If APT survives i386 packages in amd64, it might survive some new ones, too. Best regards David Kalnischkies -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/caaz6_fa8ujbu23_4qqhlqrcgmiu35mcsifk+d-vh-msqg2s...@mail.gmail.com
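The sequential-apply cost David describes can be made concrete with a small sketch. This is a hypothetical illustration only, not APT's code (function names are invented): pdiffs are ed-style diffs, and applying them one at a time means rewriting the whole Packages list once per patch, which is exactly what merging the pdiffs first would avoid.

```python
# Hypothetical sketch of applying ed-style pdiffs one at a time (the slow
# path described above). Not APT's implementation; names are invented.
import re

def apply_ed_script(lines, script):
    """Apply one ed-style diff (as produced by `diff --ed`) to a list of lines."""
    out = list(lines)
    it = iter(script.splitlines())
    for cmd in it:
        m = re.match(r"(\d+)(?:,(\d+))?([acd])$", cmd)
        if not m:
            raise ValueError("unrecognized ed command: %r" % cmd)
        start, end, op = int(m.group(1)), int(m.group(2) or m.group(1)), m.group(3)
        text = []
        if op in "ac":  # 'a' and 'c' carry replacement lines up to a lone "."
            for line in it:
                if line == ".":
                    break
                text.append(line)
        if op == "a":            # append after line `start` (1-based)
            out[start:start] = text
        elif op == "c":          # change lines start..end
            out[start - 1:end] = text
        else:                    # 'd': delete lines start..end
            del out[start - 1:end]
    return out

def apply_sequentially(lines, patches):
    # One full pass over `lines` per patch -- O(file size * number of
    # patches), versus one pass total if the patches were merged first.
    for p in patches:
        lines = apply_ed_script(lines, p)
    return lines
```

The mail's proposal amounts to collapsing `patches` into a single script before the pass, so the 25 MB file is only walked once.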
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 03:06:46PM +0100, Adam Borowski wrote: gzip's output is likely to change:
* on a new version
* after a bugfix (including security ones)
* on a different architecture
* with different optimizations
* with a different implementation (like those parallel ones)
* possibly with a different moon phase
Especially the first is pretty guaranteed to bite: whenever the upstream does a small improvement, binaries in the archive get invalidated until rebuilt with the new gzip.

Checking with a corpus of 82613 files, compressing each file with gzip -n9 $file produced exactly the same result with the gzip of oldstable (2008) and the gzip of sid (2012). Picking up woody gzip and libc from archive.debian.org (2002), 33 files were not identical. That is a 0.04% chance of a differing file over a DECADE of changes. While evidently changes do happen, it is certainly not a case of gzip compression results changing whenever a butterfly flaps its wings in a certain direction.

Specifics of the test setup:
2012 sid setup, gzip 1.4-2, eglibc 2.13-20, built with gcc 4.6.1, amd64 binaries
2008 lenny setup, gzip 1.3.12-6+lenny1, glibc6 2.7-18lenny7, built with gcc 4.3, i386 binaries
2002 woody setup, gzip 1.2.4-33.1, glibc6 2.1.3-20, probably built with gcc 3.0, i386 binaries (duh)

Corpus of files was all the gzip compressed files extracted and uncompressed from a partial Debian mirror. All tests during the current almost-full moon phase. Riku -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120210015817.ga15...@afflict.kos.to
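A corpus comparison like the one Riku describes can be scripted. This is a hypothetical sketch (function names and layout invented, not Riku's actual harness), assuming the `gzip` binary under test is on $PATH: digest the `gzip -9n` output of every file once per toolchain, then diff the digest maps.

```python
# Hypothetical sketch of the corpus comparison described above: pipe every
# file through `gzip -9n` and record a digest of the compressed output, so
# runs against different gzip/libc builds can be diffed afterwards.
import hashlib
import subprocess
from pathlib import Path

def gzip9n_digest(path):
    # -c writes to stdout; -n omits the name/timestamp header fields,
    # which is what makes identical input bytes comparable at all.
    out = subprocess.run(["gzip", "-9n", "-c", str(path)],
                         check=True, capture_output=True).stdout
    return hashlib.md5(out).hexdigest()

def corpus_digests(root):
    return {str(p.relative_to(root)): gzip9n_digest(p)
            for p in sorted(Path(root).rglob("*")) if p.is_file()}

def differing(run_a, run_b):
    # Files whose compressed form changed between the two gzip builds.
    return sorted(f for f in run_a.keys() & run_b.keys()
                  if run_a[f] != run_b[f])
```

Run `corpus_digests` once per chroot (sid, lenny, woody in the mail's setup), persist the dicts, and `differing` gives the 33-file style result.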
Re: Please test gzip -9n - related to dpkg with multiarch support
On Thu, Feb 09, 2012 at 02:54:43PM +0100, Goswin von Brederlow wrote: Wouter Verhelst wou...@debian.org writes: On Thu, Feb 09, 2012 at 03:45:50AM +0100, Guillem Jover wrote: While this could benefit the multiarch installations (for which they can easily use --path-exclude), it would use lots more space on single arch installations. Does it really? A quick test tells me that uncompressing every file under /usr/share/doc does indeed increase the size of that directory on my laptop by a factor of approximately two: After running sudo find /usr/share/doc -name '*.gz' -exec gunzip {} \;, the size of that directory is as reported by 'du -s' is 1263220 kibibytes, while it was 757280 before, a difference of 505940. This is on a system with 2524 packages installed, for a grand total of... dpkg-query -W -f '${Installed-Size}\n' | awk '{TOT+=$0} END{print TOT}' 8830371 ... approximately 8.5GiB of installed software. While I agree that adding around 500MiB to that installed size is significant, I wouldn't define it as 'lots more space'. Additionally, it should be possible for dpkg to support compressing at install time for those users who request it, based on a configuration parameter. Note that only a fraction of that would be in MA:same packages. Everything else can stay compressed. Some other test (see other mails in thread) estimated an increase of 60MB. Yes, but I think that's a bad idea. Either we should compress everything, or nothing at all. Compressing some files but not others is going to be confusing, inconsistent, and generally a bad idea. -- The volume of a pizza of thickness a and radius z can be described by the following formula: pi zz a -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120210063939.go3...@grep.be
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012 12:05:44 +1100 Russell Coker russ...@coker.com.au wrote: On Wed, 8 Feb 2012, Riku Voipio riku.voi...@iki.fi wrote: If it turns out not reasonable to expect the compression results to be identical It was reported that sometimes the size differs. Surely if nothing else having gzip sometimes produce an unnecessarily large file is a bug! Expecting that the compression gives the smallest file every time is reasonable. By a single byte - I've not seen file size changes beyond that range. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012 02:52:52 +0200 Riku Voipio riku.voi...@iki.fi wrote: If it turns out not reasonable to expect the compression results to be identical, we should probably look into using dpkg --path-exclude= with /usr/share/{doc,man,info}/* when installing foreign-architecture packages. That would be a suitable alternative to decompression checksums. It sounds like an implicit Replaces: on the same package of any other architecture for Multi-Arch: same packages limited to /usr/share/{doc,man,info}/*. Very few Multi-Arch: same packages need to install identical compressed files outside these directories. In case it happens, the package needs to use multiarch paths or split files to -common package. It's not that ugly - with a few looks at the list of problematic files. e.g. libslang2-dev, in common with a number of other -dev packages, includes a few example files in the -dev instead of using a -doc package. The compression of those files causes conflicts in the -dev package, at which point creating a -doc package doesn't seem that bad an idea. Other options would be to not compress example files when packaged inside -dev packages - after all, if the example files are large enough for a lack of compression to matter, the examples should be in a -doc package. http://people.debian.org/~jwilk/multi-arch/same-md5sums.txt The ugliness of this solution is that the specialness of /usr/share/doc and others needs to embedded into the package system somewhere. Packages can use multiarch paths for their own files, but there are currently 80 occurrences of changelog.Debian.gz in the list of problematic files. dpkg needs to handle that, packages have no option. I'm wondering if /usr/share/{doc,man,info}/* is the right pattern. Maybe it really is just /usr/share/*. After all, this is how cross/foreign architecture packages have *always* been handled in Debian via dpkg-cross. 
Nothing in /usr/share/ matters for a cross package created by dpkg-cross (with the possible exception of /usr/share/pkg-config which was always anachronistic). Some template files are added but the package name includes the architecture, so these files are effectively in multiarch paths. There is nothing useful in /usr/share of a Multi-Arch: same package when installed as a foreign architecture package. Emdebian and dpkg-cross have proved that by having nothing else until Multi-Arch. Anything you might need is in the native architecture package, so the best thing to do is widen the implicit exclusion to all of /usr/share in the incoming Multi-Arch: same package. In the list, the only listings in the above file which are not in /usr/share do look like bugs:

usr/bin/kvirc
usr/bin/croco-0.6-config
usr/bin/croco-0.6-config
usr/include/dspam/auto-config.h
usr/include/isl/stdint.h
usr/bin/magics-config
usr/include/OGRE/OgreBuildSettings.h
usr/include/pci/config.h
usr/lib/pkgconfig/popt.pc
usr/bin/ppl_pl
usr/bin/ppl-config
usr/include/ppl.hh
usr/include/ppl_c.h
usr/bin/ppl-config
usr/include/ppl.hh
usr/include/ppl_c.h
usr/lib/sasl2/berkeley_db.txt
usr/lib/libwrap.a
usr/include/XdmfConfig.h
usr/bin/whiptail

usr/lib/pkgconfig/popt.pc - needs to be a multiarch path
usr/bin/* is just wrong - bug reports invited.
usr/include/* means that the package concerned needs to use a multiarch path for that include file(s).

That leaves:

usr/lib/sasl2/berkeley_db.txt
usr/lib/libwrap.a

.a files need multiarch paths, clearly. So, apart from /usr/share which I can't see as important for Multi-Arch: same packages, the list of remaining conflicts are bugs and the gzip bug doesn't matter anymore. -- Neil Williams = http://www.linux.codehelp.co.uk/
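For the single-arch side of what Neil proposes, dpkg's existing filtering options can already express a doc exclusion; a hypothetical /etc/dpkg/dpkg.cfg.d/ fragment is sketched below. Note the caveat that makes this only a partial answer: dpkg applies these globs to every package it unpacks, with no way to scope them to foreign-architecture packages only, which is exactly the missing piece the mail is discussing.

```
# Hypothetical /etc/dpkg/dpkg.cfg.d/exclude-docs fragment (illustration
# only). path-exclude/path-include are real dpkg options, but they apply
# to all packages, not just foreign-arch ones.
path-exclude=/usr/share/doc/*
path-exclude=/usr/share/man/*
path-exclude=/usr/share/info/*
# keep copyright files, which policy requires on the installed system
path-include=/usr/share/doc/*/copyright
```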
Re: Please test gzip -9n - related to dpkg with multiarch support
* Lars Wirzenius l...@liw.fi [120208 08:58]: On Tue, Feb 07, 2012 at 10:49:23PM +0000, Ben Hutchings wrote: But it's worse than this: even if dpkg decompresses before comparing, debsums won't (and mustn't, for backward compatibility). So it's potentially necessary to fix up the md5sums file for a package installed for multiple architectures, if it contains a file that was compressed differently. I'm uncomfortable with the idea of checking checksums only for uncompressed data. Compressed files have headers, and at least for some formats, it seems those headers can contain essentially arbitrary data. This allows compressed files to be modified in rather significant ways, without debsums noticing, if debsums uncompresses before comparing. On the other hand most uncompressors silently ignore unexpected data after end-of-file markers. So the compressed file is even more easily tampered with (especially as debsums only stores md5 without size, and md5 does not include the size in the hash like the sha* do. So if one can append arbitrary stuff, it is easy prey). But the point is a bit moot, as debsums is not really useful for security (if you modify files, why not modify the md5sums files, too?). It is useful for reliability, as it checks for files being corrupted by bad discs[1], bad memory[1], bad DMA controllers[1], ... Bernhard R. Link [1] Been there, got bitten, learned to love debsums. -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208103337.gb28...@client.brlink.eu
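Bernhard's append-garbage point is easy to demonstrate. A small sketch (using Python's zlib for illustration; nothing debsums actually does): bytes appended after the gzip end-of-stream trailer change the checksum of the file on disk but never reach the decompressed payload, so a checker that hashes only decompressed content would not see them.

```python
# Demonstrates the point above: data appended after a gzip member does not
# affect the decompressed payload, only the checksum of the on-disk file.
# Illustrative sketch only, unrelated to debsums' implementation.
import gzip
import hashlib
import zlib

original = gzip.compress(b"some documentation\n")
tampered = original + b"ARBITRARY TRAILING JUNK"

# The md5 of the compressed file changes...
assert hashlib.md5(original).digest() != hashlib.md5(tampered).digest()

# ...but a decompressor stops at the gzip end-of-stream marker and hands
# the junk back untouched in `unused_data`.
d = zlib.decompressobj(wbits=31)  # wbits=31: expect gzip framing
payload = d.decompress(tampered)
assert payload == b"some documentation\n"
assert d.unused_data == b"ARBITRARY TRAILING JUNK"
```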
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012, Neil Williams codeh...@debian.org wrote: Expecting that the compression gives the smallest file every time is reasonable. By a single byte - I've not seen file size changes beyond that range. It's a matter of principle. A compression program is supposed to reliably compress data. -- My Main Blog http://etbe.coker.com.au/ My Documents Bloghttp://doc.coker.com.au/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/201202082154.31137.russ...@coker.com.au
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012 21:54:30 +1100 Russell Coker russ...@coker.com.au wrote: On Wed, 8 Feb 2012, Neil Williams codeh...@debian.org wrote: Expecting that the compression gives the smallest file every time is reasonable. By a single byte - I've not seen file size changes beyond that range. It's a matter of principle. A compression program is supposed to reliably compress data. That doesn't mean to a specific size, the principle of reliable compression means that you always get back the file you put in, without compression causing losses or corruption. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
On 02/07/2012 11:04 PM, Neil Williams wrote: I'm not convinced that this is fixable at the gzip level, nor is it likely to be fixable by the trauma of changing from gzip to something else. while the original point of not considering compressors that are not producing identical output across all archs in dpkgs multi-arch implementation still stands, it might be worth noting (and at least jftr) that lzip does not suffer from that problem in the first place. -- Address: Daniel Baumann, Donnerbuehlweg 3, CH-3012 Bern Email: daniel.baum...@progress-technologies.net Internet: http://people.progress-technologies.net/~daniel.baumann/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4f3258e6.9060...@progress-technologies.net
Re: Please test gzip -9n - related to dpkg with multiarch support
On 08/02/12 10:22, Neil Williams wrote: Nothing in /usr/share/ matters for a cross package created by dpkg-cross (with the possible exception of /usr/share/pkg-config which was always anachronistic). I'd understood that /usr/share/pkgconfig should be used for the sort of packages that would now be Multi-Arch:foreign? Looking in my instance of that directory, I only see Architecture:all packages (like gnome-icon-theme and gtk-doc), and Architecture:any packages whose API is in terms of running executables or making D-Bus calls rather than linking libraries (like udev and systemd). udev and systemd both also ship libraries, as it happens, but those libraries have their own .pc files, which are correctly under /usr/lib. If this is a concern, maybe we should have a Lintian check that /usr/share/pkgconfig/*.pc must not have a non-trivial value in their Libs, Cflags or Libs.private fields? (Some arch-independent .pc files do have those fields, but their values are empty, as in gnome-icon-theme - that seems valid.) S -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4f325d41.9030...@debian.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 07:57 +0000, Lars Wirzenius wrote: On Tue, Feb 07, 2012 at 10:49:23PM +0000, Ben Hutchings wrote: But it's worse than this: even if dpkg decompresses before comparing, debsums won't (and mustn't, for backward compatibility). So it's potentially necessary to fix up the md5sums file for a package installed for multiple architectures, if it contains a file that was compressed differently. I'm uncomfortable with the idea of checking checksums only for uncompressed data. Compressed files have headers, and at least for some formats, it seems those headers can contain essentially arbitrary data. This allows compressed files to be modified in rather significant ways, without debsums noticing, if debsums uncompresses before comparing. Further, uncompressors have the potential for security problems. See https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-2624 for example. In other words: debsums needs to decompress to verify that no files have been tampered with, but doing so can invoke an attack. Such an attack may be unlikely, but it would seem to be a better design to not open up the possibility for it. I wasn't suggesting debsums would do decompression. Ben. -- Ben Hutchings The generation of random numbers is too important to be left to chance. - Robert Coveyou
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, Feb 07, 2012 at 10:04:04PM +0000, Neil Williams wrote: Maybe the way to solve this properly is to remove compression from the uniqueness check - compare the contents of the file in memory after decompression. Yes, it will take longer but it is only needed when the md5sum (which already exists) doesn't match. Actually, I think the real way to fix this properly is to not compress files in the package at all. The contents.tar.gz is already a .tar.gz, which means it's compressed. Doubly-compressing files hardly ever nets a benefit, so we're not compressing files for the benefit of our mirrors. The only reason why we compress files in /usr/share/doc is so that that directory doesn't waste too much space. If that is the case, I think it makes much more sense for files to be packaged inside .debs uncompressed, and (optionally) for dpkg to compress them on the fly should the system administrator request it. It would then make much more sense for dpkg to consider the contents of the file, rather than the on-disk representation, and not cause this kind of issue. As an additional benefit, this will also allow those among us (like me) who hate having to use 'gunzip -c /usr/share/doc/foo/bar.pdf.gz > /tmp/bar.pdf; xpdf /tmp/bar.pdf' in order to be able to read some documentation, to just request that files are not compressed. -- The volume of a pizza of thickness a and radius z can be described by the following formula: pi zz a -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208131025.ga27...@grep.be
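The decompress-then-compare idea Neil proposes in the quoted message can be sketched briefly. This is a hypothetical helper (invented names, gzip payloads only; not anything dpkg implements): two .gz files that differ only in compressor output or header fields still count as the same file once compared by decompressed content.

```python
# Hypothetical sketch of comparing files by decompressed content rather
# than on-disk bytes, so two .gz files produced by different gzip builds
# can still be recognised as "the same file". Not dpkg code.
import gzip
import hashlib

def payload_md5(path):
    # Fall back to plain open() for uncompressed files.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

def same_payload(path_a, path_b):
    return payload_md5(path_a) == payload_md5(path_b)
```

As the mail notes, the expensive path only needs to run when the stored md5sums already disagree.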
Re: Please test gzip -9n - related to dpkg with multiarch support
Neil Williams codeh...@debian.org (07/02/2012): I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. For those not subscribed to that bug, how to reproduce[1] and possible fix[2] are available now. There might be other places where buffers are reused, I only spent a few minutes on this during my lunch break. 1. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=55;bug=647522 2. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=647522 Mraw, KiBi.
Re: Please test gzip -9n - related to dpkg with multiarch support
(Dropping everyone but dd@.) Wouter Verhelst wou...@debian.org (08/02/2012): As an additional benefit, this will also allow those among us (like me) who hate having to use 'gunzip -c /usr/share/doc/foo/bar.pdf.gz /tmp/bar.pdf; xpdf /tmp/bar.pdf' in order to be able to read some documentation, to just request that files are not compressed. Try: xpdf /usr/share/doc/debian-policy/policy.pdf.gz Mraw, KiBi.
Re: Please test gzip -9n - related to dpkg with multiarch support
Russ Allbery writes (Re: Please test gzip -9n - related to dpkg with multiarch support): Another possible solution is to just give any package an implicit Replaces (possibly constrained to /usr/share/doc) on any other package with the same name and version and a different architecture. This isn't as defensive, in that it doesn't catch legitimate bugs where someone has made a mistake and the packages contain different contents, but it also solves the binNMU issue (well, solves; the changelog will randomly swap back and forth between the packages, but I'm having a hard time being convinced this is a huge problem). Well, it does mean that you might be lacking important information because the other changelog wouldn't be present on the system. One thing which no-one yet seems to have suggested is to have multiarch:same packages put the changelog in a filename which is distinct for each architecture. (It wouldn't have to be the triplet; the shorter Debian arch would do.) Perhaps there are obvious reasons (which I have missed) why this is a terrible idea, but it seems to me that it's something we should consider. Ian. -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20274.30293.295855.341...@chiark.greenend.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 02:14:22PM +0100, Cyril Brulebois wrote: For those not subscribed to that bug, how to reproduce[1] and possible fix[2] are available now. There might be other places where buffers are reused, I only spent a few minutes on this during my lunch break. 2. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=647522 Even if you ensure a particular build behaves exactly the same on a given architecture, you're merely introducing future problems. gzip's output is likely to change:
* on a new version
* after a bugfix (including security ones)
* on a different architecture
* with different optimizations
* with a different implementation (like those parallel ones)
* possibly with a different moon phase
Especially the first is pretty guaranteed to bite: whenever the upstream does a small improvement, binaries in the archive get invalidated until rebuilt with the new gzip. Breaking the ideas for diverting /bin/gzip by pigz is not nice, either. -- // If you believe in so-called intellectual property, please immediately // cease using counterfeit alphabets. Instead, contact the nearest temple // of Amon, whose priests will provide you with scribal services for all // your writing needs, for Reasonable and Non-Discriminatory prices. -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208140646.gb25...@angband.pl
Re: Please test gzip -9n - related to dpkg with multiarch support
Wouter Verhelst wrote: On Tue, Feb 07, 2012 at 10:04:04PM +, Neil Williams wrote: Maybe the way to solve this properly is to remove compression from the uniqueness check - compare the contents of the file in memory after decompression. Yes, it will take longer but it is only needed when the md5sum (which already exists) doesn't match. Actually, I think the real way to fix this properly is to not compress files in the package at all. The contents.tar.gz is already a .tar.gz, which means it's compressed. s/contents/data/ Doubly-compressing files hardly ever nets a benefit, so we're not compressing files for the benefit of our mirrors. The only reason why we compress files in /usr/share/doc is so that that directory doesn't waste too much space. If that is the case, I think it makes much more sense for files to be packaged inside .debs uncompressed, and (optionally) for dpkg to compress them on the fly should the system administrator request it. It would then make much more sense for dpkg to consider the contents of the file, rather than the on-disk representation, and not cause this kind of issue. I agree with this entirely. Doing this would actually save *more* space in the .deb files, since it allows gzip (or xz, or whatever compresses the data.tar) to see the contents of multiple files at once. It also allows the administrator to set local policies for compression to cover cases like the one you mentioned below. Those local policies would also allow the use of compression formats other than .xz, as well as deciding to leave files uncompressed due to the use of a filesystem with built-in compression. It wouldn't work in all cases, since sometimes the package requires a compressed file in a certain location, but it should work for just about all files in /usr/share/doc.
The only downside that I can see: packages couldn't refer to a particular file under /usr/share/doc/$package/ by path, because those packages wouldn't know how the administrator might choose to compress their files. Given the policy of not depending on files under /usr/share/doc/ to function, at most this will result in manpages and similar referencing paths that then need a .gz or .xz appended, and that doesn't seem like a big deal; people will cope and tools can learn to check for compressed variants. As an additional benefit, this will also allow those among us (like me) who hate having to use 'gunzip -c /usr/share/doc/foo/bar.pdf.gz > /tmp/bar.pdf; xpdf /tmp/bar.pdf' in order to be able to read some documentation, to just request that files are not compressed. Try zrun from the moreutils package: zrun xpdf /usr/share/doc/foo/bar.pdf.gz Or use evince, which can handle compressed files directly. - Josh Triplett Archive: http://lists.debian.org/20120208135032.GA21503@leaf
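Josh's point that compressing the whole data.tar at once saves more space than per-file compression can be sketched with toy data (hypothetical, heavily redundant stand-ins for the near-identical doc files shipped by related packages; xz via Python's lzma, matching data.tar.xz):

```python
import lzma

# Two hypothetical, highly redundant changelog-like files.
a = b"libfoo (1.1-1) unstable; urgency=low\n  * Fix frobnication.\n" * 100
b = b"libfoo (1.2-1) unstable; urgency=low\n  * Fix frobnication.\n" * 100

# Per-file compression pays the container overhead twice and cannot reuse
# matches across the file boundary.
separately = len(lzma.compress(a)) + len(lzma.compress(b))

# Compressing one combined stream (what compressing an uncompressed
# data.tar amounts to) lets the compressor exploit the shared content.
together = len(lzma.compress(a + b))

print(separately, together)
```

The exact numbers depend on the lzma build, but the combined stream comes out smaller, which is the space argument for shipping files uncompressed inside the .deb.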
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012 14:14:22 +0100 Cyril Brulebois k...@debian.org wrote: Neil Williams codeh...@debian.org (07/02/2012): I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. For those not subscribed to that bug, how to reproduce[1] and possible fix[2] are available now. There might be other places where buffers are reused, I only spent a few minutes on this during my lunch break. 1. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=55;bug=647522 2. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=647522 Thanks for taking that a stage further, Cyril. I ran out of time to look at this any further yesterday; I've only just got back to the bug and noticed the hint about multiple files on the command line from Zack Weinberg. It makes sense that with a single file on the command line the aberrant compressed file never appears, whereas with more than one it can. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 11:33:37AM +0100, Bernhard R. Link wrote: On the other hand most uncompressors silently ignore unexpected data after end of file markers. So the compressed file is even more easily tampered with (especially as debsums only stores md5 without size, and md5 does not include the size in the hash like the sha* do. So if one can append arbitrary stuff, it is easy prey). This is not true. MD5 and the SHA variants are all Merkle-Damgård constructions, and Merkle-Damgård constructions include the number of bits hashed in the padding, so the length is covered by MD5 just as it is by the sha* family. That same structure, however, is what makes them vulnerable to length extension attacks; so yes, MD5 is vulnerable to length extension attacks. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
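Bernhard's observation about trailing data can be demonstrated directly (a sketch; the payload and appended bytes are made up). A streaming gzip decompressor stops at the end-of-stream marker, so appended bytes are skipped rather than rejected, though they do still change the whole-file checksum that debsums stores:

```python
import gzip
import hashlib
import zlib

payload = b"important documentation\n"
gz = gzip.compress(payload)
tampered = gz + b"arbitrary appended junk"

# wbits=16+MAX_WBITS selects gzip framing; decompression ends at the gzip
# end-of-stream marker and the appended bytes land in unused_data.
d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
out = d.decompress(tampered)
print(out == payload)   # the original contents decompress unchanged
print(d.unused_data)    # the appended junk, silently ignored

# The appended bytes do alter the file's hash, so a stored checksum still
# notices them; the weakness under discussion is MD5 itself, not the
# absence of a length field.
print(hashlib.md5(gz).hexdigest() == hashlib.md5(tampered).hexdigest())
```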
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012 15:06:46 +0100 Adam Borowski kilob...@angband.pl wrote: On Wed, Feb 08, 2012 at 02:14:22PM +0100, Cyril Brulebois wrote: For those not subscribed to that bug, how to reproduce[1] and possible fix[2] are available now. There might be other places where buffers are reused, I only spent a few minutes on this during my lunch break. 2. http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=60;bug=647522 Even if you ensure a particular build behaves exactly the same on a given architecture, you're merely introducing future problems. gzip's output is likely to change: * on a new version * after a bugfix (including security ones) * on a different architecture * with different optimizations * with a different implementation (like those parallel ones) * possibly with a different moon phase Especially the first is pretty guaranteed to bite: whenever the upstream does a small improvement, binaries in the archive get invalidated until rebuilt with the new gzip.
I don't get it. That would only affect packages which were built in the window between a new gzip upload and all the buildds having that new version available. Now, if there is a binNMU after a new version of gzip is uploaded, yes, it is probably wise to rebuild all architectures if the package includes a Multi-Arch: same library. How often does that happen? It doesn't matter for other packages in the archive - it only matters for binary packages of the same Multi-Arch source which can install the same file in the same place from two or more architectures. Binaries already in the archive are completely unaffected by a new gzip - the only collision is between compressed files in the same location under /usr/share/doc, and Policy already handles that with the exception of problems inherent to Multi-Arch. Breaking the ideas for diverting /bin/gzip by pigz is not nice, too.
However, having said all that, I think that an approach which borrows / inherits from existing dpkg-cross behaviour - simply assuming that anything in /usr/share of a Multi-Arch: same package just doesn't matter for the functionality of the package - is much better and much more reliable, allowing any collisions to be silently ignored. It avoids all of the gzip problems, and the only remaining collisions can be fixed as bugs. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: Well, it does mean that you might be lacking important information because the other changelog wouldn't be present on the system. While the implicit Replaces seems the easy way out, it just seems even more fragile than the shared files approach. And while the binNMU changelog issues might seem like a corner case, it's just a symptom of something that's not quite right. And after this was brought up again I started considering that the shared file approach might have been flawed after all, even if it might have seemed neat at the time (it's one of the reasons that part of the code has not been merged yet). The main reason it was envisaged was to handle the changelog and copyright files and to avoid needing to introduce an additional common package per source, for just those two/three files. As a side remark, I think at least those two are actual package metadata and do belong in the .deb control member [0], and as such in the dpkg database. But that's for discussion another time, because that would not fix the issue, as upstream changelogs do conflict too, for example. [0] http://lists.debian.org/debian-dpkg/2011/09/msg00029.html One thing which no-one yet seems to have suggested is to have multiarch:same packages put the changelog in a filename which is distinct for each architecture. (It wouldn't have to be the triplet; the shorter Debian arch would do.) Perhaps there are obvious reasons (which I have missed) why this is a terrible idea, but it seems to me that it's something we should consider. Instead of this, I'd rather see the shared files approach just dropped completely, and /usr/share/doc/ files for “Multi-Arch: same” packages be installed under /usr/share/doc/pkgname:arch/.
This would solve all these problems in a clean way for the common case with just the two or three mandated files (changelog, changelog.Debian and copyright); if a package provides lots more files then they should be split anyway into a libfooN-common, a libfoo-doc, or similar. And finally this would not be really confusing, given that one of the last interface changes was to make all dpkg output for all “Multi-Arch: same” packages be always arch-qualified. regards, guillem Archive: http://lists.debian.org/20120208161319.ga24...@gaara.hadrons.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 05:13:20PM +0100, Guillem Jover wrote: On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: Well, it does mean that you might be lacking important information because the other changelog wouldn't be present on the system. While the implicit Replaces seems the easy way out, it just seems even more fragile than the shared files approach. And while the binNMU changelog issues might seem like a corner case, it's just a symptom of something that's not quite right. And after this was brought up again I started considering that the shared file approach might have been flawed afterall, even if it might have seemed neat at the time (it's one of the reasons that part of the code has not been merged yet). The main reason it was enviaged was to handle the changelog and copyright files and to avoid needing to introduce an additional common package per source, for just those two/three files. As a side remark, I think at least those two are actual package metadata and do belong in the .deb control member [0], and as such in the dpkg database. But that's for discussion on another time, because that would not fix the issue as upstream changelogs do conflict too, for example. http://lists.debian.org/debian-dpkg/2011/09/msg00029.html One thing which no-one yet seems to have suggested is to have multiarch:same packages put the changelog in a filename which is distinct for each architecture. (It wouldn't have to be the triplet; the shorter Debian arch would do.) Perhaps there are obvious reasons (which I have missed) why this is a terrible idea, but it seems to me that it's something we should consider. Instead of this, I'd rather see the shared files approach just dropped completely, and /usr/share/doc/ files for “Multi-Arch: same” packages be installed under /usr/share/doc/pkgname:arch/. 
This would solve all these problems in a clean way for the common case with just the two or three mandated files (changelog, changelog.Debian and copyright), if a package provides lots more files then they should be split anyway into either a libfooN-common libfoo-doc, or similar. And finally this would not be really confusing, given that one of the last interface changes was to make all dpkg output for all “Multi-Arch: same” packages be always arch-qualified. If you remove the shared files approach, how do you handle files like lintian overrides, reportbug presubj and scripts, etc.? Mike Archive: http://lists.debian.org/20120208162922.ga28...@glandium.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 17:29:22 +0100, Mike Hommey wrote: If you remove the shared files approach, how do you handle files like lintian overrides, reportbug presubj and scripts, etc.? The same principle that applies to all dpkg output to avoid ambiguity would apply everywhere: whenever there's a “Multi-Arch: same” package name that needs to be unambiguous, it just always gets arch-qualified. The rest would stay as they are. So, at least for lintian and reportbug, given that these file/dir names are package-name based, they would just get arch-qualified when needed. regards, guillem Archive: http://lists.debian.org/20120208165611.ga25...@gaara.hadrons.org
Re: Please test gzip -9n - related to dpkg with multiarch support
Guillem Jover writes (Re: Please test gzip -9n - related to dpkg with multiarch support): On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: One thing which no-one yet seems to have suggested is to have multiarch:same packages put the changelog in a filename which is distinct for each architecture. (It wouldn't have to be the triplet; the shorter Debian arch would do.) Perhaps there are obvious reasons (which I have missed) why this is a terrible idea, but it seems to me that it's something we should consider. Instead of this, I'd rather see the shared files approach just dropped completely, and /usr/share/doc/ files for “Multi-Arch: same” packages be installed under /usr/share/doc/pkgname:arch/. Right, that's kind of what I was suggesting, although you've generalised it. It doesn't seem like an unreasonable idea to me. Obviously it would mean that some (Debian-specific) software which currently doesn't need to be multiarch-aware would need to be taught about these new directory names. But that seems like a reasonable price to pay for solving the varying compressed shared files problem. Another relevant question is whether there are other files that are shared, and which don't want to move, besides ones in /usr/share/doc. I haven't been following this in detail but if there are then we may need to retain the possibility to have actually-identical shared files. Ian. Archive: http://lists.debian.org/20274.43538.832430.612...@chiark.greenend.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
Ian Jackson wrote: Another relevant question is whether there are other files that are shared, and which don't want to move, besides ones in /usr/share/doc. One example is header files in /usr/include, from -dev packages. In the simple examples I've seen, putting them in /usr/include/triplet works fine. It is always possible to split off a separate Multi-Arch: foreign -dev-common package if needed in order to save space. Another example is manpages, also in -dev packages. That's more fussy. Archive: http://lists.debian.org/20120208172627.GA6712@burratino
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 05:56:11PM +0100, Guillem Jover wrote: On Wed, 2012-02-08 at 17:29:22 +0100, Mike Hommey wrote: If you remove the shared files approach, how do you handle files like lintian overrides, reportbug presubj and scripts, etc.? The same principle that applies to all dpkg output to avoid ambiguity would apply everywhere, whenever there's a “Multi-Arch: same” package name that needs to be unambiguous, it just always gets arch-qualified. The rest would stay as they are. Having multiple copies of identical files under different arch-qualified names is a major waste of space. Is it really a better architecture to have multiple copies of identical files on user systems? So, at least for lintian and reportbug, given that these file/dir names are package name based they would just get arch-qualified when needed. Another major frustration your no-shared-files proposal adds is the need to split the M-A: same libfoo-dev packages into libfoo-dev-common in order to avoid overwriting /usr/include contents and /usr/bin/foo-config binaries. Our packages are already heavily split, slowing down Packages.gz downloads and all other apt operations. Riku Archive: http://lists.debian.org/20120208175241.ga6...@afflict.kos.to
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012 17:13:20 +0100 Guillem Jover guil...@debian.org wrote: On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: Well, it does mean that you might be lacking important information because the other changelog wouldn't be present on the system. Instead of this, I'd rather see the shared files approach just dropped completely, and /usr/share/doc/ files for “Multi-Arch: same” packages be installed under /usr/share/doc/pkgname:arch/. This would solve all [...]
I agree with this for /usr/share but I don't think the shared files approach should be dropped entirely - /usr/include is one place where it will be very helpful and appears to be working properly already.
[...] these problems in a clean way for the common case with just the two or three mandated files (changelog, changelog.Debian and copyright), if a package provides lots more files then they should be split anyway [...]
This works for shared library packages - it is -dev packages which still need the shared files approach. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 08 Feb 2012, Neil Williams wrote: I don't get it. That would only affect packages which were built during the time that a new upload of gzip is made and all the buildd's making that new version available. Now, if there is a binNMU after a new version of gzip is uploaded, yes it is probably wise to rebuild all architectures if the package includes a Multi-Arch: same library. How often does that happen? Isn't this something that we can test for in the archive, and require rebuilds for all affected packages before entering testing? [Multi-Arch: same with the same path that have differing md5sums?] Even outside of the gzip case, this would catch cases where maintainers had screwed up. Don Armstrong -- The sheer ponderousness of the panel's opinion [...] refutes its thesis far more convincingly than anything I might say. The panel's labored effort to smother the Second Amendment by sheer body weight has all the grace of a sumo wrestler trying to kill a rattlesnake by sitting on it---and is just as likely to succeed. -- Alex Kozinski, Dissenting in Silveira v. Lockyer (CV-00-00411-WBS p5983-4) http://www.donarmstrong.com http://rzlab.ucr.edu Archive: http://lists.debian.org/20120208192528.gd6...@rzlab.ucr.edu
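Don's proposed archive check could look roughly like this (a sketch only; the function name, data shapes, package names and md5 values are all hypothetical, and a real QA tool would read them from the archive's per-arch Packages and md5sums data):

```python
from collections import defaultdict

def find_skew(arch_files):
    """Flag (package, version, path) entries whose md5sums differ between
    architectures -- candidates for a rebuild before entering testing.

    arch_files: {arch: {(pkg, version, path): md5}} covering the shared
    files of Multi-Arch: same packages.
    """
    seen = defaultdict(set)
    for files in arch_files.values():
        for key, md5 in files.items():
            seen[key].add(md5)
    # Any key with more than one distinct checksum is skewed across arches.
    return {key for key, sums in seen.items() if len(sums) > 1}

amd64 = {("libfoo1", "1.2-1", "usr/share/doc/libfoo1/changelog.Debian.gz"): "aaa111"}
i386  = {("libfoo1", "1.2-1", "usr/share/doc/libfoo1/changelog.Debian.gz"): "bbb222"}
print(find_skew({"amd64": amd64, "i386": i386}))
```

This would catch both gzip-version skew and genuine maintainer mistakes, at the cost Guillem describes below: every compressor change silently grows the set of packages needing rebuilds.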
Re: Please test gzip -9n - related to dpkg with multiarch support
Guillem Jover guil...@debian.org writes: On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: Well, it does mean that you might be lacking important information because the other changelog wouldn't be present on the system. While the implicit Replaces seems the easy way out, it just seems even more fragile than the shared files approach. Yes, this is definitely true. I was mentioning it as an easy way out, but it's aesthetically unappealing. And while the binNMU changelog issues might seem like a corner case, it's just a symptom of something that's not quite right. Also true. In fact, it's something that's been bothering me for a long time with linked doc directories. I'd like to prohibit them in more cases so that we get the binNMU changelogs on disk. Instead of this, I'd rather see the shared files approach just dropped completely, and /usr/share/doc/ files for “Multi-Arch: same” packages be installed under /usr/share/doc/pkgname:arch/. This would solve all these problems in a clean way for the common case with just the two or three mandated files (changelog, changelog.Debian and copyright), if a package provides lots more files then they should be split anyway into either a libfooN-common libfoo-doc, or similar. And finally this would not be really confusing, given that one of the last interface changes was to make all dpkg output for all “Multi-Arch: same” packages be always arch-qualified. The only thing I'm worried about here is that we lose something from the UI perspective. That's going to be a change historically from where we've told users to look, and it's a little awkward. But, thinking it over, the set of packages that we're talking about is fairly limited. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ Archive: http://lists.debian.org/87liodrttk@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 11:47:19AM -0800, Russ Allbery wrote: Guillem Jover guil...@debian.org writes: On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: And while the binNMU changelog issues might seem like a corner case, it's just a symptom of something that's not quite right. Also true. In fact, it's something that's been bothering me for a long time with linked doc directories. I'd like to prohibit them in more cases so that we get the binNMU changelogs on disk. Relating to binNMU changelogs: do they really serve any purpose? There are no source changes, so is there any real need for a changelog change at all? AFAICT the only reason we do it is historical: it was previously the only way to effect a version change. We briefly discussed on #debian-buildd last week whether it was possible to use --changes-option to override the distribution during building. If it is also possible to override the version of the generated .debs, this would make it possible to rebuild (NMU) without messing around editing the changelog. Regards, Roger -- .''`. Roger Leigh : :' : Debian GNU/Linux http://people.debian.org/~rleigh/ `. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/ `-GPG Public Key: 0x25BFB848 Please GPG sign your mail. Archive: http://lists.debian.org/20120208195447.gi8...@codelibre.net
Re: Please test gzip -9n - related to dpkg with multiarch support
Riku Voipio riku.voi...@iki.fi writes: On Wed, Feb 08, 2012 at 05:56:11PM +0100, Guillem Jover wrote: The same principle that applies to all dpkg output to avoid ambiguity would apply everywhere, whenever there's a “Multi-Arch: same” package name that needs to be unambiguous, it just always gets arch-qualified. The rest would stay as they are. That is a major waste of space of having multiple copies of identical files with different arch-qualified names. Is that really better architecture to have multiple copies of identical files on user systems? Is it really, though? The files we're talking about are not generally large. I have a hard time seeing a case where the files would be large enough to cause any noticeable issue and you wouldn't want to move them into a separate -common or -doc package anyway. Another major frustration your no-shared-files proposal adds, is the need to split the M-A: same libfoo-dev packages to libfoo-dev-common in order to avoid overwriting /usr/include contents and /usr/bin/foo-config binaries. There are two main cases for libfoo-dev that I think cover most such packages:
1. The header files are architecture-dependent (definitions of data member sizes, for example), in which case they need to be arch-qualified anyway if you're going to allow multiple libfoo-dev packages to be installed for different architectures.
2. The header files are architecture-independent, and the only architecture-dependent content inside libfoo-dev is the static library. In this case, if you want to make libfoo-dev multi-arch, I would advocate seriously considering just dropping the static library and making the -dev package arch: all. I think static libraries are increasingly of very questionable utility on a Linux system. But, assuming that you do want to keep it, you could still move the header files to /usr/include/triplet instead, which is relatively painless.
foo-config binaries, as opposed to pkg-config files, are indeed going to continue to be a problem in model 2, but they're a problem anyway, no? There's no guarantee that the amd64 and i386 version of a package will want the same flags, so we really need some way of having a multiarch-aware version of the -config script. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ Archive: http://lists.debian.org/87haz1rtex@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 11:25:28 -0800, Don Armstrong wrote: On Wed, 08 Feb 2012, Neil Williams wrote: I don't get it. That would only affect packages which were built during the time that a new upload of gzip is made and all the buildd's making that new version available. Now, if there is a binNMU after a new version of gzip is uploaded, yes it is probably wise to rebuild all architectures if the package includes a Multi-Arch: same library. How often does that happen? Isn't this something that we can test for in the archive, and require rebuilds for all affected packages before entering testing? [Multi-Arch: same with the same path that have differing md5sums?] This has for example the following implication: Let's assume compressor (gzip/bzip2/xz/etc) version M gets uploaded to sid generating a reproducible output across all current architectures. Time passes, compressor version N (and even O and P and Q etc) gets uploaded, which starts producing new output (on each of those versions). A new architecture gets added to Debian, and because previous compressor versions are not in the archive anymore, all packages built with them have different checksums than the new ones. This means *all* those packages have to be binNMUed across *all* the architectures, or the porters need to hunt down every specific compressor version used to build those packages to be able to reproduce the build on their arch. This seems highly suboptimal and “future-unproof”... regards, guillem Archive: http://lists.debian.org/20120208200243.ga29...@gaara.hadrons.org
Re: Please test gzip -9n - related to dpkg with multiarch support
Guillem Jover guil...@debian.org writes: Let's assume compressor (gzip/bzip2/xz/etc) version M gets uploaded to sid generating a reproducible output across all current architectures. Time passes, compressor version N (and even O and P and Q etc) gets uploaded, which starts producing new output (on each of those versions). A new architecture gets added to Debian, and because previous compressor versions are not in the archive anymore, all packages built with them have different checksums than the new ones. This means *all* those packages have to be binNMUed across *all* the architectures, or the porters need to hunt down every specific compressor version used to build those packages to be able to reproduce the build on their arch. This seems highly suboptimal and “future-unproof”... Yes, agreed. I think this is a rather compelling argument against relying on reproducibility of compressor output. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ Archive: http://lists.debian.org/87wr7xqdys@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
Roger Leigh rle...@codelibre.net writes: On Wed, Feb 08, 2012 at 11:47:19AM -0800, Russ Allbery wrote: Also true. In fact, it's something that's been bothering me for a long time with linked doc directories. I'd like to prohibit them in more cases so that we get the binNMU changelogs on disk. Relating to binNMU changelogs: do they really serve any purpose? There are no source changes, so is there any real need for a changelog change at all? AFAICT the only reason we do for historical reasons, it being the only way previously to effect a version change. It seems weird not to have a record of *why* the package was rebuilt somewhere, but I guess I can't think of any reason why I would care off the top of my head. In the long run, I really like Guillem's point about making the changelog dpkg metadata. We could still put it somewhere on disk by default, but it would make things like this much cleaner conceptually, or at least it feels like it would on first glance. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ Archive: http://lists.debian.org/874nv1rsko@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 08 Feb 2012 11:56:06 -0800 Russ Allbery r...@debian.org wrote: There are two main cases for libfoo-dev that I think cover most such packages: 1. The header files are architecture-dependent (definitions of data member sizes, for example), in which case they need to be arch-qualified anyway if you're going to allow multiple libfoo-dev packages to be installed for different architectures. 2. The header files are architecture-independent, and the only architecture-dependent content inside libfoo-dev is the static library. So the symlink would have to move to the shared library package alongside the other symlink?
-dev: ./usr/lib/x86_64-linux-gnu/libgpelaunch.so -> libgpelaunch.so.0.0.0
lib: ./usr/lib/x86_64-linux-gnu/libgpelaunch.so.0 -> libgpelaunch.so.0.0.0
That's going to need a Replaces: in the lib against the -dev. pkg-config files are also Multi-Arch sensitive: libdir=${prefix}/lib/x86_64-linux-gnu Those need to be in Multi-Arch paths: ./usr/lib/x86_64-linux-gnu/pkgconfig/libgpelaunch.pc In this case, if you want to make libfoo-dev multi-arch, I would advocate seriously considering just dropping the static library and making the -dev package arch: all. I think static libraries are increasingly of very questionable utility on a Linux system. But, I would drop the .a but that doesn't mean I can make the -dev package Multi-Arch: foreign. foo-config binaries, as opposed to pkg-config files, are indeed going to continue to be a problem in model 2, but they're a problem anyway, no? ... yes... There's no guarantee that the amd64 and i386 version of a package will want the same flags, so we really need some way of having a multiarch-aware version of the -config script. It's not just amd64|i386, Multi-Arch - to me and probably Riku - is about amd64|armel etc. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
Joey Hess jo...@debian.org writes: pristine-tar hat tricks[1] aside, none of gzip, bzip2, xz are required to always produce the same compressed file for a given input file, and I can tell you from experience that there is a wide amount of variation. If multiarch requires this, then its design is at worst broken, and at best, there will be a lot of coordination pain every time there is a new/different version of any of these that happens to compress slightly differently. Maybe there was a reason for Guillem to want to tread carefully. I wanted to come back to this and make a quick comment on it since it's an important point. I completely agree that there were very good reasons for Guillem to want to tread carefully. But I think that having a public alpha test and having other people poke at it is critical, because otherwise that discussion has a tendency to only happen in the head of a few developers, with everyone else tending to assume there aren't problems. We collectively have a lot of wisdom and a lot of experience with edge cases, and I think we needed the project in general to get involved in this discussion. And in practice that usually doesn't happen until there's a thing that people can use and encounter problems with, and that feels like it might become the future unless people object or report bugs. And it also makes it clear what those problems are, so that people who are anxious to see the new features can see publicly what has to be dealt with first before they can be made available. My feeling is that this discussion is exactly the sort of outcome that I was hoping for from a public alpha test. We found a problem that wasn't completely thought-through, and now we're discussing it. It wasn't the *ideal* outcome, which would have been that everything was great and we could move forward into our happy Multi-Arch future, but it's an important and useful outcome.
-- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/87sjilqdrg@windlord.stanford.edu
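The repeatability question Joey raises can be probed directly. With a single gzip version, gzip -9n is deterministic for a given input (the thread's worry is variation *across* versions); a minimal check, using throwaway paths under /tmp:

```shell
# With one and the same gzip binary, -9 (max compression) plus -n (do
# not store the original name/timestamp in the header) produces
# byte-identical output for identical input. The concern in this
# thread is that a *different* gzip version may compress differently.
printf 'hello multiarch\n' > /tmp/ma-demo.txt
gzip -9nc /tmp/ma-demo.txt > /tmp/ma-demo.1.gz
gzip -9nc /tmp/ma-demo.txt > /tmp/ma-demo.2.gz
cmp -s /tmp/ma-demo.1.gz /tmp/ma-demo.2.gz && echo identical
```

Without -n, the header carries a modification time, so even two runs of the same gzip on the same file would differ; that is why the buildds care about -9n specifically.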
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 07:54:47PM +, Roger Leigh wrote: On Wed, Feb 08, 2012 at 11:47:19AM -0800, Russ Allbery wrote: Guillem Jover guil...@debian.org writes: On Wed, 2012-02-08 at 13:19:17 +, Ian Jackson wrote: And while the binNMU changelog issues might seem like a corner case, it's just a symptom of something that's not quite right. Also true. In fact, it's something that's been bothering me for a long time with linked doc directories. I'd like to prohibit them in more cases so that we get the binNMU changelogs on disk. Relating to binNMU changelogs: do they really serve any purpose? There are no source changes, so is there any real need for a changelog change at all? AFAICT the only reason we do for historical reasons, it being the only way previously to effect a version change. We briefly discussed on #debian-buildd last week whether it was possible to use --changes-option to override the distribution during building. If it is also possible to override the version of the generated .debs, this would make it possible to rebuild (NMU) without messing around editing the changelog. There are some source packages that always generate binary versions different from the source version, e.g. linux-latest. They must currently use dpkg-parsechangelog to get the default binary version. If the binNMU is not mentioned in the changelog they would get the source version and would not include any binNMU suffix in the generated binary versions. We could make dpkg-parsechangelog obey a version override too, but it's rather ugly! (Hmm, it looks like linux-latest runs dpkg-parsechangelog at source package build time so it's not binNMU-safe now. But this is fixable so long as dpkg-parsechangelog makes binNMUs recognisable.) Ben. -- Ben Hutchings We get into the habit of living before acquiring the habit of thinking. - Albert Camus -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? 
Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208202317.gd12...@decadent.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 11:56:06 -0800, Russ Allbery wrote: Riku Voipio riku.voi...@iki.fi writes: That is a major waste of space of having multiple copies of identical files with different arch-qualified names. Is that really better architecture to have multiple copies of identical files on user systems? Is it really, though? The files we're talking about are not generally large. I have a hard time seeing a case where the files would be large enough to cause any noticeable issue and you wouldn't want to move them into a separate -common or -doc package anyway. Exactly, in addition this is already an “issue” with lots of packages (regardless of multi-arch) which do not use a common symlinked doc dir. These are some numbers I'm getting on my system (w/ the attached quickly hacked up script), all wild approximations, just to get a feel of it:

Approx. installed m-a:same lib waste (w/o -dev,-doc): 20051501
Approx. installed m-a:same lib waste (w/ -dev,-doc): 23310229
Approx. installed m-a:same lib waste per package (23310229 / 293): 79557.09
Approx. predicted lib waste per arch (779 * 79557.09): 61974973.11
Approx. total lib waste per arch (4003 * 79557.09): 318467031.27

So, supposedly, if all possible libs were to be multiarchified I'd waste 60 MiB in case I wanted to have all of them installed for each architecture I enable. Which is not going to be the case. But if it was, and 60 MiB were such a problem, I could just as well use «dpkg --path-exclude» support. Also I think there's probably some room for improvement which would benefit non-multiarch installations too. For example TODO, USAGE and lots of similar files should be moved to the -dev packages. AUTHORS, THANKS and CREDITS files should probably be already represented in copyright, etc. Probably a lintian warning could be introduced for this.
regards,
guillem

#!/bin/sh

echo "List of files that might be candidates to be split out"
grep-status -n -sPackage -FMulti-Arch same | \
  egrep -v -e '-(dev|doc)' | xargs dpkg -L | grep '\/usr\/share\/' | \
  egrep -v '(copyright|changelog|NEWS|README)' | \
  while read f; do test -f "$f" && printf "$f\0"; done | \
  du -bsch --files0-from -

waste_libs=$(grep-status -n -sPackage -FMulti-Arch same | \
  egrep -v -e '-(dev|doc)' | xargs dpkg -L | grep '\/usr\/share\/' | \
  while read f; do test -f "$f" && printf "$f\0"; done | \
  du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/o -dev,-doc): $waste_libs"

waste_same=$(grep-status -n -sPackage -FMulti-Arch same | \
  xargs dpkg -L | grep '\/usr\/share\/' | \
  while read f; do test -f "$f" && printf "$f\0"; done | \
  du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/ -dev,-doc): $waste_same"

inst_same=$(grep-status -n -sPackage -FMulti-Arch same | wc -l)
waste_per_lib=$(echo "scale=2; $waste_same / $inst_same" | bc -l)
echo "Approx. installed m-a:same lib waste per package ($waste_same / $inst_same): $waste_per_lib"

inst_libs=$(grep-status -n -r -sPackage -FSection libs | \
  egrep -v '(common|data|-bin)' | wc -l)
waste_inst=$(echo "scale=2; $inst_libs * $waste_per_lib" | bc -l)
echo "Approx. predicted lib waste per arch ($inst_libs * $waste_per_lib): $waste_inst"

total_libs=$(grep-aptavail -n -r -sPackage -FSection libs | \
  egrep -v '(common|data|-bin)' | wc -l)
waste_total=$(echo "scale=2; $total_libs * $waste_per_lib" | bc -l)
echo "Approx. total lib waste per arch ($total_libs * $waste_per_lib): $waste_total"
Re: Please test gzip -9n - related to dpkg with multiarch support
* Ben Hutchings b...@decadent.org.uk, 2012-02-08, 20:23: Relating to binNMU changelogs: do they really serve any purpose? There are no source changes, so is there any real need for a changelog change at all? AFAICT the only reason we do for historical reasons, it being the only way previously to effect a version change. We briefly discussed on #debian-buildd last week whether it was possible to use --changes-option to override the distribution during building. If it is also possible to override the version of the generated .debs, this would make it possible to rebuild (NMU) without messing around editing the changelog. There are some source packages that always generate binary versions different from the source version, e.g. linux-latest. They must currently use dpkg-parsechangelog to get the default binary version. If the binNMU is not mentioned in the changelog they would get the source version and would not include any binNMU suffix in the generated binary versions. We could make dpkg-parsechangelog obey a version override too, but it's rather ugly! (Hmm, it looks like linux-latest runs dpkg-parsechangelog at source package build time so it's not binNMU-safe now. But this is fixable so long as dpkg-parsechangelog makes binNMUs recognisable.) Right. What we want to change is not how debian/changelog is being modified by build software, but what happens to it when it lands in /usr/share/doc/$pkgname/. Packages could simply split debian/changelog into two parts:

/u/s/d/$p/changelog(.Debian).gz - the architecture-independent part;
/u/s/d/$p/changelog.binNMU-$arch.gz - the binNMU part.

That would have to be implemented in dh_installchangelogs and a few M-A: same packages that don't use debhelper. -- Jakub Wilk -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208203432.ga4...@jwilk.net
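Jakub's proposed split can be sketched in a few lines of shell. This is entirely hypothetical: the entry-matching pattern, sample changelog, and output file names are illustrative, not what dh_installchangelogs would actually use:

```shell
# Build a toy changelog whose top entry is a binNMU (version +b1).
cat > /tmp/changelog <<'EOF'
foo (1.0-1+b1) sid; urgency=low

  * Binary-only non-maintainer upload for amd64; no source changes.

 -- buildd <buildd@example.org>  Wed, 08 Feb 2012 00:00:00 +0000

foo (1.0-1) unstable; urgency=low

  * Initial release.

 -- Maintainer <m@example.org>  Tue, 07 Feb 2012 00:00:00 +0000
EOF

# Changelog entries start in column 0 as "pkg (version) dist; ...".
# Send the first entry (the binNMU) to one file, the rest to another.
awk 'NR > 1 && /^[a-z0-9][a-z0-9.+-]* \(/ { part = 1 }
     { if (part) print > "/tmp/changelog.indep";
       else      print > "/tmp/changelog.binNMU" }' /tmp/changelog
```

After this, /tmp/changelog.binNMU holds only the +b1 entry and /tmp/changelog.indep holds the architecture-independent history, mirroring the proposed changelog.binNMU-$arch.gz / changelog(.Debian).gz split.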
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 09:02:43PM +0100, Guillem Jover wrote: Let's assume compressor (gzip/bzip2/xz/etc) version M gets uploaded to sid generating a reproducible output across all current architectures. Time passes, compressor version N (and even O and P and Q etc) gets uploaded, which starts producing new output (on each of those versions). A new architecture gets added to Debian, and because previous compressor versions are not in the archive anymore, all packages built with them have different checksums than the new ones. This means *all* those packages have to be binNMUed across *all* the architectures, or the porters need to hunt down every specific compressor version used to build those packages to be able to reproduce the build on their arch. You are making the assumption that compressor versions M, N, O, P and Q happen during a timeframe shorter than libraries are uploaded/binNMU'd in Debian. Gzip development is glacial. The last upstream-version-changing uploads in Debian were 1.3.12 in 2007 and 1.4 in 2011. Most changes in gzip code shouldn't even affect the output of gzip -n9, as they are mostly fixes to make gzip build in a world changing around it. New architectures don't happen every day, while library maintenance does. Throwing away the "allow identical files in Multi-Arch: same packages" convenience for library maintainers to make including new architectures a bit easier is a rather daft tradeoff. -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208205724.ga7...@afflict.kos.to
Re: Please test gzip -9n - related to dpkg with multiarch support
* Guillem Jover guil...@debian.org, 2012-02-08, 21:02: Let's assume compressor (gzip/bzip2/xz/etc) version M gets uploaded to sid generating a reproducible output across all current architectures. Time passes, compressor version N (and even O and P and Q etc) gets uploaded, which starts producing new output (on each of those versions). A new architecture gets added to Debian, and because previous compressor versions are not in the archive anymore, all packages built with them have different checksums than the new ones. This means *all* those packages have to be binNMUed across *all* the architectures, or the porters need to hunt down every specific compressor version used to build those packages to be able to reproduce the build on their arch. In practice, the only compressor we need to care about is gzip, which is not actively maintained upstream[0]. Chances that a new version of it will break a large number of packages are minute. But anyway, I believe that in the long run we should simply deprecate compressing stuff in /usr/share/doc/. [0] From http://lists.gnu.org/archive/html/bug-gzip/2010-02/msg00044.html: [...] gzip is in maintenance-only mode. [...] I am working on gzip solely to fix bugs and maintain a certain level of portability and robustness. -- Jakub Wilk -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120208210123.ga5...@jwilk.net
Re: Please test gzip -9n - related to dpkg with multiarch support
On 2012-02-08, Russ Allbery r...@debian.org wrote: There are two main cases for libfoo-dev that I think cover most such packages: 3) to ensure that things can keep working on slow archs while they build a new edition of src:foo /Sune -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/slrnjj5q48.p7v.nos...@sshway.ssh.pusling.com
Re: Please test gzip -9n - related to dpkg with multiarch support
Neil Williams codeh...@debian.org writes: Russ Allbery r...@debian.org wrote: There are two main cases for libfoo-dev that I think cover most such packages: 1. The header files are architecture-dependent (definitions of data member sizes, for example), in which case they need to be arch-qualified anyway if you're going to allow multiple libfoo-dev packages to be installed for different architectures. 2. The header files are architecture-independent, and the only architecture-dependent content inside libfoo-dev is the static library. So the symlink would have to move to the shared library alongside the other symlink? -dev: ./usr/lib/x86_64-linux-gnu/libgpelaunch.so - libgpelaunch.so.0.0.0 lib: ./usr/lib/x86_64-linux-gnu/libgpelaunch.so.0 - libgpelaunch.so.0.0.0 Oh, good point, I'd forgotten that for multiarch the symlink is architecture-dependent. So yes, the -dev package is inherently architecture-dependent. We can't move the symlink to the shared library package because then the shared library packages aren't co-installable. pkg-config files are also Multi-Arch sensitive: libdir=${prefix}/lib/x86_64-linux-gnu Those need to be in Multi-Arch paths: ./usr/lib/x86_64-linux-gnu/pkgconfig/libgpelaunch.pc Correct. Anyone converting a library to multiarch should already be moving them, IMO. I have with all of mine. I would drop the .a but that doesn't mean I can make the -dev package Multi-Arch: foreign. You're right. -dev packages are going to have to be multi-arch: same. I think the hardest problem is then going to be the documentation (including man pages) that are normally now in the -dev package, and any -config scripts or the like. We already have multiarch path solutions for header files and for the symlinks, although it requires duplicating the header files for each architecture. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? 
Contact listmas...@lists.debian.org Archive: http://lists.debian.org/87d39pq9ph@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 11:56:06AM -0800, Russ Allbery wrote: Riku Voipio riku.voi...@iki.fi writes: On Wed, Feb 08, 2012 at 05:56:11PM +0100, Guillem Jover wrote: The same principle that applies to all dpkg output to avoid ambiguity would apply everywhere, whenever there's a “Multi-Arch: same” package name that needs to be unambiguous, it just always gets arch-qualified. The rest would stay as they are. That is a major waste of space of having multiple copies of identical files with different arch-qualified names. Is that really better architecture to have multiple copies of identical files on user systems? Is it really, though? The files we're talking about are not generally large. I have a hard time seeing a case where the files would be large enough to cause any noticeable issue and you wouldn't want to move them into a separate -common or -doc package anyway. So I had a look at the Ubuntu archive, which already has a large collection of packages converted to Multi-Arch: same, to provide some hard facts for this discussion.

- 1219 binary packages are marked Multi-Arch: same
- 2197 files are shipped in /usr/share by these packages, outside of /usr/share/doc - which, by and large, are files that can actually be shared between architectures.
- These files are distributed between 47 different subdirectories:

    703 ./usr/share/man
    604 ./usr/share/ada
    187 ./usr/share/lintian
    185 ./usr/share/locale
     93 ./usr/share/alsa
     70 ./usr/share/gtk-doc
     53 ./usr/share/bug
     36 ./usr/share/qt4
     35 ./usr/share/libtool
     22 ./usr/share/themes
     16 ./usr/share/lua
     16 ./usr/share/libphone-ui-shr
     15 ./usr/share/aclocal
     14 ./usr/share/icons
     11 ./usr/share/pam-configs
     11 ./usr/share/info
     10 ./usr/share/vala
      9 ./usr/share/gtk-engines
      8 ./usr/share/qalculate
      8 ./usr/share/OGRE
      7 ./usr/share/xml
      7 ./usr/share/libwacom
      7 ./usr/share/gir-1.0
      7 ./usr/share/dbconfig-common
      6 ./usr/share/mupen64plus
      6 ./usr/share/libgphoto2
      5 ./usr/share/pixmaps
      5 ./usr/share/openchange
      4 ./usr/share/mime-info
      4 ./usr/share/menu
      4 ./usr/share/libofx4
      4 ./usr/share/gstreamer-0.10
      3 ./usr/share/java
      3 ./usr/share/gconf
      3 ./usr/share/gcc-4.6
      3 ./usr/share/applications
      2 ./usr/share/guile
      2 ./usr/share/application-registry
      1 ./usr/share/tdsodbc
      1 ./usr/share/psqlodbc
      1 ./usr/share/pascal
      1 ./usr/share/libpam-ldap
      1 ./usr/share/libmyodbc
      1 ./usr/share/libaudio2
      1 ./usr/share/kde4
      1 ./usr/share/gst-plugins-base
      1 ./usr/share/avahi

- For many of these files, it would be actively harmful to use architecture-qualified filenames. Manpages included in -dev packages should not change names based on the architecture; having /usr/share/pam-config contain multiple files for the same profile, one for each architecture of the package that's installed, would not work correctly; etc.
- If we needed to split the arch-indep contents out of the M-A: same package instead of reference counting in dpkg, that would be roughly 170 new binary packages. 139 of them would contain 10 files or less (exclusive of /usr/share/doc).

I think there are pretty solid benefits to proceeding with a dpkg that allows sharing files across M-A: same packages. Even if we decided we couldn't rely on gzip, there are still lots of other cases where this matters.
And besides, consider that a M-A: same package shipping contents in a non-architecture-qualified path that vary by architecture is *always* a bug in that package, which will need to be fixed. Requiring that M-A: same packages don't use non-architecture-qualified paths even for files which *don't* vary by architecture doesn't help much to ensure that we won't have bugs. It would be easier for lintian to spot errors in M-A: same packages if we can say that any file that doesn't have an architecture-qualified path is buggy, but at this point we already have Jakub's reports anyway, which we could make a regular part of our archive consistency checks. So I don't believe that having dpkg be more strict about files that *could* be shared will make the user experience any better; it just presents more occasions for packages to be regarded as buggy and for dpkg to error out. foo-config binaries, as opposed to pkg-config files, are indeed going to continue to be a problem in model 2, but they're a problem anyway, no? Yes, they definitely are. There's no guarantee that the amd64 and i386 version of a package will want the same flags, so we really need some way of having a multiarch-aware version of the -config script. Preferably by s/foo/pkg/. pkgconfig gets this right, the standalone tools all get it wrong, there's no good reason not to just replace them with pkgconfig. -- Steve Langasek Give me a lever long
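The sharing rule being argued over (dpkg refcounting a path shipped by several M-A: same packages only when the file contents are byte-identical, and erroring out otherwise) can be illustrated with a toy comparison; the two directory trees and file names below are hypothetical stand-ins for two architecture instances of one package:

```shell
# Simulate libexample:amd64 and libexample:i386 both shipping the same
# /usr/share path. dpkg's proposed rule: byte-identical -> share one
# copy and bump a refcount; different -> refuse the install.
mkdir -p /tmp/ma-a/usr/share/libexample /tmp/ma-b/usr/share/libexample
echo 'shared data' > /tmp/ma-a/usr/share/libexample/data
echo 'shared data' > /tmp/ma-b/usr/share/libexample/data

if cmp -s /tmp/ma-a/usr/share/libexample/data \
          /tmp/ma-b/usr/share/libexample/data
then echo "identical: refcount, install once"
else echo "differs: error out"
fi
```

The gzip discussion earlier in the thread is exactly about how often the "differs" branch would fire for files that are semantically the same but were compressed by different tool versions.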
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 10:22:17AM +, Neil Williams wrote: After all, this is how cross/foreign architecture packages have *always* been handled in Debian via dpkg-cross. Nothing in /usr/share/ matters for a cross package created by dpkg-cross (with the possible exception of /usr/share/pkg-config which was always anachronistic). Some template files are added but the package name includes the architecture, so these files are effectively in multiarch paths. There is nothing useful in /usr/share of a Multiarch: same package when installed as foreign architecture package. Emdebian dpkg-cross have proved that by having nothing else until Multi-Arch. Anything you might need is in the native architecture package, so the best thing to do is widen the implicit exclusion to all of /usr/share in the incoming Multi-Arch: same package. The unfounded assumption here is that you will always install a foreign-arch M-A: same package together with the native-arch version. If I install libaudio2:i386 because I want to play a game that's only available as a 32-bit binary and has this lib as a dependency, and nothing else on my system uses libaudio2, I still expect to get /usr/share/libaudio2/AuErrorDB installed. In general, anything that introduces asymmetric handling between native and foreign arch packages at the dpkg level is probably going to be a bad idea. -- Steve Langasek Give me a lever long enough and a Free OS Debian Developer to set it on, and I can move the world. Ubuntu Developer http://www.debian.org/ slanga...@ubuntu.com vor...@debian.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 07:54:47PM +, Roger Leigh wrote: Relating to binNMU changelogs: do they really serve any purpose? There are no source changes, so is there any real need for a changelog change at all? AFAICT the only reason we do for historical reasons, it being the only way previously to effect a version change. Hi, I think that it is important for maintainers that the person who decided the binNMU, and the reason for it, are recorded in the changelog, because often these changes are not triggered or coordinated by the maintainers themselves. Have a nice day, -- Charles Plessy Debian Med packaging team, http://www.debian.org/devel/debian-med Tsurumi, Kanagawa, Japan -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120209005416.gb17...@falafel.plessy.net
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek vor...@debian.org writes: The unfounded assumption here is that you will always install a foreign-arch M-A: same package together with the native-arch version. If I install libaudio2:i386 because I want to play a game that's only available as a 32-bit binary and has this lib as a dependency, and nothing else on my system uses libaudio2, I still expect to get /usr/share/libaudio2/AuErrorDB installed. How is that not a serious policy violation already? AuErrorDB isn't versioned with the SONAME, so libaudio2 and libaudio3 would not be coinstallable. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/874nv0n7vd@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, Feb 08, 2012 at 04:55:02PM -0800, Russ Allbery wrote: Steve Langasek vor...@debian.org writes: The unfounded assumption here is that you will always install a foreign-arch M-A: same package together with the native-arch version. If I install libaudio2:i386 because I want to play a game that's only available as a 32-bit binary and has this lib as a dependency, and nothing else on my system uses libaudio2, I still expect to get /usr/share/libaudio2/AuErrorDB installed. How is that not a serious policy violation already? AuErrorDB isn't versioned with the SONAME, so libaudio2 and libaudio3 would not be coinstallable. Because libaudio2 is in the directory name. Also, it's not a policy violation for a library package to contain files that don't have sensibly versioned names; it's only a policy violation for the name to not change on soname bump. So even if this were called /usr/share/AuErrorDB, it could be changed to /usr/share/libaudio3/AuErrorDB on soname change and still be compliant. -- Steve Langasek Give me a lever long enough and a Free OS Debian Developer to set it on, and I can move the world. Ubuntu Developerhttp://www.debian.org/ slanga...@ubuntu.com vor...@debian.org signature.asc Description: Digital signature
Re: Please test gzip -9n - related to dpkg with multiarch support
Steve Langasek vor...@debian.org writes: On Wed, Feb 08, 2012 at 04:55:02PM -0800, Russ Allbery wrote: Steve Langasek vor...@debian.org writes: The unfounded assumption here is that you will always install a foreign-arch M-A: same package together with the native-arch version. If I install libaudio2:i386 because I want to play a game that's only available as a 32-bit binary and has this lib as a dependency, and nothing else on my system uses libaudio2, I still expect to get /usr/share/libaudio2/AuErrorDB installed. How is that not a serious policy violation already? AuErrorDB isn't versioned with the SONAME, so libaudio2 and libaudio3 would not be coinstallable. Because libaudio2 is in the directory name. Oh, duh. Sorry, I'm just blind. Also, it's not a policy violation for a library package to contain files that don't have sensibly versioned names; it's only a policy violation for the name to not change on soname bump. So even if this were called /usr/share/AuErrorDB, it could be changed to /usr/share/libaudio3/AuErrorDB on soname change and still be compliant. Good point. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/87ty30iz9a@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
+++ Russ Allbery [2012-02-08 13:47 -0800]: Neil Williams codeh...@debian.org writes: Russ Allbery r...@debian.org wrote: Oh, good point, I'd forgotten that for multiarch the symlink is architecture-dependent. So yes, the -dev package is inherently architecture-dependent. I would drop the .a but that doesn't mean I can make the -dev package Multi-Arch: foreign. You're right. -dev packages are going to have to be multi-arch: same. I think the hardest problem is then going to be the documentation (including man pages) that are normally now in the -dev package, and any -config scripts or the like. We already have multiarch path solutions for header files and for the symlinks, although it requires duplicating the header files for each architecture. This part of the thread is getting into the general problem of what exactly the spec and guidelines for packagers should be for multi-arch -dev packages. We have purposely concentrated so far on the library packages in the detailed spec and left the -dev packaging details a bit vague to see exactly what the issues were. (And it is rather less important for -dev packages to actually be co-installable, because you can get by by just installing one or the other for cross-building purposes, although co-installability is definitely desirable IMHO.) I was thinking of starting a thread on this anyway soon, but as we are now discussing it anyway it might be a good time to go over the various issues. Some of the issues are already clear I think (moving arch-dependent headers into arch-qualified dirs, but leaving the others where they are), but the docs haven't caught up, and there are some trickier bits (like foo-config files where upstream don't want to move to pkg-config, and to what degree it is worthwhile making all -dev packages co-installable) that need some discussion, and packages that probably just want splitting up (ones with a lot of binary utilities in). I'll start a new thread with some doc pointers and a list of issues.
Wookey -- Principal hats: Linaro, Emdebian, Wookware, Balloonboard, ARM http://wookware.org/ -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120209013357.gh14...@dream.aleph1.co.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 15:14:35 -0800, Steve Langasek wrote: So I had a look at the Ubuntu archive, which already has a large collection of packages converted to Multi-Arch: same, to provide some hard facts for this discussion. - 2197 files are shipped in /usr/share by these packages, outside of /usr/share/doc - which, by and large, are files that can actually be shared between architectures. - These files are distributed between 47 different subdirectories: 703 ./usr/share/man 11 ./usr/share/info 3 ./usr/share/java These three are always compressed so would need to be split anyway. 187 ./usr/share/lintian 53 ./usr/share/bug 4 ./usr/share/mime-info 4 ./usr/share/menu 3 ./usr/share/applications These should usually be pkgname based, thus can be just kept arch-qualified. I've not checked the rest in detail, but just with these, the 2197 files get reduced to 1229 which might not need moving out otherwise. - For many of these files, it would be actively harmful to use architecture-qualified filenames. Manpages included in -dev packages should not change names based on the architecture; having /usr/share/pam-config contain multiple files for the same profile, one for each architecture of the package that's installed, would not work correctly; etc. I said that arch-qualifying should apply for things that are currently pkgname based, but never that this should be used to avoid any file conflict, for the rest the correct solution would be to just split them out. - If we needed to split the arch-indep contents out of the M-A: same package instead of reference counting in dpkg, that would be roughly 170 new binary packages. 139 of them would contain 10 files or less (exclusive of /usr/share/doc). Given that several of those would need to be created regardless due to the many compressed files above, and several others do not need to be split at all, the resulting number of packages does not seem onerous to me at all, it actually seems like the right thing to do, after all. 
Riku mentioned as an argument that this increases the data to download due to slightly bigger Packages files, but pdiffs were introduced exactly to fix that problem. And, as long as the packages do not get updated one should not get pdiff updates. And with the splitting of Description there's even less data to download now. I think there are pretty solid benefits to proceeding with a dpkg that allows sharing files across M-A: same packages. Even if we decided we couldn't rely on gzip, there are still lots of other cases where this matters. While there's obviously some benefits, otherwise we'd not have considered shared files an option at all, I don't think they outweigh at all the problems and fragility they introduce. And besides, consider that a M-A: same package shipping contents in a non-architecture-qualified path that vary by architecture is *always* a bug in that package, which will need to be fixed. Requiring that M-A: same packages don't use non-architecture-qualified paths even for files which *don't* vary by architecture doesn't help much to ensure that we won't have bugs. It would be easier for lintian to spot errors in M-A: same packages if we can say that any file that doesn't have an architecture-qualified path is buggy, but at this point we already have Jakub's reports anyway, which we could make a regular part of our archive consistency checks. So I don't believe that having dpkg be more strict about files that *could* be shared will make the user experience any better; it just presents more occasions for packages to be regarded as buggy and for dpkg to error out. W/o automatic checks or actual installation testing any such issues can be introduced, this is not specific to M-A: same packages, we do have similar problems when moving files around two packages, or when stomping over other package namespaces, etc. 
Adding shared file support into dpkg introduces additional unneeded complexity that can never be taken out, and which it seems clear to me should be dealt with at the package level instead. regards, guillem -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120209023428.ga6...@gaara.hadrons.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 22:01:23 +0100, Jakub Wilk wrote: In practice, the only compressor we need to care about is gzip, which is not actively maintained upstream[0]. Chances that a new version of it will break a large number of packages are minute. That assumes that we will never want to switch to a better/faster compressor for any gzip-compressed file. Or that there are no existing files compressed with anything other than gzip. But anyway, I believe that in the long run we should simply deprecate compressing stuff in /usr/share/doc/. So the main reason people are arguing for shared files boils down to used size, either in installed files, or Packages files, etc., yet you want to fix the compression issue by not compressing them and using even more space? While this could benefit multiarch installations (for which they can easily use --path-exclude), it would use lots more space on single-arch installations. Also, splitting files into new arch:all packages should usually reduce archive size usage, for example. regards, guillem Archive: http://lists.debian.org/20120209024549.gb6...@gaara.hadrons.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Feb 09, Steve Langasek vor...@debian.org wrote: I think there are pretty solid benefits to proceeding with a dpkg that allows sharing files across M-A: same packages. Agreed. Fix the tools instead of breaking the standard to adapt to broken tools. Myself, I like the idea of the implicit Replaces. -- ciao, Marco
Re: Please test gzip -9n - related to dpkg with multiarch support
On Mon, 6 Feb 2012 08:31:15 +0100 Raphael Hertzog hert...@debian.org wrote: If you discover any bug in dpkg's multiarch implementation, please report it to the BTS (against the version 1.16.2~wipmultiarch). I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. Some MultiArch: same packages in the archive (libppl9 is the one I came across first) contain .gz files in ./usr/share/doc/ which differ between architectures when, AFAICT, the original/decompressed file does not. I.e. this isn't a bug in libppl9. Strangely, unpacking the .deb, decompressing these files and then recompressing them with gzip -9nf changes the checksum of the .gz file *to match the other architectures*. E.g. the armel package has a bad .gz file, the armhf has a good one. The kfreebsd-amd64 package has a bad .gz file, the amd64 has a good one. If that matrix was flipped diagonally, it might make more sense. ;-) The bad checksums also *match* between armel and kfreebsd-amd64. armel, kfreebsd-amd64: 0e52e84eebf41588865742edaff7b3c0 usr/share/doc/libppl9/CREDITS.gz armhf, i386, amd64: 99e2b9f8972ce00cfe57e3735881015e usr/share/doc/libppl9/CREDITS.gz By bad, I mean that the .gz file, when decompressed and recompressed, changes checksum to match the other architecture. It appears to be a boolean change, not random or N-ary. In this case, it also changes the filesize: armel, kfreebsd-amd64: 6344 2011-02-27 09:07 ./usr/share/doc/libppl9/CREDITS.gz armhf, i386, amd64: 6343 2011-02-27 09:07 ./usr/share/doc/libppl9/CREDITS.gz (Jakub Wilk originally spotted a checksum change without a filesize change, so filesize is not the best indicator, hence the checksum test.) Decompress and recompress the file from the kfreebsd-amd64 or armel packages on amd64 or armel and the filesize changes back to 6343 and the checksum changes to that of amd64/armhf/i386 etc., making the bug very hard to reproduce.
The change does not happen in reverse, neither can I regenerate the .gz file with the original checksum on the architecture which showed the original problem. Once the bad checksum changes to the good one, repeating the compression retains the good checksum. (The .gz file with the changed checksum really is different - it is one byte larger and 3 bytes differ.) I've run the test script for a couple of hundred iterations and the checksum always changes after the first decompress+compress cycle but never changes back. So far, I've tried this on abel.debian.org, inside and outside the sid chroot, and on amd64. Either the armel or kfreebsd-amd64 package can be unpacked and the CREDITS.gz file decompressed and recompressed - the filesize and checksum change to the values seen on armhf and amd64. Can someone spot whether I've made a mess of the test script or whether there is something else going on here? http://people.debian.org/~codehelp/gzip.sh.txt http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=647522 It would be a very laborious task to check the md5sums of every .gz file in /usr/share/doc in every MultiArch: same package across all architectures and the Contents-* files on the mirrors don't contain the filesize of the listed files. Does anyone have ideas on how to scan the archive for this kind of problem? If we can't pin this down, it is going to make MultiArch very hard to deliver - any package build could make some MultiArch combinations uninstallable in ways that are very hard to detect in advance, causing entire dependency chains to fail to install. The manifestation of the issue in libppl9 is clear when trying to install the MultiArch build-dependencies for cross-compilers: $ sudo apt-get install libcloog-ppl-dev:armel Selecting previously unselected package libppl9:armel. (Reading database ... 167711 files and directories currently installed.) Unpacking libppl9:armel (from .../libppl9_0.11.2-6_armel.deb) ... 
dpkg: error processing /var/cache/apt/archives/libppl9_0.11.2-6_armel.deb (--unpack): './usr/share/doc/libppl9/CREDITS.gz' is different from the same file on the system This then leaves the installation in a broken state and needs careful manual intervention to remove the dependencies of the broken package, as `apt-get -f install` wrongly tries to just reinstall the libppl9:armel package again. dpkg is correct in its current handling - the files really are different. The problem is that the uncompressed file is not. Comment from Paul Eggert: I should add that it's OK (from the point of view of the RFCs) if gzip produces different outputs given the same inputs when compressing. The RFCs allow that and presumably other gzip implementations do that. All that's required is that compress+decompress result in a copy of the original. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=647522#20 What we're seeing here are differences after decompress+compress but without a reproducible test for this bug,
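Neil's round-trip check can be sketched as a self-contained script (a simplified stand-in for his gzip.sh; the file name and contents below are invented, not taken from libppl9):

```shell
# Simplified stand-in for the round-trip test described above.
# With a single gzip build, decompress+recompress should be a fixed
# point; the reported bug is that the *first* compression differs
# between some architectures' gzip builds.
set -e
printf '%s\n' "some doc content" "more doc content" > credits.txt
gzip -9nc credits.txt > first.gz
gzip -dc first.gz | gzip -9n > second.gz
cmp first.gz second.gz && echo "round trip stable"
md5sum first.gz second.gz
```

Running the same script on two architectures and comparing the md5sum of first.gz is essentially the cross-architecture check the bug report calls for.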
Re: Please test gzip -9n - related to dpkg with multiarch support
On 07.02.2012 10:59, Neil Williams wrote: It would be a very laborious task to check the md5sums of every .gz file in /usr/share/doc in every MultiArch: same package across all architectures and the Contents-* files on the mirrors don't contain the filesize of the listed files. Does anyone have ideas on how to scan the archive for this kind of problem? This might be interesting http://lists.debian.org/debian-devel/2011/11/msg00508.html -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth?
Re: Please test gzip -9n - related to dpkg with multiarch support
Neil Williams wrote: I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. pristine-tar hat tricks[1] aside, none of gzip, bzip2, xz are required to always produce the same compressed file for a given input file, and I can tell you from experience that there is a wide amount of variation. If multiarch requires this, then its design is at worst broken, and at best, there will be a lot of coordination pain every time there is a new/different version of any of these that happens to compress slightly differently. Maybe there was a reason for Guillem to want to tread carefully. -- see shy jo [1] Tricks including but not limited to: Small binary diffs and embedding specific versions of these programs' compressors as needed.
Re: Please test gzip -9n - related to dpkg with multiarch support
On 07.02.2012 18:07, Joey Hess wrote: Neil Williams wrote: I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. pristine-tar hat tricks[1] aside, none of gzip, bzip2, xz are required to always produce the same compressed file for a given input file, and I can tell you from experience that there is a wide amount of variation. If multiarch requires this, then its design is at worst broken, and at best, there will be a lot of coordination pain every time there is a new/different version of any of these that happens to compress slightly differently. This seems to be a rather common problem as evidenced by e.g. https://bugs.launchpad.net/ubuntu/+source/clutter-1.0/+bug/901522 https://bugs.launchpad.net/ubuntu/+source/libtasn1-3/+bug/889303 https://bugs.launchpad.net/ubuntu/oneiric/+source/pam/+bug/871083 In Ubuntu they started to work around that by excluding random files from being compressed. So far I refused to add those hacks to the Debian package as this needs to be addressed properly. Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth?
Re: Please test gzip -9n - related to dpkg with multiarch support
Hi Joey, On Tue, Feb 07, 2012 at 01:07:11PM -0400, Joey Hess wrote: Neil Williams wrote: I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. pristine-tar hat tricks[1] aside, none of gzip, bzip2, xz are required to always produce the same compressed file for a given input file, and I can tell you from experience that there is a wide amount of variation. If multiarch requires this, then its design is at worst broken, and at best, there will be a lot of coordination pain every time there is a new/different version of any of these that happens to compress slightly differently. The relevant multiarch invariants are: - if a multiarch package is to be installed for more than one architecture, all architectures of the package must be at the same version - a file shipped at the same location by more than one architecture of the package must be identical across all architectures These are reasonable constraints that spare the package manager from having to try to arbitrate which is the right version of the package/file, or distinguish between cases where the differences across architectures matter vs. cases where they don't. However, where this gets tangled is with the one compressed file required for all packages by policy: /usr/share/doc/$pkg/changelog(.Debian).gz. There are various ways to meet both the multiarch constraints and the policy requirements, including shipping an arch:all common package that will contain the changelog for each multiarch library, which then ships just a symlink. But that's a lot of package proliferation for a fairly small corner case. If we *could* ensure that the same input file produced the same output when compressed /with the same version of the tool/ regardless of architecture, that would be sufficient. 
(Having to occasionally do a sourceful upload due to gzip version skew across architectures is certainly much cheaper than the alternatives.) At this stage, I have no reason to think that's not achievable, though no one seems to have dived very deep into the bug yet. And whether gzip upstream agrees this is a reasonable invariant to uphold, I don't know. -- Steve Langasek Give me a lever long enough and a Free OS Debian Developer to set it on, and I can move the world. Ubuntu Developer http://www.debian.org/ slanga...@ubuntu.com vor...@debian.org
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, 07 Feb 2012 19:11:16 +0100 Michael Biebl bi...@debian.org wrote: On 07.02.2012 18:07, Joey Hess wrote: Neil Williams wrote: I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. pristine-tar hat tricks[1] aside, none of gzip, bzip2, xz are required to always produce the same compressed file for a given input file, and I can tell you from experience that there is a wide amount of variation. If multiarch requires this, then its design is at worst broken, and at best, there will be a lot of coordination pain every time there is a new/different version of any of these that happens to compress slightly differently. Exactly. I'm not convinced that this is fixable at the gzip level, nor is it likely to be fixable by the trauma of changing from gzip to something else. That would be pointless. What matters, to me, is that package installations do not fail somewhere down the dependency chain in ways which are difficult to fix. Compression is used to save space, not to provide unique identification of file contents. Since it is now clear that the compression is getting in the way of dealing with files which are (in terms of their actual *usable* content) identical, the compression needs to be taken out of the comparison operation. Where the checksum matches, that's all well and good (problems with md5sum collisions aside); where it does not match, dpkg cannot deem that the files conflict without creating a checksum based on the decompressed content of the two files. A checksum failure of a compressed file is clearly unreliable and will generate dozens of unreproducible bugs. MultiArch has many benefits, but saving space is not why MultiArch exists, and systems which will use MultiArch in anger are not likely to be short of either RAM or swap space.
Yes, the machines which are *targeted* by the builds which occur as a result of having MultiArch available for Emdebian will definitely be aimed at low-resource devices, but those devices do NOT need to actually use MultiArch themselves. In the parlance of --build, --host and autotools, MultiArch is a build tool, not a host mechanism. If you've got the resources to cross-build something, you have the resources to checksum the decompressed content of some files. As far as having MultiArch to install non-free i386 on amd64 goes, it is less of a problem simply because the number of packages installed as MultiArch packages is likely to be a lot smaller. Even so, although the likelihood drops, the effect of one of these collisions getting through is the same. This seems to be a rather common problem as evidenced by e.g. https://bugs.launchpad.net/ubuntu/+source/clutter-1.0/+bug/901522 https://bugs.launchpad.net/ubuntu/+source/libtasn1-3/+bug/889303 https://bugs.launchpad.net/ubuntu/oneiric/+source/pam/+bug/871083 See the number of .gz files in this list: http://people.debian.org/~jwilk/multi-arch/same-md5sums.txt In Ubuntu they started to work around that by excluding random files from being compressed. So far I refused to add those hacks to the Debian package as this needs to be addressed properly. Maybe the way to solve this properly is to remove compression from the uniqueness check - compare the contents of the file in memory after decompression. Yes, it will take longer but it is only needed when the md5sum (which already exists) doesn't match. The core problem is that the times when the md5sum of the compressed file won't match are unpredictable. No workaround is going to be reliable because there is no apparent logic to the files which become affected, and any file which was affected at libfoo0_1.2.3 could well be completely blameless in libfoo0_1.2.3+b1.
(binNMUs aren't the answer either because that could just as easily transfer the bug from libfoo0 to libfoo-dev and so on.) There appears to be plenty of evidence that checksums of compressed files are only useful until the checksums fail to match, at which point I think dpkg will just have to fall back to decompressing the contents in RAM / swap and doing a fresh checksum on the contents of each contentious compressed file. If the checksums of the contents match, the compressed file on the filesystem wins. Anything else and Debian loses all the reproducibility which is so important to developers and users. When I need to make a cross-building chroot from unstable (or write a tool for others to create such chroots), it can't randomly fail today, work tomorrow and fail with some other package the day after. If others agree, I think that bug #647522, currently open against gzip, could be reassigned to dpkg and retitled to not rely on checksums for compressed files when determining MultiArch file collisions. -- Neil Williams = http://www.linux.codehelp.co.uk/
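The fallback Neil proposes can be sketched in a few lines of shell (invented file names; this is not dpkg code, just the comparison logic): when the compressed bytes differ, checksum the decompressed content before declaring a conflict. Here the header difference is simulated by compressing once with -n and once without, so the stored timestamp differs while the payload does not.

```shell
# Sketch of the proposed fallback (invented file names; not dpkg code).
set -e
printf 'identical payload\n' > payload
gzip -9nc payload > one.gz    # -n: no name/timestamp in the header
gzip -9c  payload > two.gz    # without -n, name and mtime are stored
if ! cmp -s one.gz two.gz; then
    a=$(gzip -dc one.gz | md5sum | cut -d' ' -f1)
    b=$(gzip -dc two.gz | md5sum | cut -d' ' -f1)
    if [ "$a" = "$b" ]; then
        echo "compressed bytes differ, contents identical: no conflict"
    fi
fi
```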
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, 7 Feb 2012 14:01:57 -0800 Steve Langasek vor...@debian.org wrote: There are various ways to meet both the multiarch constraints and the policy requirements, including shipping an arch:all common package that will contain the changelog for each multiarch library, which then ships just a symlink. But that's a lot of package proliferation for a fairly small corner case. If we *could* ensure that the same input file produced the same output when compressed /with the same version of the tool/ regardless of architecture, that would be sufficient. (Having to occasionally do a sourceful upload due to gzip version skew across architectures is certainly much cheaper than the alternatives.) At this stage, I have no reason to think that's not achievable, though no one seems to have dived very deep into the bug yet. And whether gzip upstream agrees this is a reasonable invariant to uphold, I don't know. OK, I admit I haven't put evidence of such digging into the bug report currently, but it has been happening - with custom tools and upstream co-operation. There were a few pastebin links on IRC and a private IRC chat which I will try and summarise for the bug report tomorrow. My understanding, after a day testing this bug, is that we *cannot* ensure that the same input file always gives the same compressed file across all possible permutations. The RFC simply does not require it and the compression tools simply do not support it. It might be nice if they could but there is no real prospect that it will happen 100% of the time. Quite often it will work but that is coincidence and happenstance. To rely on the checksums of compressed files being identical for all operations on the same original input file is simply not supportable by upstream, as I understand it currently. All that the RFC requires is that an input file can be compressed and the compressed file will *always* result in getting the original input file back.
There is nothing about the state of the compressed file itself. It is merely an intermediary and I think we should think about treating it as such. -- Neil Williams = http://www.linux.codehelp.co.uk/
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, Feb 07, 2012 at 10:04:04PM +, Neil Williams wrote: On Tue, 07 Feb 2012 19:11:16 +0100 Michael Biebl bi...@debian.org wrote: On 07.02.2012 18:07, Joey Hess wrote: Neil Williams wrote: I'd like to ask for some help with a bug which is tripping up my tests with the multiarch-aware dpkg from experimental - #647522 - non-deterministic behaviour of gzip -9n. pristine-tar hat tricks[1] aside, none of gzip, bzip2, xz are required to always produce the same compressed file for a given input file, and I can tell you from experience that there is a wide amount of variation. If multiarch requires this, then its design is at worst broken, and at best, there will be a lot of coordination pain every time there is a new/different version of any of these that happens to compress slightly differently. Exactly. I'm not convinced that this is fixable at the gzip level, nor is it likely to be fixable by the trauma of changing from gzip to something else. That would be pointless. What matters, to me, is that package installations do not fail somewhere down the dependency chain in ways which are difficult to fix. Compression is used to save space, not to provide unique identification of file contents. As it is now clear that the compression is getting in the way of dealing with files which are (in terms of their actual *usable* content) identical, then the compression needs to be taken out of the comparison operation. Where the checksum matches that's all well and good (problems with md5sum collisions aside), where it does not match then dpkg cannot deem that the files conflict without creating a checksum based on the decompressed content of the two files. [...] But it's worse than this: even if dpkg decompresses before comparing, debsums won't (and mustn't, for backward compatibility). So it's potentially necessary to fix up the md5sums file for a package installed for multiple architectures, if it contains a file that was compressed differently. Ben. 
-- Ben Hutchings We get into the habit of living before acquiring the habit of thinking. - Albert Camus Archive: http://lists.debian.org/20120207224923.gc12...@decadent.org.uk
Re: Please test gzip -9n - related to dpkg with multiarch support
Neil Williams codeh...@debian.org writes: Maybe the way to solve this properly is to remove compression from the uniqueness check - compare the contents of the file in memory after decompression. Yes, it will take longer but it is only needed when the md5sum (which already exists) doesn't match. Another possible solution is to just give any package an implicit Replaces (possibly constrained to /usr/share/doc) on any other package with the same name and version and a different architecture. This isn't as defensive, in that it doesn't catch legitimate bugs where someone has made a mistake and the packages contain different contents, but it also solves the binNMU issue (well, "solves"; the changelog will randomly swap back and forth between the packages, but I'm having a hard time being convinced this is a huge problem). -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ Archive: http://lists.debian.org/87fwemz229@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, 07 Feb 2012, Ben Hutchings wrote: But it's worse than this: even if dpkg decompresses before comparing, debsums won't (and mustn't, for backward compatibility). So it's Maybe you can switch to sha256 and add the new functionality while at it? Detect which mode (md5sum raw, sha256 uncompress) by the size of the hash. Old debsums won't work with the new files, but is that really a problem? That's what stable updates and backports are for... -- One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot Henrique Holschuh Archive: http://lists.debian.org/20120207234212.gb13...@khazad-dum.debian.net
Re: Please test gzip -9n - related to dpkg with multiarch support
Henrique de Moraes Holschuh wrote: Maybe you can switch to sha256 and add the new functionality while at it? Detect which mode (md5sum raw, sha256 uncompress) by the size of the hash. Old debsums won't work with the new files, but is that really a problem? That's what stable updates and backports are for... No, I do not believe random Debian-internal compatibility breaks are what stable updates are for. Archive: http://lists.debian.org/20120207234446.GB5928@burratino
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, 07 Feb 2012, Jonathan Nieder wrote: Henrique de Moraes Holschuh wrote: Maybe you can switch to sha256 and add the new functionality while at it? Detect which mode (md5sum raw, sha256 uncompress) by the size of the hash. Old debsums won't work with the new files, but is that really a problem? That's what stable updates and backports are for... No, I do not believe random Debian-internal compatibility breaks are what stable updates are for. If dpkg or debsums in stable are going to get in the way, IMO it is. But let's not go down that discussion in this thread. -- One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot Henrique Holschuh Archive: http://lists.debian.org/20120207234903.gc13...@khazad-dum.debian.net
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, Feb 07, 2012 at 10:13:01PM +0000, Neil Williams wrote: On Tue, 7 Feb 2012 14:01:57 -0800 Steve Langasek vor...@debian.org wrote: At this stage, I have no reason to think that's not achievable, though no one seems to have dived very deep into the bug yet. And whether gzip upstream agrees this is a reasonable invariant to uphold, I don't know. My understanding, after a day testing this bug, is that we *cannot* ensure that the same input file always gives the same compressed file across all possible permutations. The RFC simply does not require it and the compression tools simply do not support it. It might be nice if they could but there is no real prospect that it will happen 100% of the time. Quite often it will work but that is coincidence and happenstance. To rely on the checksums of compressed files being identical for all operations on the same original input file is simply not supportable by upstream, as I understand it currently. The RFC doesn't require it, but as far as I can see gzip doesn't use randomness, time, uninitialised memory or anything else which might cause it to end up with a different compression result in rare cases. Understanding why this happens should be the prerequisite for deciding what to do about this issue. If it turns out not to be reasonable to expect the compression results to be identical, we should probably look into using dpkg --path-exclude= with /usr/share/{doc,man,info}/* when installing foreign-architecture packages. Very few Multi-Arch: same packages need to install identical compressed files outside these directories. In case it happens, the package needs to use multiarch paths or split the files into a -common package. The ugliness of this solution is that the specialness of /usr/share/doc and the others needs to be embedded into the package system somewhere. Riku Archive: http://lists.debian.org/20120208005252.ga2...@afflict.kos.to
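Riku's --path-exclude idea can also be written as a dpkg configuration fragment (a sketch; the fragment's file name is invented). Note the ugliness he mentions: dpkg applies this to every package it unpacks from then on, not only to foreign-architecture ones, and policy still wants each package's copyright file installed, hence the path-include line.

```
# Sketch: a dpkg configuration fragment such as
# /etc/dpkg/dpkg.cfg.d/exclude-shareable (file name invented).
# Applies to all subsequent package installations, not just foreign-arch.
path-exclude=/usr/share/doc/*
path-include=/usr/share/doc/*/copyright
path-exclude=/usr/share/man/*
path-exclude=/usr/share/info/*
```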
Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 8 Feb 2012, Riku Voipio riku.voi...@iki.fi wrote: If it turns out not reasonable to expect the compression results to be identical It was reported that sometimes the size differs. Surely, if nothing else, having gzip sometimes produce an unnecessarily large file is a bug! Expecting the compression to give the smallest file every time is reasonable. -- My Main Blog http://etbe.coker.com.au/ My Documents Blog http://doc.coker.com.au/ Archive: http://lists.debian.org/201202081205.44427.russ...@coker.com.au
Re: Please test gzip -9n - related to dpkg with multiarch support
Riku Voipio riku.voi...@iki.fi writes: If it turns out not reasonable to expect the compression results to be identical, we should probably look into using dpkg --path-exclude= with /usr/share/{doc,man,info}/* when installing foreign-architecture packages. I believe the only packages that pose a problem are those marked Multi-Arch: same, allowing multiple architectures of the same package to be installed, and those packages are almost all shared libraries. (Among other things, nothing with a non-arch-qualified binary in bin or sbin directories can be Multi-Arch: same anyway, which rules out most non-shared-library packages other than things like cross-compilers.) Libraries are already not allowed to ship files in /usr/share/man and /usr/share/info unless they change with every SONAME bump, or we can't have coinstallability of multiple SONAMEs of a package. I think the problem is mostly limited to /usr/share/doc; the remaining corner cases look pretty rare. -- Russ Allbery (r...@debian.org) http://www.eyrie.org/~eagle/ Archive: http://lists.debian.org/87sjimxgub@windlord.stanford.edu
Re: Please test gzip -9n - related to dpkg with multiarch support
On Tue, Feb 07, 2012 at 10:49:23PM +0000, Ben Hutchings wrote: But it's worse than this: even if dpkg decompresses before comparing, debsums won't (and mustn't, for backward compatibility). So it's potentially necessary to fix up the md5sums file for a package installed for multiple architectures, if it contains a file that was compressed differently. I'm uncomfortable with the idea of checking checksums only for uncompressed data. Compressed files have headers, and at least for some formats, it seems those headers can contain essentially arbitrary data. This allows compressed files to be modified in rather significant ways, without debsums noticing, if debsums uncompresses before comparing. Further, decompressors have the potential for security problems. See https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2009-2624 for example. In other words: debsums needs to decompress to verify that no files have been tampered with, but doing so can invoke an attack. Such an attack may be unlikely, but it would seem to be a better design not to open up the possibility for it. -- http://www.kickstarter.com/projects/docstory/mix-1-2-albanian
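The header concern can be illustrated with gzip's stored timestamp (a contrived example; the FNAME and FEXTRA header fields can likewise carry longer, near-arbitrary strings): two .gz files whose decompressed content is identical but whose bytes differ, so a checksum taken only over the uncompressed data cannot tell them apart.

```shell
# Contrived illustration: identical payload, different gzip header bytes.
set -e
printf 'payload\n' > doc.txt
touch -d '2001-01-01 00:00:00 UTC' doc.txt
gzip -9c doc.txt > a.gz    # without -n, the header records doc.txt's mtime
touch -d '2002-02-02 00:00:00 UTC' doc.txt
gzip -9c doc.txt > b.gz    # same payload, different header timestamp
cmp -s a.gz b.gz || echo "compressed files differ"
[ "$(gzip -dc a.gz | md5sum)" = "$(gzip -dc b.gz | md5sum)" ] \
    && echo "uncompressed checksums agree"
```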