Hi!

On Wed, 2019-05-08 at 19:38:26 +0200, Adam Borowski wrote:
> First, the 0.939 format, as described in "man deb-old". While still being
> accepted by dpkg, it had been superseded before even the very first stable
> release. Why? It has at least two upsides over 2.0:

I'll try to untangle the discussion and address this first. Some of what
I'm going to write has already been written in the thread, but I'm going
to condense it, give it some additional context, and lay out the
direction I'd like to go in.

To recap, format 0.93x has multiple problems:

 - Cannot be handled with stock tools.
 - Not easily extensible.
 - Bad data alignment.
 - Bad compression support.
 - Bad tool coverage (see below).

I don't think it's correct that most tools support that format. From the
list of programs I've tracked that handle .deb directly, I'd even say
almost none do <https://wiki.debian.org/Teams/Dpkg/DebSupport>. This list
does not include the many projects/programs outside Debian that handle
.deb archives directly.

The size limit is indeed a problem, and was already known and tracked in
deb(5) and <https://wiki.debian.org/Teams/Dpkg/TimeTravelFixes> (see the
“.deb size limit” item there), and then later discussed in
<https://lists.debian.org/debian-dpkg/2016/05/msg00027.html>. And while
I think the workarounds I listed there are probably still valid in most
cases, if this is affecting people then prioritizing a fix now would be
good.

The crazy idea I came up with at the time was to use a dual-format PAX+ar
container (that would embed the ar(5) header in the first PAX name
entry). This would make old tools at least detect that this is a .deb
package with a higher major version.

  <https://lists.debian.org/debian-dpkg/2016/06/msg00005.html>

But I guess I was never sold on it either, and thinking about it, the
tradeoff does not really look very good. file(1) does not even recognize
it out-of-the-box as a .deb anyway, and we'd just get a nicer error
message from some of the tools handling .debs, but all of them need to be
updated anyway to support any new format. It would also destroy some of
the nice properties of the 2.x format, namely:

 - Not requiring special tools to build/extract.
 - Not relying on a non-widespread format (PAX).

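For illustration, the size limit discussed above follows from the ar(5)
member header in the common format: it is 60 bytes long and stores the
member size as 10 ASCII decimal digits, so a single member cannot exceed
9999999999 bytes (about 9536.74 MiB). The following is only a minimal
Python sketch of that layout, not dpkg code; the helper names
(build_member, build_archive, iter_members) are hypothetical, and it
ignores the BSD and GNU long-name variants.

```python
AR_MAGIC = b"!<arch>\n"          # global archive magic
AR_HEADER_LEN = 60               # 16 + 12 + 6 + 6 + 8 + 10 + 2
MAX_MEMBER_SIZE = 10**10 - 1     # largest value that fits in 10 decimal digits


def build_member(name: bytes, data: bytes) -> bytes:
    """Build one ar member: 60-byte header, data, padding to an even length."""
    if len(name) > 16:
        raise ValueError("name too long for the common ar format")
    if len(data) > MAX_MEMBER_SIZE:
        raise ValueError("member size does not fit in the 10-digit size field")
    # Fields: name, mtime, uid, gid, mode (octal), size (decimal), end magic.
    header = b"%-16s%-12d%-6d%-6d%-8o%-10d`\n" % (
        name, 0, 0, 0, 0o100644, len(data))
    assert len(header) == AR_HEADER_LEN
    padding = b"\n" if len(data) % 2 else b""
    return header + data + padding


def build_archive(members) -> bytes:
    """Concatenate the global magic and all (name, data) members."""
    return AR_MAGIC + b"".join(build_member(n, d) for n, d in members)


def iter_members(blob: bytes):
    """Yield (name, size, data) for each member of an in-memory ar archive."""
    if blob[:len(AR_MAGIC)] != AR_MAGIC:
        raise ValueError("not an ar archive")
    offset = len(AR_MAGIC)
    while offset + AR_HEADER_LEN <= len(blob):
        header = blob[offset:offset + AR_HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header magic")
        name = header[0:16].rstrip(b" ").decode("ascii")
        size = int(header[48:58].decode("ascii").strip())
        offset += AR_HEADER_LEN
        yield name, size, blob[offset:offset + size]
        # Member data is padded to an even offset.
        offset += size + (size % 2)
```

Feeding it a toy archive with the three members of a 2.0 .deb
(debian-binary, control.tar.xz, data.tar.xz) returns them in order; the
payloads here can be dummy bytes, as the sketch never looks inside the
inner tarballs.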
Getting rid of ar(5) would also make the format more portable, as the
ar(5) format does change depending on the Unix system! Even besides the
main common format and its BSD and GNU variants, there are other wildly
different layouts. It would also mean we do not need binutils to analyze
packages when there is no dpkg-deb around.

For the same reason, using PAX would probably be a bad idea, as it's a
format that has unfortunately not really caught on, it takes more space
due to the additional headers, and we do not really need xattrs in the
contents. I went for it for its unlimited-length metadata, but since
dpkg 1.18.24 that should not be an issue, as I implemented GNU large file
metadata support, which means we have pretty much "unlimited" length
metadata, and I'd say its encoding is more widespread than PAX (for
example star supports it).

So I think Andrej is spot on, and we should just switch from ar(5) to
tar(5) as the container, but not to PAX, just the GNU extensions we
already support, which would only be used when necessary. And ignore any
crazy idea of embedding an ar header inside the first member, as that
will just complicate matters and be cruft once we have switched.

So given that we'd need to modify any program handling .debs directly
anyway, I'd go for the most straightforward and simple of the options.

I'll propose an actual diff of deb(5) that I've got here tomorrow, but
otherwise, if there are no great concerns, I'd like to start adding
support for this in dpkg 1.20.x.

Thanks,
Guillem

