Bug#944965: debsums: Script accesses internal dpkg database
Control: clone -1 -2
Control: reassign -2 dpkg
Control: retitle -2 dpkg: Please provide a command-line option to show md5sums for multiple or all packages with one call
Control: submitter -2 !
Control: block -1 by -2

Hi Guillem,

while trying to check what's left to implement this, I was reminded of
this source code comment, which IMHO currently justifies the direct
database usage very well (as you seem to have noticed, too):

    # Calling dpkg-query --control-path for every package is too slow,
    # so we cheat a little bit.

Guillem Jover wrote:
> The debsums program should be switched to use something like:
>
>   «dpkg-query --control-show $pkg md5sums»
>
> to get the md5sums file contents. [...] While this is not ideal,
> because this interface does not allow batching, at least it will
> stop accessing the internal database. I will be adding in the near
> future a new virtual field to dpkg-query to be able to fetch all
> md5sums for all packages with something like:
>
>   «dpkg-query \
>      --showformat 'Package: ${Package}\nMd5sums: ${db-fsys:Md5sums}\n' \
>      --show»

So please do so. Cloning this bug as a reminder and blocker.

For now I will upload debsums with just debsums_init removed, so at
least a bit of this issue is resolved that way.

> The other question though, is whether it still makes sense to ship
> debsums, with «dpkg --audit» checking for missing md5sums files,
> «dpkg --verify» checking for hash mismatches, and «dpkg --unpack»
> generating these when the package to be installed does not provide one?

From my point of view, the different (IMHO way more comfortable)
command-line user interface and the more readable output alone still
justify its existence. (I really dislike that rpm format. :-)

I also haven't found a quick and easy way to show just conffiles or
just non-conffiles with dpkg --verify at a glance. Even grepping out
only the changed ones is not that trivial.

Regards, Axel
-- 
 ,''`.  |  Axel Beckert, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
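
To illustrate the point about conffiles and «dpkg --verify»: the default
(rpm-style) output puts a 9-character result field first, then a space,
an attribute character ('c' for conffiles), another space, and the
pathname. Assuming that documented layout, a small filter can split the
report into conffiles and everything else. This is only a sketch; the
function name split_verify_output is hypothetical and is not part of
debsums or dpkg.

```python
import subprocess


def split_verify_output(text):
    """Split rpm-style `dpkg --verify` output into (conffiles, others).

    Each entry is a (result, path) tuple. The fixed-width slicing assumes
    the layout described in dpkg(1): 9 result characters, a space, one
    attribute character, a space, then the pathname.
    """
    conffiles, others = [], []
    for line in text.splitlines():
        if len(line) < 13:
            continue  # too short to hold result, attribute and a path
        result, attr, path = line[:9], line[10], line[12:]
        if attr == "c":
            conffiles.append((result, path))
        else:
            others.append((result, path))
    return conffiles, others


def verify_report(packages=()):
    """Run `dpkg --verify` and return the split report (needs dpkg installed)."""
    proc = subprocess.run(
        ["dpkg", "--verify", *packages],
        capture_output=True, text=True, check=False,
    )
    return split_verify_output(proc.stdout)
```

Calling verify_report() with no arguments checks all installed packages;
the first element of the returned pair then lists only the conffiles
that failed a check.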
Bug#944965: [Piuparts-devel] Bug#944965: debsums: Script accesses internal dpkg database
On 22/05/2020 03.10, Axel Beckert wrote:
> I do see one problem with this, though: IIRC, the piuparts guys
> sometimes use debsums as a backport on stable.
>
> Cc'ing them to ask if this is still the case and if they see any
> issues if a debsums upload in the not-so-far future would depend on a
> very recent dpkg version. (IIRC last time they needed a debsums
> backport, this was because we fixed bugs they wanted to be fixed in
> their setups as soon as they were found and fixed.)

That was for the introduction of the --ignore-obsolete option, which
allowed us to get rid of many false positives (and finally make debsums
errors fail the piuparts test). We don't need any newer debsums than
that.

Andreas
Bug#944965: debsums: Script accesses internal dpkg database
Control: tag -1 - moreinfo + confirmed

Hi Guillem,

Guillem Jover wrote:
> > Guillem Jover wrote:
> > > This package contains the «debsums» program, which directly accesses
> > > the dpkg internal database, instead of using one of the public
> > > interfaces provided by dpkg.
> >
> > JFTR: This is not true. I didn't find a single place in the debsums
> > script where $admindir is accessed directly. Instead it is always
> > passed to a dpkg, dpkg-query or dpkg-divert call as you asked for.
>
> Well I see in debsums the md5sums_path() function which does access
> it.

Granted. I just looked for $admindir, but not $DPKG. My fault. Thanks
for pointing out this detail.

> Ideally debsums would only pass the admindir if it has been specified.
> And then it would also only use --root on dpkg commands if that's what
> has been passed to it, which would imply no need for a hard-coded
> dpkg database pathname.

Being able to only use --root would be very preferable indeed.

> Of course one problem is that dpkg-query does not have a --root
> option! But I think I have a branch somewhere implementing that, so
> I'll add this to 1.20.1. :)

Much appreciated!

I do see one problem with this, though: IIRC, the piuparts guys
sometimes use debsums as a backport on stable.

Cc'ing them to ask if this is still the case and if they see any issues
if a debsums upload in the not-so-far future would depend on a very
recent dpkg version. (IIRC last time they needed a debsums backport,
this was because we fixed bugs they wanted to be fixed in their setups
as soon as they were found and fixed.)

> > Leaves the build-time configuration of the admindir: How can I query
> > dpkg for the build-time location of its admindir?
> >
> > And how can I determine the admindir of a chroot with a call to an
> > external dpkg binary outside the chroot, which, as I understand you,
> > can have a different admindir.
>
> So with the above, the idea would be that you do not need to.

Yep. --root for the win! :-D That's definitely the way to go.

> > I just tried "dpkg-query --control-show sendfile md5sums" in a minimal
> > pbuilder chroot where I just installed sendfile to see what that error
> > looks like.
> >
> > To my surprise, although sendfile_2.1b.20080616-6_amd64.deb does not
> > contain a file with md5sums, "dpkg-query --control-show sendfile
> > md5sums" works and /var/lib/dpkg/info/sendfile.md5sums exists.
> >
> > So it seems as if dpkg now automatically generates md5sums files if
> > not present. Just checked dpkg's changelog and this feature seems to
> > exist since 2012.
> >
> > Which means that debsums_init has actually been obsolete since 2012.
> >
> > So I will happily remove debsums_init with the next upload. :-)
>
> Yes, thanks. :)
>
> > > If the file is missing an error will be returned.
> >
> > So how can this file be missing if dpkg generates them?
>
> That would be the case if a package had been installed with an ancient
> dpkg version and then never upgraded.

sendfile actually was a good candidate (the last maintainer upload
before 2020 was in 2011), but it got adopted recently and also had some
NMUs between 2012 and 2014. Of course this wouldn't have made a
difference in a freshly unpacked pbuilder chroot. :-)

> But as mentioned above «dpkg -C» will complain, so I'd leave it to
> the user to handle TBH. Also because generating the md5sums from the
> installed files is a bit misleading as if they have changed then
> they will be "bogus", I mean I guess this is better than nothing,
> but not ideal.

Full ack. debsums_init did the very same for the very same reason, just
about 5 years before dpkg did. (According to the changelog it was
introduced in 2007.)

This also explains very nicely why the corresponding lintian warning is
still important.

Regards, Axel
-- 
 ,''`.  |  Axel Beckert, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
Bug#944965: debsums: Script accesses internal dpkg database
Hi!

On Fri, 2020-05-22 at 01:06:51 +0200, Axel Beckert wrote:
> Control: tag -1 moreinfo

> Guillem Jover wrote:
> > This package contains the «debsums» program, which directly accesses
> > the dpkg internal database, instead of using one of the public
> > interfaces provided by dpkg.
>
> JFTR: This is not true. I didn't find a single place in the debsums
> script where $admindir is accessed directly. Instead it is always
> passed to a dpkg, dpkg-query or dpkg-divert call as you asked for.

Well I see in debsums the md5sums_path() function which does access
it. And the debsums_init program does too indeed. :)

> The only script which accesses *.md5sums files, and only to see if they
> exist, is debsums_init, which is meant to be removed anyway once
> https://lintian.debian.org/tags/no-md5sums-control-file.html is down
> to zero, as it actually generates that file. But since there are
> currently over 60 packages on that list, this won't be anytime soon.

As you have noticed (down below), dpkg has been generating missing
md5sums files on installation for some time, so this functionality does
not seem necessary anymore. Also «dpkg -C» will warn about packages
that are still missing such files so that they can be reinstalled.

> > The admindir can also be configured differently at dpkg build or
> > run-time.
>
> Well, that's exactly what we do: We configure dpkg's admindir at
> run-time!

> We only use $admindir and provide it to dpkg as a parameter because
> debsums also supports checking chroots. And since chroots might be of
> a different architecture (or for forensic purposes), we don't want to
> use the dpkg binary inside the chroot, i.e. we need to provide at
> least the location to dpkg. And for that, we need to know it.

My point is that if dpkg has been built with a different admindir
default (not the case in Debian, but perhaps in a derivative system)
then debsums passing that pathname will make dpkg operate on an invalid
database (and with newer versions it will simply consider that a
bootstrapping installation and proceed as if it had 0 packages
installed).

Ideally debsums would only pass the admindir if it has been specified.
And then it would also only use --root on dpkg commands if that's what
has been passed to it, which would imply no need for a hard-coded
dpkg database pathname.

Of course one problem is that dpkg-query does not have a --root
option! But I think I have a branch somewhere implementing that, so
I'll add this to 1.20.1. :)

> Leaves the build-time configuration of the admindir: How can I query
> dpkg for the build-time location of its admindir?
>
> And how can I determine the admindir of a chroot with a call to an
> external dpkg binary outside the chroot, which, as I understand you,
> can have a different admindir.

So with the above, the idea would be that you do not need to.

> > The debsums program should be switched to use something like:
> >
> >   «dpkg-query --control-show $pkg md5sums»
> >
> > to get the md5sums file contents. If the file is missing an error
> > will be returned.
>
> I just tried "dpkg-query --control-show sendfile md5sums" in a minimal
> pbuilder chroot where I just installed sendfile to see what that error
> looks like.
>
> To my surprise, although sendfile_2.1b.20080616-6_amd64.deb does not
> contain a file with md5sums, "dpkg-query --control-show sendfile
> md5sums" works and /var/lib/dpkg/info/sendfile.md5sums exists.
>
> So it seems as if dpkg now automatically generates md5sums files if
> not present. Just checked dpkg's changelog and this feature seems to
> exist since 2012.
>
> Which means that debsums_init has actually been obsolete since 2012.
>
> So I will happily remove debsums_init with the next upload. :-)

Yes, thanks. :)

> > If the file is missing an error will be returned.
>
> So how can this file be missing if dpkg generates them?

That would be the case if a package had been installed with an ancient
dpkg version and then never upgraded. But as mentioned above «dpkg -C»
will complain, so I'd leave it to the user to handle TBH. Also because
generating the md5sums from the installed files is a bit misleading as
if they have changed then they will be "bogus", I mean I guess this is
better than nothing, but not ideal.

Thanks,
Guillem
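
The suggestion above (pass the admindir only when one was explicitly
specified, and otherwise hand dpkg just the root it was given) could be
sketched as follows. This is not debsums code (debsums is written in
Perl); the function name dpkg_query_argv and its parameters are
hypothetical, and the sketch assumes a dpkg-query that accepts --root,
which this thread describes as only planned for dpkg 1.20.1.

```python
def dpkg_query_argv(args, root=None, admindir=None):
    """Build a dpkg-query command line without a hard-coded database path.

    --root and --admindir are added only when the caller explicitly
    supplied them, so dpkg-query otherwise falls back to whatever default
    it was built with. Assumes a dpkg-query version that supports --root.
    """
    argv = ["dpkg-query"]
    if root is not None:
        argv.append(f"--root={root}")
    if admindir is not None:
        argv.append(f"--admindir={admindir}")
    argv.extend(args)
    return argv
```

With neither option given, the result is a plain dpkg-query call, so a
dpkg built with a non-default admindir keeps working; with only root
given, dpkg itself is left to work out where the database lives below
that root.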
Bug#944965: debsums: Script accesses internal dpkg database
Control: tag -1 moreinfo

Dear Guillem,

Guillem Jover wrote:
> This package contains the «debsums» program, which directly accesses
> the dpkg internal database, instead of using one of the public
> interfaces provided by dpkg.

JFTR: This is not true. I didn't find a single place in the debsums
script where $admindir is accessed directly. Instead it is always
passed to a dpkg, dpkg-query or dpkg-divert call as you asked for.

The only script which accesses *.md5sums files, and only to see if they
exist, is debsums_init, which is meant to be removed anyway once
https://lintian.debian.org/tags/no-md5sums-control-file.html is down
to zero, as it actually generates that file. But since there are
currently over 60 packages on that list, this won't be anytime soon.

> The admindir can also be configured differently at dpkg build or
> run-time.

Well, that's exactly what we do: We configure dpkg's admindir at
run-time!

We only use $admindir and provide it to dpkg as a parameter because
debsums also supports checking chroots. And since chroots might be of
a different architecture (or for forensic purposes), we don't want to
use the dpkg binary inside the chroot, i.e. we need to provide at
least the location to dpkg. And for that, we need to know it.

Leaves the build-time configuration of the admindir: How can I query
dpkg for the build-time location of its admindir?

And how can I determine the admindir of a chroot with a call to an
external dpkg binary outside the chroot, which, as I understand you,
can have a different admindir.

> The debsums program should be switched to use something like:
>
>   «dpkg-query --control-show $pkg md5sums»
>
> to get the md5sums file contents. If the file is missing an error
> will be returned.

I just tried "dpkg-query --control-show sendfile md5sums" in a minimal
pbuilder chroot where I just installed sendfile to see what that error
looks like.

To my surprise, although sendfile_2.1b.20080616-6_amd64.deb does not
contain a file with md5sums, "dpkg-query --control-show sendfile
md5sums" works and /var/lib/dpkg/info/sendfile.md5sums exists.

So it seems as if dpkg now automatically generates md5sums files if
not present. Just checked dpkg's changelog and this feature seems to
exist since 2012.

Which means that debsums_init has actually been obsolete since 2012.

So I will happily remove debsums_init with the next upload. :-)

> If the file is missing an error will be returned.

So how can this file be missing if dpkg generates them?

Regards, Axel
-- 
 ,''`.  |  Axel Beckert, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
Bug#944965: debsums: Script accesses internal dpkg database
Source: debsums
Source-Version: 2.2.4
Severity: important
User: debian-d...@lists.debian.org
Usertags: dpkg-db-access-blocker

Hi!

This package contains the «debsums» program, which directly accesses
the dpkg internal database, instead of using one of the public
interfaces provided by dpkg.

This is a problem for several reasons, because even though the layout
and format of the dpkg database is administrator friendly, and it is
expected that those might need to mess with it, in case of emergency,
this “interface” does not extend to other programs besides the dpkg
suite of tools. The admindir can also be configured differently at
dpkg build or run-time. And finally, the contents and its format, will
be changing in the near future.

The debsums program should be switched to use something like:

  «dpkg-query --control-show $pkg md5sums»

to get the md5sums file contents. If the file is missing an error
will be returned. While this is not ideal, because this interface does
not allow batching, at least it will stop accessing the internal
database. I will be adding in the near future a new virtual field to
dpkg-query to be able to fetch all md5sums for all packages with
something like:

  «dpkg-query \
     --showformat 'Package: ${Package}\nMd5sums: ${db-fsys:Md5sums}\n' \
     --show»

The other question though, is whether it still makes sense to ship
debsums, with «dpkg --audit» checking for missing md5sums files,
«dpkg --verify» checking for hash mismatches, and «dpkg --unpack»
generating these when the package to be installed does not provide one?

Thanks,
Guillem
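
As a rough illustration of the per-package interface proposed in this
report, the following Python sketch fetches a package's md5sums through
«dpkg-query --control-show» rather than opening files under the
admindir. It is not how debsums is implemented (debsums is Perl); the
function names are hypothetical, and the parser assumes the usual
md5sum layout of a digest, two spaces, and a path.

```python
import subprocess


def parse_md5sums(text):
    """Map each path to its MD5 digest from md5sums-style text.

    Assumes one "<digest>  <path>" entry per line (two spaces between).
    """
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        digest, sep, path = line.partition("  ")
        if not sep:
            raise ValueError(f"unexpected md5sums line: {line!r}")
        sums[path] = digest
    return sums


def package_md5sums(pkg):
    """Return {path: digest} for pkg, or None if dpkg-query reports an error.

    Uses the public interface only; a non-zero exit status (for example
    when the md5sums file is missing) is mapped to None.
    """
    proc = subprocess.run(
        ["dpkg-query", "--control-show", pkg, "md5sums"],
        capture_output=True, text=True, check=False,
    )
    if proc.returncode != 0:
        return None
    return parse_md5sums(proc.stdout)
```

As the report notes, this costs one dpkg-query call per package, so it
trades speed for staying off the internal database until a batch
interface exists.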