Bug#944965: debsums: Script accesses internal dpkg database

2020-05-22 Thread Axel Beckert
Control: clone -1 -2
Control: reassign -2 dpkg
Control: retitle -2 dpkg: Please provide a command-line option to show md5sums for multiple or all packages with one call
Control: submitter -2 !
Control: block -1 by -2

Hi Guillem,

while checking what's left to do to implement this, I was reminded of
this source code comment, which IMHO currently justifies direct
database usage quite well (as you seem to have noticed, too):

# Calling dpkg-query --control-path for every package is too slow,
# so we cheat a little bit.

Guillem Jover wrote:
> The debsums program should be switched to use something like:
> 
>   «dpkg-query --control-show $pkg md5sums»
> 
> to get the md5sums file contents. [...] While this is not ideal,
> because this interface does not allow batching, at least it will
> stop accessing the internal database. I will be adding in the near
> future a new virtual field to dpkg-query to be able to fetch all
> md5sums for all packages with something like:
> 
>   «dpkg-query \
> --showformat 'Package: ${Package}\nMd5sums: ${db-fsys:Md5sums}\n' \
> --show»

So please do so. Cloning this bug as a reminder and blocker.

Will for now upload debsums with just debsums_init removed, so at
least a bit of this issue is resolved that way.

> The other question though, is whether it still makes sense to ship
> debsums, with «dpkg --audit» checking for missing md5sums files,
> «dpkg --verify» checking for hash mismatches, and «dpkg --unpack»
> generating these when the package to be installed does not provide one?

From my point of view, the different (IMHO way more comfortable)
command-line user interface and the more readable output alone still
justify its existence. (I really dislike that rpm format. :-)

I also haven't found a quick and easy way to show just the conffiles or
just the non-conffiles with dpkg --verify at a glance. Even grepping
out only the changed ones is not that trivial.
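
For illustration, a minimal Python sketch of the kind of filtering meant
here. It is not part of debsums or dpkg. It assumes the rpm-style line
layout that dpkg(1) documents for --verify: nine result characters, a
space, one attribute character ('c' for a conffile, otherwise a space),
a space, then the pathname. The function names are made up for this
example.

```python
import subprocess


def parse_verify_line(line):
    """Split one rpm-style `dpkg --verify` line into (result, is_conffile, path).

    Assumed layout: 9 result characters, a space, one attribute character
    ('c' for a conffile, otherwise a space), a space, then the pathname.
    """
    result = line[:9]
    is_conffile = line[10:11] == "c"
    path = line[12:]
    return result, is_conffile, path


def split_by_conffile(output):
    """Return (conffiles, other_files) as two lists of pathnames."""
    conffiles, others = [], []
    for line in output.splitlines():
        if len(line) < 13:
            continue  # too short to match the assumed layout
        _result, is_conffile, path = parse_verify_line(line)
        (conffiles if is_conffile else others).append(path)
    return conffiles, others


def verify_all():
    """Run `dpkg --verify` for all packages and split the reported paths."""
    proc = subprocess.run(["dpkg", "--verify"], capture_output=True,
                          text=True, check=False)
    return split_by_conffile(proc.stdout)
```

Keeping the parsing separate from the dpkg call means the split can be
checked against sample lines without a dpkg installation at hand.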

Regards, Axel
-- 
 ,''`.  |  Axel Beckert , https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-|  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE



Bug#944965: [Piuparts-devel] Bug#944965: debsums: Script accesses internal dpkg database

2020-05-22 Thread Andreas Beckmann
On 22/05/2020 03.10, Axel Beckert wrote:
> I do see one problem with this, though: IIRC, the piuparts folks
> sometimes use a debsums backport on stable.
> 
> Cc'ing them to ask whether this is still the case and whether they see
> any issues if a debsums upload in the not-so-far future were to depend
> on a very recent dpkg version. (IIRC the last time they needed a
> debsums backport, it was because they wanted bug fixes in their setups
> as soon as the bugs were found and fixed.)

That was for the introduction of the --ignore-obsolete option which
allowed us to get rid of many false positives (and finally make debsums
errors fail the piuparts test). We don't need any newer debsums than that.

Andreas



Bug#944965: debsums: Script accesses internal dpkg database

2020-05-21 Thread Axel Beckert
Control: tag -1 - moreinfo + confirmed

Hi Guillem,

Guillem Jover wrote:
> > Guillem Jover wrote:
> > > This package contains the «debsums» program, which directly accesses
> > > the dpkg internal database, instead of using one of the public
> > > interfaces provided by dpkg.
> > 
> > JFTR: This is not true. I didn't find a single place in the debsums
> > script where $admindir is accessed directly. Instead it is always
> > passed to a dpkg, dpkg-query or dpkg-divert call as you asked for.
> 
> Well I see in debsums the md5sums_path() function which does access
> it.

Granted. I just looked for $admindir, but not $DPKG. My fault. Thanks
for pointing out this detail.

> Ideally debsums would only pass the admindir if it has been specified.
> And then it would also only use --root on dpkg commands if that's what
> has been passed to it, which would imply no need for a hard-coded
> dpkg database pathname.

Being able to use only --root would indeed be preferable.

> Of course one problem is that dpkg-query does not have a --root
> option! But I think I have a branch somewhere implementing that, so
> I'll add this to 1.20.1. :)

Much appreciated!

I do see one problem with this, though: IIRC, the piuparts folks
sometimes use a debsums backport on stable.

Cc'ing them to ask whether this is still the case and whether they see
any issues if a debsums upload in the not-so-far future were to depend
on a very recent dpkg version. (IIRC the last time they needed a
debsums backport, it was because they wanted bug fixes in their setups
as soon as the bugs were found and fixed.)

> > Leaves the build-time configuration of the admindir: How can I query
> > dpkg for the build-time location of its admindir?
> > 
> > And how can I determine the admindir of a chroot with a call to an
> > external dpkg binary outside the chroot, which, as I understand you,
> > can have a different admindir.
> 
> So with the above, the idea would be that you do not need to.

Yep. --root for the win! :-D

That's definitely the way to go.

> > I just tried "dpkg-query --control-show sendfile md5sums" in a minimal
> > pbuilder chroot where I just installed sendfile to see what that error
> > looks like.
> > 
> > To my surprise, even though sendfile_2.1b.20080616-6_amd64.deb does not
> > contain a file with md5sums, "dpkg-query --control-show sendfile
> > md5sums" works and /var/lib/dpkg/info/sendfile.md5sums exists.
> > 
> > So it seems as if dpkg now automatically generates md5sums files if
> > not present. Just checked dpkg's changelog and this feature seems to
> > exist since 2012.
> > 
> > Which means that debsums_init is actually obsolete since 2012.
> > 
> > So I will happily remove debsums_init with the next upload. :-)
> 
> Yes, thanks. :)
> 
> > > If the file is missing an error will be returned.
> > 
> > So how can this file be missing if dpkg generates them?
> 
> That would be the case if a package had been installed with an ancient
> dpkg version and then never upgraded.

sendfile actually was a good candidate (the last maintainer upload
before 2020 was in 2011), but it got adopted recently and also had
some NMUs between 2012 and 2014. Of course this wouldn't have made a
difference in a just unpacked pbuilder chroot. :-)

> But as mentioned above «dpkg -C» will complain, so I'd leave it to
> the user to handle TBH. Also, generating the md5sums from the
> installed files is a bit misleading: if the files have changed, the
> generated sums will be "bogus". I guess this is better than nothing,
> but not ideal.

Full ack. debsums_init did the very same for the very same reason,
just about 5 years before dpkg did. (According to the changelog it was
introduced in 2007.) This also explains very nicely why the
corresponding lintian warning is still important.

Regards, Axel



Bug#944965: debsums: Script accesses internal dpkg database

2020-05-21 Thread Guillem Jover
Hi!

On Fri, 2020-05-22 at 01:06:51 +0200, Axel Beckert wrote:
> Control: tag -1 moreinfo

> Guillem Jover wrote:
> > This package contains the «debsums» program, which directly accesses
> > the dpkg internal database, instead of using one of the public
> > interfaces provided by dpkg.
> 
> JFTR: This is not true. I didn't find a single place in the debsums
> script where $admindir is accessed directly. Instead it is always
> passed to a dpkg, dpkg-query or dpkg-divert call as you asked for.

Well I see in debsums the md5sums_path() function which does access
it. And the debsums_init program does too indeed. :)

> The only script that accesses *.md5sums files, and only to see whether
> they exist, is debsums_init, which is meant to be removed anyway once
> https://lintian.debian.org/tags/no-md5sums-control-file.html is down
> to zero, as debsums_init actually generates that file. But since there
> are currently over 60 packages on that list, this won't be anytime soon.

As you have noticed (down below), dpkg has been generating missing
md5sums files on installation for some time, so this functionality does
not seem necessary anymore. Also «dpkg -C» will warn about packages that
are still missing such files so that they can be reinstalled.

> > The admindir can also be configured differently at dpkg build or
> > run-time.
> 
> Well, that's exactly what we do: We configure dpkg's admindir at
> run-time!

> We only use $admindir and provide it to dpkg as a parameter because
> debsums also supports checking chroots. And since chroots might be of
> a different architecture (or examined for forensic purposes), we don't
> want to use the dpkg binary inside the chroot, i.e. we need to provide
> at least the location to dpkg. And for that, we need to know it.

My point is that if dpkg has been built with a different admindir
default (not the case in Debian, but perhaps a derivative system)
then debsums passing that pathname will make dpkg operate on an
invalid database (and with newer versions it will simply consider
that a bootstrapping installation and proceed as if it had 0 packages
installed).

Ideally debsums would only pass the admindir if it has been specified.
And then it would also only use --root on dpkg commands if that's what
has been passed to it, which would imply no need for a hard-coded
dpkg database pathname.

Of course one problem is that dpkg-query does not have a --root
option! But I think I have a branch somewhere implementing that, so
I'll add this to 1.20.1. :)
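
A rough Python sketch of the "only pass what was specified" idea from
the paragraphs above. It is only an illustration, not the actual
debsums code (which is Perl). It assumes a dpkg-query that accepts
--root, which is described here as planned rather than available; the
function name and the example paths are hypothetical.

```python
def dpkg_query_command(query_args, admindir=None, root=None):
    """Build a dpkg-query argument list.

    Location options are added only when the caller specified them, so
    dpkg-query falls back to its own built-in defaults otherwise and no
    database pathname has to be hard-coded in the calling program.
    """
    cmd = ["dpkg-query"]
    if root is not None:
        # Assumption: the dpkg-query in use accepts --root.
        cmd.append(f"--root={root}")
    if admindir is not None:
        cmd.append(f"--admindir={admindir}")
    cmd.extend(query_args)
    return cmd


# No location given: nothing about the database path is passed along.
default_cmd = dpkg_query_command(["--control-show", "sendfile", "md5sums"])

# Hypothetical chroot path, passed through only because it was requested.
chroot_cmd = dpkg_query_command(["--show"], root="/srv/chroot")
```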

> Leaves the build-time configuration of the admindir: How can I query
> dpkg for the build-time location of its admindir?
> 
> And how can I determine the admindir of a chroot with a call to an
> external dpkg binary outside the chroot, which, as I understand you,
> can have a different admindir.

So with the above, the idea would be that you do not need to.

> > The debsums program should be switched to use something like:
> > 
> >   «dpkg-query --control-show $pkg md5sums»
> >
> > to get the md5sums file contents. If the file is missing an error
> > will be returned.
> 
> I just tried "dpkg-query --control-show sendfile md5sums" in a minimal
> pbuilder chroot where I just installed sendfile to see what that error
> looks like.
> 
> To my surprise, even though sendfile_2.1b.20080616-6_amd64.deb does not
> contain a file with md5sums, "dpkg-query --control-show sendfile
> md5sums" works and /var/lib/dpkg/info/sendfile.md5sums exists.
> 
> So it seems as if dpkg now automatically generates md5sums files if
> not present. Just checked dpkg's changelog and this feature seems to
> exist since 2012.
> 
> Which means that debsums_init is actually obsolete since 2012.
> 
> So I will happily remove debsums_init with the next upload. :-)

Yes, thanks. :)

> > If the file is missing an error will be returned.
> 
> So how can this file be missing if dpkg generates them?

That would be the case if a package had been installed with an ancient
dpkg version and then never upgraded. But as mentioned above «dpkg -C»
will complain, so I'd leave it to the user to handle TBH. Also,
generating the md5sums from the installed files is a bit misleading:
if the files have changed, the generated sums will be "bogus". I guess
this is better than nothing, but not ideal.
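
To make the "bogus" point concrete, here is a small self-contained
Python sketch. It does not touch dpkg at all: it uses a temporary file
as a stand-in for an installed file, and all names in it are invented
for the example.

```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the hex MD5 digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.md5(fh.read()).hexdigest()


def demo():
    """Show that sums generated after a change cannot reveal that change."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(b"as shipped in the package\n")
        shipped_sum = md5_of(path)    # what a shipped md5sums file would hold

        with open(path, "wb") as fh:
            fh.write(b"modified after installation\n")
        generated_sum = md5_of(path)  # what generating the sums now records

        detects_with_shipped = md5_of(path) != shipped_sum
        detects_with_generated = md5_of(path) != generated_sum
        return detects_with_shipped, detects_with_generated
    finally:
        os.remove(path)
```

A checksum recorded from the shipped content flags the modification,
while one generated from the already modified file matches it and so
verifies cleanly.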

Thanks,
Guillem



Bug#944965: debsums: Script accesses internal dpkg database

2020-05-21 Thread Axel Beckert
Control: tag -1 moreinfo

Dear Guillem,

Guillem Jover wrote:
> This package contains the «debsums» program, which directly accesses
> the dpkg internal database, instead of using one of the public
> interfaces provided by dpkg.

JFTR: This is not true. I didn't find a single place in the debsums
script where $admindir is accessed directly. Instead it is always
passed to a dpkg, dpkg-query or dpkg-divert call as you asked for.

The only script that accesses *.md5sums files, and only to see whether
they exist, is debsums_init, which is meant to be removed anyway once
https://lintian.debian.org/tags/no-md5sums-control-file.html is down
to zero, as debsums_init actually generates that file. But since there
are currently over 60 packages on that list, this won't be anytime soon.

> The admindir can also be configured differently at dpkg build or
> run-time.

Well, that's exactly what we do: We configure dpkg's admindir at
run-time!

We only use $admindir and provide it to dpkg as a parameter because
debsums also supports checking chroots. And since chroots might be of
a different architecture (or examined for forensic purposes), we don't
want to use the dpkg binary inside the chroot, i.e. we need to provide
at least the location to dpkg. And for that, we need to know it.

Leaves the build-time configuration of the admindir: How can I query
dpkg for the build-time location of its admindir?

And how can I determine the admindir of a chroot with a call to an
external dpkg binary outside the chroot, which, as I understand you,
can have a different admindir.

> The debsums program should be switched to use something like:
> 
>   «dpkg-query --control-show $pkg md5sums»
>
> to get the md5sums file contents. If the file is missing an error
> will be returned.

I just tried "dpkg-query --control-show sendfile md5sums" in a minimal
pbuilder chroot where I just installed sendfile to see what that error
looks like.

To my surprise, even though sendfile_2.1b.20080616-6_amd64.deb does not
contain a file with md5sums, "dpkg-query --control-show sendfile
md5sums" works and /var/lib/dpkg/info/sendfile.md5sums exists.

So it seems as if dpkg now automatically generates md5sums files if
not present. Just checked dpkg's changelog and this feature seems to
exist since 2012.

Which means that debsums_init is actually obsolete since 2012.

So I will happily remove debsums_init with the next upload. :-)

> If the file is missing an error will be returned.

So how can this file be missing if dpkg generates them?

Regards, Axel



Bug#944965: debsums: Script accesses internal dpkg database

2019-11-17 Thread Guillem Jover
Source: debsums
Source-Version: 2.2.4
Severity: important
User: debian-d...@lists.debian.org
Usertags: dpkg-db-access-blocker

Hi!

This package contains the «debsums» program, which directly accesses
the dpkg internal database, instead of using one of the public
interfaces provided by dpkg.

The debsums program should be switched to use something like:

  «dpkg-query --control-show $pkg md5sums»

to get the md5sums file contents. If the file is missing an error will
be returned. While this is not ideal, because this interface does not
allow batching, at least it will stop accessing the internal database.
I will be adding in the near future a new virtual field to dpkg-query
to be able to fetch all md5sums for all packages with something like:

  «dpkg-query \
--showformat 'Package: ${Package}\nMd5sums: ${db-fsys:Md5sums}\n' \
--show»
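
A minimal Python sketch of the per-package approach, purely as an
illustration (debsums itself is written in Perl). It assumes the usual
md5sums control file layout of one entry per line: the hex digest, two
spaces, then the pathname. The function names are hypothetical, and a
non-zero exit status from dpkg-query is simply mapped to None.

```python
import subprocess


def parse_md5sums(text):
    """Parse md5sums control file content into {pathname: md5_hex}.

    Assumed line layout: hex digest, two spaces, then the pathname.
    """
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first double space only, so pathnames that
        # themselves contain spaces are kept intact.
        digest, _, path = line.partition("  ")
        sums[path] = digest
    return sums


def package_md5sums(pkg):
    """Fetch and parse the md5sums of one installed package.

    Returns None when dpkg-query exits non-zero, for example because
    the package has no md5sums control file.
    """
    proc = subprocess.run(
        ["dpkg-query", "--control-show", pkg, "md5sums"],
        capture_output=True, text=True, check=False,
    )
    if proc.returncode != 0:
        return None
    return parse_md5sums(proc.stdout)
```

Calling package_md5sums() in a loop costs one dpkg-query process per
package, which is the batching limitation mentioned above.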

The other question though, is whether it still makes sense to ship
debsums, with «dpkg --audit» checking for missing md5sums files,
«dpkg --verify» checking for hash mismatches, and «dpkg --unpack»
generating these when the package to be installed does not provide one?


This is a problem for several reasons. Even though the layout and format
of the dpkg database are administrator friendly, and administrators are
expected to need to mess with it in case of emergency, this “interface”
does not extend to other programs besides the dpkg suite of tools.
The admindir can also be configured differently at dpkg build or
run-time. And finally, the contents and their format will be changing in
the near future.

Thanks,
Guillem