Hi Fabian,

On Wed, Feb 21, 2018 at 10:04:42AM +0100, Fabian Groffen wrote:

> To give you an idea; the current tree is getting its Manifests from
> hashgen.c, which you can find in scripts/rsync-generation/hashgen.c.
> The hashverify tool, which I'm currently working on, is basically an
> addition to that file (doing argv[0] detection) to perform the
> verification.  At this time of writing, I have the gpg-verification and
> single file entry verification in place.  I'm still trying to close the
> gap in checking the dirs in particular looking for files that are not
> listed in the Manifest.

I've been following, or rather reading after the fact, the discussion
about GLEP 74, and it seems a lot of thought and security considerations
have gone into it. Are you following it for your implementation?

I see you're using OpenMP. That looks very neat and could certainly make
it scale better than gemato.

BTW: I remember OpenMP being plain missing from clang until recently.
Ah, https://clang-omp.github.io/ says 3.7 and onwards have native
support.

> hashgen currently runs in 30s or so on the tree to generate manifests.
> I hope it can verify in the same amount of time (we're talking about a
> Quad G5 PowerPC machine here with rusty old spinning disks), leaving it
> in a much better position to be used for Prefix, since we tend to have
> slower/older machines around.

I can only believe that number if the portage tree fits into the fs
cache (RAM). Then it would be purely CPU-bound, I guess, and C could
play to its strengths.

On a machine that has just synced the tree using rsync it would also
still be in RAM as a working set (if there was sufficient RAM) and the
same consideration would apply. On my MacBook Air I get 625012k from du
for the tree. So this should be alright for any machine with 1GB+ of RAM
that's not doing anything else at the time.

As soon as the tree doesn't completely fit into RAM, the later files of
the rsync will push the first files out of the cache. So for
verification, every file has to be read from disk again, which makes
the verification I/O-bound and the advantage of a C implementation
mostly irrelevant, doesn't it?

As a quick number: On an ARM SBC of mine with a dual-core 1.2GHz
Cortex-A7, 1GB of RAM and the fs on a microSD card that does 22MB/s
sequential read, I get 400 seconds flat for the first run of gemato
verify and 386 for the second, when the fs cache is already primed. I
do see a bit of I/O wait on the first run, and on the second, gemato
fully hogs the CPU. It would be interesting to compare how much of that
is Python overhead and how much the actual crypto. Unfortunately the gcc
on that board doesn't have OpenMP support (yet).

gemato create takes 667s on the same board and falls back to a single
thread for some reason about a minute in. So I'd certainly take a
22-fold speedup if I could get it. :)
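
The back-of-the-envelope numbers above can be checked with a few lines
of Python. This is only a rough sketch: it assumes du's "k" means KiB
and "MB/s" means 10**6 bytes per second, and it mixes figures from
different machines (the du size is from the MacBook Air, the 30s is
Fabian's hashgen figure from the G5), so it gives orders of magnitude,
not a benchmark.

```python
# Rough arithmetic on the figures quoted in this mail.
# Assumptions: "k" from du is KiB, "MB/s" is 10**6 bytes per second.

TREE_KIB = 625012            # du output for the tree (MacBook Air)
CARD_MB_PER_S = 22           # sequential read speed of the microSD card

GEMATO_VERIFY_COLD_S = 400   # first run, cold fs cache
GEMATO_VERIFY_WARM_S = 386   # second run, fs cache primed
GEMATO_CREATE_S = 667        # gemato create on the ARM board
HASHGEN_S = 30               # hashgen on the Quad G5 (different machine)


def sequential_read_seconds(size_kib, mb_per_s):
    """Best-case time to read size_kib KiB at mb_per_s * 10**6 bytes/s."""
    return size_kib * 1024 / (mb_per_s * 10**6)


# Lower bound for reading the whole tree from the card; many small files
# will read slower than the sequential rate.
read_floor_s = sequential_read_seconds(TREE_KIB, CARD_MB_PER_S)

# Extra time the cold run needed compared to the warm run.
cold_penalty_s = GEMATO_VERIFY_COLD_S - GEMATO_VERIFY_WARM_S

# Ratio between gemato create on the board and hashgen on the G5.
speedup = GEMATO_CREATE_S / HASHGEN_S

print(f"sequential read floor: {read_floor_s:.1f} s")
print(f"cold-cache penalty:    {cold_penalty_s} s")
print(f"create vs. hashgen:    {speedup:.1f}x")
```

If the read floor really is in the same range as a 30s hashing run, a
fast verifier on a cold cache would spend a large share of its time
waiting for the card, which is the point about I/O-boundness above.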

> We're not really looking forward to 15
> minutes of verification as some bugs have been reported to with gemato.

Well, with webrsync I was looking at around three minutes of download
and five minutes of unpacking and local rsync on my ARM SBCs. With
gemato it's now about 30 seconds to three minutes of rsyncing depending
on the amount of changes and six minutes of gemato checking. So
basically I've neither gained nor lost anything but feel much more
efficient by once again not downloading what hasn't changed.

It also shows that I have a high pain threshold in dealing with Gentoo
on small and old machines. Considering how long Gentoo users typically
wait for compiles to finish, I wouldn't have expected a bit of a wait
for gemato or hashverify to be much of an issue.

> Portage used to do 1) checking the digests of ebuilds and 2) checking
> for missing and extra files.  I noticed that at least 1) is no longer
> present, which I find weird.  I need checking this on normal Gentoo,
> (simply edit an ebuild and try to emerge it without updating its
> digest), but I have the suspicion this got disabled because the full
> tree verification should catch this.  Needless to say, that's
> suboptimal, and not very secure IMO.

Ah, now I understand what you were getting at. I just verified: Both
checks are working on Gentoo Linux but not in Prefix on the Mac. It
seems to be caused by "thin-manifests = true" in metadata/layout.conf
of the Prefix tree. Once I set that to false, Prefix portage behaves
the same as Gentoo Linux portage.

Gentoo Linux:

[root@linux:/usr/portage/xfce-base/garcon] sha512sum garcon-0.6.1.ebuild
dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7  garcon-0.6.1.ebuild
[root@linux:/usr/portage/xfce-base/garcon] grep EBUILD.*0.6.1.*SHA512 Manifest
EBUILD garcon-0.6.1.ebuild 1018 SHA512 dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7
[root@linux:/usr/portage/xfce-base/garcon] sed -i -e "s,econf,fooconf," garcon-0.6.1.ebuild
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild compile
 * Digest verification failed:
 * /usr/portage/xfce-base/garcon/garcon-0.6.1.ebuild
 * Reason: Filesize does not match recorded size
 * Got: 1020
 * Expected: 1018

[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild digest
>>> Creating Manifest for /usr/portage/xfce-base/garcon
[root@linux:/usr/portage/xfce-base/garcon] mkdir files
[root@linux:/usr/portage/xfce-base/garcon] echo foo > files/foo
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild prepare
 * garcon-0.6.1.tar.bz2 BLAKE2B SHA512 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
!!! A file is not listed in the Manifest: '/usr/portage/xfce-base/garcon/files/foo'
[root@linux:/usr/portage/xfce-base/garcon] touch garcon-0.7.0.ebuild
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild prepare
 * A file is not listed in the Manifest: '/usr/portage/xfce-base/garcon/garcon-0.7.0.ebuild'
[root@linux:~] grep thin /usr/portage/metadata/layout.conf
# Use thin Manifests for Git
thin-manifests = false
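
The size-then-digest check that produced the "Filesize does not match
recorded size" failure above can be sketched roughly as follows. This is
not portage's or hashverify's actual code, just a minimal illustration;
the function name verify_manifest_entry is made up, and it only handles
entries whose path is relative to the package directory (EBUILD and MISC
style lines), not AUX or DIST.

```python
import hashlib
import os


def verify_manifest_entry(line, directory):
    """Check one Manifest line against the file it describes.

    Hypothetical helper: returns None on success, or a short reason
    string on failure. Assumes the line looks like
    "EBUILD <name> <size> <HASH> <hexdigest> [<HASH> <hexdigest> ...]".
    """
    fields = line.split()
    name, size = fields[1], int(fields[2])
    # The remaining fields alternate: hash name, hex digest.
    digests = dict(zip(fields[3::2], fields[4::2]))

    path = os.path.join(directory, name)
    if not os.path.isfile(path):
        return "file is missing"

    # Cheap check first: a size mismatch needs no hashing at all.
    actual_size = os.path.getsize(path)
    if actual_size != size:
        return f"size mismatch: got {actual_size}, expected {size}"

    with open(path, "rb") as f:
        data = f.read()
    for algo, expected in digests.items():
        # Relies on hashlib knowing the lowercased name (sha512, blake2b).
        actual = hashlib.new(algo.lower(), data).hexdigest()
        if actual != expected.lower():
            return f"{algo} mismatch"
    return None
```

Checking the size first is why the sed edit above (econf to fooconf,
two bytes longer) was reported as a size mismatch rather than a hash
mismatch.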

Prefix Mac:

root@mac:/gentoo/usr/portage/xfce-base/garcon $ sha512sum garcon-0.6.1.ebuild
dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7  garcon-0.6.1.ebuild
root@mac:/gentoo/usr/portage/xfce-base/garcon $ grep EBUILD.*0.6.1.*SHA512 Manifest
EBUILD garcon-0.6.1.ebuild 1018 SHA512 dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7
root@mac:/gentoo/usr/portage/xfce-base/garcon $ sed -i -e "s,econf,fooconf," garcon-0.6.1.ebuild
root@mac:/gentoo/usr/portage/xfce-base/garcon $ sha512sum garcon-0.6.1.ebuild
42f04d921b82955f8ac6237b4f2ebd78fad1cb0f2ab3cbf0f11b4b055b5a20c0cba6f2a37639b759a8912ec728bbde2f491b57d7565066f3ef5f4587e8f787a5  garcon-0.6.1.ebuild
root@mac:/gentoo/usr/portage/xfce-base/garcon $ ebuild garcon-0.6.1.ebuild compile
>>> Downloading 'http://distfiles.gentoo.org/distfiles/garcon-0.6.1.tar.bz2'
[...]

root@mac:/gentoo/usr/portage/xfce-base/garcon $ grep -r thin /gentoo/usr/portage/metadata/layout.conf
# Use thin Manifests for Git
thin-manifests = true
root@mac:/gentoo/usr/portage/xfce-base/garcon $ sed -i -e "s,thin-manifests = true,thin-manifests = false," /gentoo/usr/portage/metadata/layout.conf

root@mac:/gentoo/usr/portage/xfce-base/garcon $ ebuild garcon-0.6.1.ebuild prepare
>>> Existing ${T}/environment for 'garcon-0.6.1' will be sourced. Run
>>> 'clean' to start with a fresh environment.
 * Missing digest for '/gentoo/usr/portage/xfce-base/garcon/garcon-0.6.1.ebuild'

root@nindamos:/usr/local/gentoo/usr/portage/xfce-base/garcon $ mkdir files
root@nindamos:/usr/local/gentoo/usr/portage/xfce-base/garcon $ touch files/foo
root@nindamos:/usr/local/gentoo/usr/portage/xfce-base/garcon $ ebuild garcon-0.6.1.ebuild prepare
 * garcon-0.6.1.tar.bz2 BLAKE2B SHA512 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
!!! A file is not listed in the Manifest: '/usr/local/gentoo/usr/portage/xfce-base/garcon/files/foo'
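
The "file is not listed in the Manifest" check seen in both transcripts
boils down to a set difference between what is on disk and what the
Manifest names. A minimal sketch, again not the real portage or
hashverify logic: the function names are made up, and it assumes that
AUX entries are stored relative to files/ and that DIST entries refer to
distfiles outside the package directory.

```python
import os


def listed_paths(manifest_text):
    """Return the package-relative paths a Manifest claims to cover."""
    paths = set()
    for line in manifest_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or something that is not an entry
        kind, name = fields[0], fields[1]
        if kind == "AUX":
            # Assumption: AUX names are relative to the files/ subdir.
            paths.add("files/" + name)
        elif kind in ("EBUILD", "MISC"):
            paths.add(name)
        # DIST entries are skipped: distfiles live elsewhere.
    return paths


def unlisted_files(directory, manifest_name="Manifest"):
    """List files under directory that the Manifest does not mention."""
    with open(os.path.join(directory, manifest_name)) as f:
        listed = listed_paths(f.read())
    found = []
    for root, _dirs, files in os.walk(directory):
        for fname in files:
            rel = os.path.relpath(os.path.join(root, fname), directory)
            rel = rel.replace(os.sep, "/")
            # The Manifest itself is never listed in itself.
            if rel != manifest_name and rel not in listed:
                found.append(rel)
    return sorted(found)
```

For the garcon directory above, a call like unlisted_files(".") would be
expected to report files/foo and garcon-0.7.0.ebuild, matching what
portage complained about.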

> Now I may be all wrong with trying to implement the verification myself,
> but that's a separate topic.  gemato should work fine.

I think it's a worthy cause just for the fact that multiple
implementations of a protocol are always good to drive improvement and
show up the corner cases.

Maybe it shouldn't be hidden away in the prefix tree so others can more
easily become aware of and contribute to it.

> I've checked some of the digests you mentioned and they look ok.  So I'm
> wondering whether perhaps you got caught in the middle of a sync.  This
> used to be much less of a problem because of per-ebuild-dir integrity,
> but now the entire tree requires to be consistent.  I'll look into
> re-activating my symlink-flip, which should make the switch atomic, but
> I don't know what rsync is doing if the symlink is flipped during a
> sync.  It reduces the invalid window somewhat I guess.

I've retried a couple of times today. I deleted timestamp.chk several
times and resynced until I saw no delta any more. There actually was a
delta on the second and sometimes third run, mostly Manifest.gz files
and metadata. It still fails reliably on second-level Manifests, e.g.
dev-dotnet/Manifest.gz and sys-libs/Manifest.gz.

It looks as if regeneration of Manifests and md5-cache runs almost
continuously, not just at minutes 26 and 56 of every hour as rsync's
motd says. Is that intentional, or are perhaps multiple instances of
the script running amok?
-- 
Micha
Elephants don't play chess!
