Hi Fabian,

On Wed, Feb 21, 2018 at 10:04:42AM +0100, Fabian Groffen wrote:
> To give you an idea; the current tree is getting its Manifests from
> hashgen.c, which you can find in scripts/rsync-generation/hashgen.c.
> The hashverify tool, which I'm currently working on, is basically an
> addition to that file (doing argv[0] detection) to perform the
> verification. At this time of writing, I have the gpg-verification and
> single file entry verification in place. I'm still trying to close the
> gap in checking the dirs in particular looking for files that are not
> listed in the Manifest.

I've been following, or rather post-reading, the discussion about
GLEP-74 and it seems a lot of thought and security considerations have
gone into it. Are you following it for your implementation?

I see you're using OpenMP. That looks very neat and could certainly
make it scale better than gemato. BTW: I remember OpenMP being plain
missing from clang until recently. Ah, https://clang-omp.github.io/
says 3.7 and onwards have native support.

> hashgen currently runs in 30s or so on the tree to generate manifests.
> I hope it can verify in the same amount of time (we're talking about a
> Quad G5 PowerPC machine here with rusty old spinning disks), leaving it
> in a much better position to be used for Prefix, since we tend to have
> slower/older machines around.

I can only believe that number if the portage tree fits into the fs
cache (RAM). Then it would be purely CPU-bound, I guess, and C could
play out its advantage. On a machine that has just synced the tree
using rsync it would also still be in RAM as a working set (if there
was sufficient RAM) and the same consideration would apply. On my
MacBook Air I get 625012k from du for the tree. So this should be
alright for any machine with 1GB+ of RAM that's not doing anything else
at the time. As soon as the tree doesn't completely fit into RAM, the
later files of the rsync will push the first files out of the cache.
So for verification, the reading will start from scratch, making the
verification I/O-bound and the advantage of a C implementation mostly
irrelevant, won't it?

As a quick number: On an ARM SBC of mine with a dual-core 1.2GHz
Cortex-A7, 1GB of RAM and the fs on a microSD card that does 22MB/s
sequential read, I get 400 seconds flat for the first run of gemato
verify and 386 for the second, when the fs cache is already primed. I
do see a bit of I/O-boundness on the first run, and on the second,
gemato fully hogs the CPU. It would be interesting to compare how much
of that is Python overhead and how much the actual crypto.
Unfortunately the gcc on that board doesn't have OpenMP support (yet).

gemato create runs 667s on the same board and falls back to a single
thread for some reason about a minute in. So I'd certainly take a
22-fold speedup if I could get it. :)

> We're not really looking forward to 15
> minutes of verification as some bugs have been reported to with gemato.

Well, with webrsync I was looking at around three minutes of download
and five minutes of unpacking and local rsync on my ARM SBCs. With
gemato it's now about 30 seconds to three minutes of rsyncing,
depending on the amount of changes, and six minutes of gemato checking.
So basically I've neither gained nor lost anything, but I feel much
more efficient by once again not downloading what hasn't changed. It
also shows that I have a high pain threshold in dealing with Gentoo on
small and old machines. Considering how long Gentoo users typically
wait for compiles to finish, I wouldn't have expected a bit of a wait
for gemato or hashverify to be much of an issue.

> Portage used to do 1) checking the digests of ebuilds and 2) checking
> for missing and extra files. I noticed that at least 1) is no longer
> present, which I find weird.
> I need checking this on normal Gentoo,
> (simply edit an ebuild and try to emerge it without updating its
> digest), but I have the suspicion this got disabled because the full
> tree verification should catch this. Needless to say, that's
> suboptimal, and not very secure IMO.

Ah, now I understand what you were getting at. I just verified: Both
checks are working on Gentoo Linux but not in Prefix Mac. It seems to
have to do with setting "thin-manifests" in layout.conf of the prefix
tree. Once I set that to false, prefix portage behaves the same as
Gentoo Linux portage.

Gentoo Linux:

[root@linux:/usr/portage/xfce-base/garcon] sha512sum garcon-0.6.1.ebuild
dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7 garcon-0.6.1.ebuild
[root@linux:/usr/portage/xfce-base/garcon] grep EBUILD.*0.6.1.*SHA512 Manifest
EBUILD garcon-0.6.1.ebuild 1018 SHA512 dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7
[root@linux:/usr/portage/xfce-base/garcon] sed -i -e "s,econf,fooconf," garcon-0.6.1.ebuild
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild compile
 * Digest verification failed:
 * /usr/portage/xfce-base/garcon/garcon-0.6.1.ebuild
 * Reason: Filesize does not match recorded size
 * Got: 1020
 * Expected: 1018
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild digest
>>> Creating Manifest for /usr/portage/xfce-base/garcon
[root@linux:/usr/portage/xfce-base/garcon] mkdir files
[root@linux:/usr/portage/xfce-base/garcon] echo foo > files/foo
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild prepare
 * garcon-0.6.1.tar.bz2 BLAKE2B SHA512 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
!!! A file is not listed in the Manifest: '/usr/portage/xfce-base/garcon/files/foo'
[root@linux:/usr/portage/xfce-base/garcon] touch garcon-0.7.0.ebuild
[root@linux:/usr/portage/xfce-base/garcon] ebuild garcon-0.6.1.ebuild prepare
 * A file is not listed in the Manifest: '/usr/portage/xfce-base/garcon/garcon-0.7.0.ebuild'
[root@linux:~] grep thin /usr/portage/metadata/layout.conf
# Use thin Manifests for Git
thin-manifests = false

Prefix Mac:

root@mac:/gentoo/usr/portage/xfce-base/garcon $ sha512sum garcon-0.6.1.ebuild
dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7 garcon-0.6.1.ebuild
root@mac:/gentoo/usr/portage/xfce-base/garcon $ grep EBUILD.*0.6.1.*SHA512 Manifest
EBUILD garcon-0.6.1.ebuild 1018 SHA512 dfa80c3e8c766af3d170536f6d7c48793da6bccd4f1647ac8194130a521eb852bf6872514a75e8da7d0eec02eca53d76e431c4371c6f9c891a68fa516fdca8b7
root@mac:/gentoo/usr/portage/xfce-base/garcon $ sed -i -e "s,econf,fooconf," garcon-0.6.1.ebuild
root@mac:/gentoo/usr/portage/xfce-base/garcon $ sha512sum garcon-0.6.1.ebuild
42f04d921b82955f8ac6237b4f2ebd78fad1cb0f2ab3cbf0f11b4b055b5a20c0cba6f2a37639b759a8912ec728bbde2f491b57d7565066f3ef5f4587e8f787a5 garcon-0.6.1.ebuild
root@mac:/gentoo/usr/portage/xfce-base/garcon $ ebuild garcon-0.6.1.ebuild compile
>>> Downloading 'http://distfiles.gentoo.org/distfiles/garcon-0.6.1.tar.bz2'
[...]
root@mac:/gentoo/usr/portage/xfce-base/garcon $ grep -r thin /gentoo/usr/portage/metadata/layout.conf
# Use thin Manifests for Git
thin-manifests = true
root@mac:/gentoo/usr/portage/xfce-base/garcon $ sed -i -e "s,thin-manifests = true,thin-manifests = false," /gentoo/usr/portage/metadata/layout.conf
root@mac:/gentoo/usr/portage/xfce-base/garcon $ ebuild garcon-0.6.1.ebuild prepare
>>> Existing ${T}/environment for 'garcon-0.6.1' will be sourced. Run
>>> 'clean' to start with a fresh environment.
 * Missing digest for '/gentoo/usr/portage/xfce-base/garcon/garcon-0.6.1.ebuild'
root@nindamos:/usr/local/gentoo/usr/portage/xfce-base/garcon $ mkdir files
root@nindamos:/usr/local/gentoo/usr/portage/xfce-base/garcon $ touch files/foo
root@nindamos:/usr/local/gentoo/usr/portage/xfce-base/garcon $ ebuild garcon-0.6.1.ebuild prepare
 * garcon-0.6.1.tar.bz2 BLAKE2B SHA512 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
!!! A file is not listed in the Manifest: '/usr/local/gentoo/usr/portage/xfce-base/garcon/files/foo'

> Now I may be all wrong with trying to implement the verification myself,
> but that's a separate topic. gemato should work fine.

I think it's a worthy cause just for the fact that multiple
implementations of a protocol are always good to drive improvement and
expose the corner cases. Maybe it shouldn't be hidden away in the
prefix tree, so that others can more easily become aware of it and
contribute to it.

> I've checked some of the digests you mentioned and they look ok. So I'm
> wondering whether perhaps you got caught in the middle of a sync. This
> used to be much less of a problem because of per-ebuild-dir integrity,
> but now the entire tree requires to be consistent. I'll look into
> re-activating my symlink-flip, which should make the switch atomic, but
> I don't know what rsync is doing if the symlink is flipped during a
> sync. It reduces the invalid window somewhat I guess.

I've retried a couple of times today. I deleted timestamp.chk several
times and resynced until I saw no delta any more. There actually was a
delta on the second and sometimes third run, mostly Manifest.gz files
and metadata. It still fails reliably on second-level Manifests, e.g.
dev-dotnet/Manifest.gz and sys-libs/Manifest.gz.

It looks as if regeneration of Manifests and md5-cache runs almost
continuously, not just at minutes 26 and 56 of each hour as rsync's
motd says.
Is that intentional, or are perhaps multiple instances of the script
running amok?

-- 
Micha

Elephants don't play chess!
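
P.S. To make the two per-package checks from the transcripts above
concrete, here is a minimal Python sketch of how I understand them: 1)
compare a file's size and SHA512 against its Manifest entry, and 2)
report files that no entry lists. It is neither Portage's, gemato's nor
hashverify's actual code. The function names are made up for
illustration, and it assumes every entry name is a path relative to the
package directory, as with the EBUILD line shown above. Other entry
types (AUX, DIST) and compressed or nested Manifests are ignored.

```python
import hashlib
import os


def parse_manifest_line(line):
    """Parse 'TYPE name size HASH value [HASH value ...]' into a dict."""
    fields = line.split()
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return {"type": fields[0], "name": fields[1],
            "size": int(fields[2]), "hashes": hashes}


def verify_entry(directory, entry):
    """Return None if the file matches its entry, else a reason string."""
    path = os.path.join(directory, entry["name"])
    if not os.path.isfile(path):
        return "file missing"
    # Cheap check first: a size mismatch needs no hashing at all.
    actual_size = os.path.getsize(path)
    if actual_size != entry["size"]:
        return "size mismatch: got %d, expected %d" % (
            actual_size, entry["size"])
    with open(path, "rb") as handle:
        digest = hashlib.sha512(handle.read()).hexdigest()
    if digest != entry["hashes"].get("SHA512"):
        return "SHA512 mismatch"
    return None


def unlisted_files(directory, entries):
    """Return sorted relative paths of files that no entry lists."""
    listed = {entry["name"] for entry in entries}
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), directory)
            # The top-level Manifest itself is never listed.
            if rel != "Manifest" and rel not in listed:
                found.append(rel)
    return sorted(found)
```

Feeding it the EBUILD line from the grep output and the edited ebuild
should give a size mismatch like the one Portage reports, and a stray
files/foo should show up in unlisted_files().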