On 30-06-2020 13:13:29 -0500, Sid Spry wrote:
> On Tue, Jun 30, 2020, at 1:20 AM, Fabian Groffen wrote:
> > Hi,
> >
> > On 29-06-2020 21:13:43 -0500, Sid Spry wrote:
> > > Hello,
> > >
> > > I have some runnable pseudocode outlining a faster tree verification
> > > algorithm. Before I create patches I'd like to see if there is any
> > > guidance on making the changes as unobtrusive as possible. If the
> > > radical change in algorithm is acceptable I can work on adding the
> > > changes.
> > >
> > > Instead of composing any kind of structured data out of the portage
> > > tree my algorithm just lists all files and then optionally batches
> > > them out to threads. There is a noticeable speedup by eliding the
> > > tree traversal operations which can be seen when running the
> > > algorithm with a single thread and comparing it to the current
> > > algorithm in gemato (which should still be discussed here?).
> >
> > I remember something that gemato used to use multiple threads, but
> > because it totally saturated disk-IO, it was brought back to a single
> > thread. People were complaining about unusable systems.
> >
>
> I think this is an argument for cgroups limits support on the portage
> process or account as opposed to an argument against picking a better
> algorithm. That is something I have been working towards, but I am only
> one man.

But this requires a) cgroups support, and b) the privileges to use it.
Shouldn't be a problem in the normal case, but just saying.

> > In any case, can you share your performance results? What speedup did
> > you see, on warm and hot FS caches? Which type of disk do you use?
> >
>
> I ran all tests multiple times to make them warm off of a Samsung SSD, but
> nothing very precise yet.
>
> % gemato verify --openpgp-key signkey.asc /var/db/repos/gentoo
> [...]
> INFO:root:Verifying /var/db/repos/gentoo...
> INFO:root:/var/db/repos/gentoo verified in 16.45 seconds
>
> sometimes going higher, closer to 18s, vs.
>
> % ./veriftree.py
> 4.763171965983929
>
> So roughly an order of magnitude speedup without batching to threads.

That is kind of a change. Makes one wonder if you really did the same
work.

> > You could compare against qmanifest, which uses OpenMP-based
> > parallelism while verifying the tree. On SSDs this does help.
> >
>
> I lost my notes -- how do I specify to either gemato or qmanifest the GnuPG
> directory? My code is partially structured as it is because I had problems
> doing this. I rediscovered -K/--openpgp-key in gemato but am unsure for
> qmanifest.

qmanifest doesn't do much magic out of the standard gnupg practices.
(It is using gpgme.) If you want it to use a different gnupg dir, you
may change HOME, or GNUPGHOME.

Thanks,
Fabian

-- 
Fabian Groffen
Gentoo on a different level
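For readers following the thread, the approach described in the quoted
text (list every file, then optionally batch the list out to threads)
could be sketched roughly as below. This is not veriftree.py, which is
not shown in this message. The names list_files, sha512_of and hash_tree
are hypothetical, SHA-512 is picked only for illustration, and the
sketch stops at computing digests: comparing them against Manifest
entries and checking the OpenPGP signature are left out.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def list_files(root):
    """Return every file under root as one flat, sorted list of paths."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def sha512_of(path, chunk_size=1 << 20):
    """Hash one file in fixed-size chunks so large files do not fill memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, workers=1):
    """Map each file path under root to its SHA-512 hex digest.

    With workers <= 1 the files are hashed serially; otherwise the flat
    list is handed to a thread pool (hypothetical batching strategy).
    """
    paths = list_files(root)
    if workers <= 1:
        return {path: sha512_of(path) for path in paths}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(sha512_of, paths)))
```

Calling hash_tree("/var/db/repos/gentoo") would hash serially, and
passing workers=4 would use four threads. Whether threads help depends
on the disk, as the discussion above about saturated disk-IO suggests.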