Den ons 17 juni 2020 kl 17:04 skrev Marc Espie <es...@nerim.net>:

>
> > > > > The concept you need to understand is snapshot shearing.
> > > > > A full package snapshot is large enough that it's hard to
> guarantee that
> > > > > you will have a full snapshot on a mirror at any point in time.
> > > > > In fact, you will sometimes encounter a mix of two snapshots (not
> that often,
> > > > > recently, but still)
> > > > > Hence, the decision to not have a central index for all packages,
> but to
> > > > > keep (and trust) the actual meta-info within the packages proper.
> > > >
> > > > Sorry, I guess I should've responded to this as well. Isn't snapshot
> shearing going to be a problem regardless of the existence of a single
> central-index? For instance, pkg_add notices a chromium update, which
> requires a newer version of a dependency that hasn't been propagated to the
> mirror yet.
>
> > Even with snapshot shearing though, having this index file could provide
> a substantial speed upgrade. Instead of having to check *all* installed
> package's header for updates, you could use the index to know the subset of
> packages that you expect to have actually changed, and only download
> *those* packages' headers. If the expected "combined" sha of a given
> package doesn't match the index's version, then the mirror is clearly out
> of sync and we could abort an update as usual.
>
>
Do think of what you call "the index file" in terms of "I check/replace
some 100+G of snapshots and packages every 24h", at which point do you
replace that single file, before, under or after none,most,all packages for
your arch are replaced? Will it be synced when the copying passes "i" for
"index.txt" in that packages folder?

What happens if a sync gets cut off, restarted and/or if two syncs suddenly
run into eachother and replace files as they go?

What if a new batch of amd64/i386 files appears while one of the ongoing
syncs run, do you restart over and hope yet another new one doesn't appear
while that one is running?

This is the reality of snapshot package today:

du -sh snapshots/packages/*
34.5G snapshots/packages/aarch64
52.4G snapshots/packages/amd64
18.4G snapshots/packages/arm
44.1G snapshots/packages/i386
24.1G snapshots/packages/mips64
10.2G snapshots/packages/mips64el
26.7G snapshots/packages/powerpc
25.4G snapshots/packages/sparc64

Whatever limitations Marcs design has, it makes it possible for us
mirror admins to sync with some kind of best-effort while still giving most
openbsd users the ability to have pkg_add -u leave you with a working
package eco-system on a daily basis. If the cost is that it takes 40
minutes at night from crontab, then I would not trade a greppable file for
losing some or a lot of the above-mentioned gotchas that the current system
somehow actually handles.

Now if someone invents a decent piece of code to use http connection
pooling, quic/http3/rsync or whatever to speed up getting the required
info, I'm sure we mirror admins would be happy to add/edit our server
programs to serve it.

-- 
May the most significant bit of your life be positive.

Reply via email to