Julian Foad wrote on Tue, Jan 25, 2022 at 21:43:44 +0000:
> Daniel Shahaf wrote:
> > Julian Foad wrote on Thu, Jan 20, 2022 at 21:03:02 +0000:
> >> The only case in which a simple per-WC setting might be unsatisfactory
> >> is the following combination:
> >  
> > Why would it be the only case?
> 
> I assert that per-WC control suffices if any of the conditions I listed
> is false.

I understood the form of your argument; I just didn't understand why the
argument was correct.  Saying that the case you've outlined is the
_only_ one, that there isn't _any_ exception, is a non-trivial claim.
(For instance, that's exactly the claim to fame of the Ω(n log n) lower
bound on comparison-based sorting.)

> > I agree that that subset's pristines are necessarily able to be stored
> > locally at least from time to time, but no more than that.  It's not
> > _necessarily_ possible to store those files' pristines permanently [...]
> 
> You rightly point out that cases may exist where the pristines-wanted
> subset is only needed some of the time, and the rest of the time it's
> important to recover that space for other uses. That implies the
> pristines-wanted subset is "huge" -- otherwise by definition the space
> they occupy would not be unacceptable to store permanently.
> 
> When you need those pristines, it would therefore be OK to disable
> pristines-on-demand for the whole WC, because that isn't hugely worse
> than if you could choose just the subset. (Saving a minority of
> the pristines space is not a driving requirement for this feature, even
> if it would be nice to have.)

Haven't you just moved your goalposts?  I quote:

> > >    - the WC data set is "huge" […] in total; and
> > >    - there is a subset of files […]
> > >    - that subset of files is not "huge" in total; and

The subset of the files was "not 'huge' in total" upthread and is
responsible for "a minority of the pristine space" here.  Which is it?
We can't agree on handling this use-case until we agree on what this
use-case is.

> In those cases, switching the WC between pristines-present and
> pristines-on-demand would be necessary. Such "switching" is probably a
> strong requirement anyway, even outside this case, as I should think it
> would be considered poor UX if it were not possible to change one's mind
> without a re-checkout.

Even if it's poor UX, we should still ask whether accepting it would be
a fair trade for not having to implement toggleability.  Every in-place
conversion path is one more place for bugs to hide, much as having
«svn upgrade» and «svnadmin upgrade» at all means there could be bugs
that affect upgraded wc's/repositories but not new ones.  (For extra
fun, the bug could be
latent and only surface after a further in-place upgrade.  Debian has
had such bugs, e.g., 
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620958#30>.)

> > Let me try to sketch a use-case for wanting only _some_ files to be
> > pristineless. [...]
> 
> I don't dispute that some cases exist where it would be nice to have
> per-file control. I still see it as merely "nice" and still do not see
> how it could be considered essential or very important.
> 

Hang on.  Why do you assume that if someone has big files, then they're
necessarily all in one directory while all the accompanying texty (or
otherwise diffable) files are in another directory?  Sure, that's
exactly kfogel's use-case (described upthread), but it's not the only
way to structure a repository.

For instance, take our own /repos/asf/subversion/site/publish/download
and /repos/dist/release/subversion.  Those are separate repositories,
but Subversion (the software) does not dictate that.  If Infra had
decided to do things differently, to put the artifacts in the /site
directory, then only dev@ subscribers who participate in "… up for
signing/testing" threads would have had a reason to download the full
/site tree; everyone else (say, translators) would have needed only the
texty bits.  [That's actually a use-case for server-provided viewspecs;
and «svn checkout --depth=infinity» would override them…]

> It's not completely clear to me what you mean to draw out in your
> 'libsvn*.so' example. It seems to be a case where the user wants
> efficient 'commit' of a few files which are large enough to care about
> that operation (let's assume they are diffable enough for their
> pristines to be useful) -- but make up only a small subset of the total
> WC size so omitting pristines of the majority of the WC, which is huge,
> would be important to save space. Yes, that's a case where subset
> control would be nice.
> 
> But I would argue that for that case there are alternative and even
> better solutions than managing pristines. The user could make the WC
> shallow instead, omitting the pristines *and* working files of release
> branches they don't currently need to work on while behind the narrow
> downlink.

My use-case involved a user who wished to have 1.13's binaries available
to them offline (so they could reproduce and prepare fixes on the road).
Your proposed workflow does not support the assumed user's workflow, so
I don't see how it is a "better solution".

When all is said and done, if you want to make a fuel-efficient ambulance,
you look into extracting more joules of mechanical work out of each
litre of fuel.  You don't just park the ambulance for a day a week.

> Or they could have their main WC pristine-less and check out a separate
> WC, with pristines, containing just the minority parts that they need offline.

This argument is also an argument for closing issue #525 as WONTFIX.
"Use «export» rather than «checkout» and keep a parallel depth=empty
working copy wherein you'll pull (using «svn update --parents») just the
files you'll need."  Want «svn status»?  Use lndir(1) and «find ./
-type f».  Want «svn diff»?  Use «zfs snapshot» and diff(1).  Want «svn
update»?  Just «export» again since we assume the network connection is
wide and cheap and the files are undeltifiable.

For the last one, if we don't assume a wide and cheap downlink, we can
think of adding an «svn update --rather-than-download-pristines,-copy-
the-following-file-into-the-pristine-store-if-its-sha1-matches-a-sha1-of-
a-file-the-server-says-it\'s-about-to-send-us=/some/local/path» option,
which would do what its name says.  It'd be similar to «rsync --copy-dest».
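
To make that concrete, here is a rough sketch of the check such an
option would perform.  Everything in it is an assumption made for
illustration: the function names, the idea that the server's announced
checksums arrive as a plain list of hex SHA-1 strings, and the
two-character fan-out directory layout of the store.  It is not how
libsvn_wc is implemented, and it deliberately writes to a store
directory passed in by the caller rather than to a real working copy.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def seed_from_local_copy(candidate: Path,
                         announced: Iterable[str],
                         store: Path) -> Optional[Path]:
    """Copy `candidate` into `store` if its SHA-1 is one the server
    (hypothetically) said it is about to send.  Return the destination
    path on a match, or None if the file is of no use."""
    checksum = sha1_of(candidate)
    if checksum not in set(announced):
        return None
    # Hypothetical layout: <store>/<first two hex digits>/<sha1>
    dest = store / checksum[:2] / checksum
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        shutil.copyfile(candidate, dest)
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "big.bin"
        local.write_bytes(b"example payload")
        wanted = [hashlib.sha1(b"example payload").hexdigest()]
        print(seed_from_local_copy(local, wanted, Path(tmp) / "store"))
```

The point of keying on the checksum rather than on the path is that the
local file may live anywhere (an old export, another working copy); only
its content has to match.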

And if we do make that assumption, one could probably implement a FUSE
filesystem that fetches pristine files on-demand.  There's exactly such
a solution linked from #525 ("scord"), but it was never updated for
wc-ng (svn ≥1.7).

> > Which brings me to a less contrived / more general point: What if the
> > user _knows in advance_ they'll need a pristine?  Shouldn't there be: —
> >  
> > - a way to say "I'm about to change a large, diffable file; detranslate
> >  it into the pristine store before I touch it"?  Perhaps even make
> >  files read-only at the OS level (as with svn:needs-lock) [...]?
> 
> > - a way to say "[...] download a pristine for this file now"?
> 
> > - «svn commit --keep-pristines» [...]?
> 
> At one level these are some logical extensions to the control that users
> would have over the pristine-management process. These additional
> controls might be valuable in certain cases.
> 
> In the context of the main driving use cases (fast connectivity to the
> repo) these would be marginal tweaks with no real benefit. They could
> have real benefits in the scenarios that we looked at above where there
> is neither plenty of space nor plenty of connectivity, and when per-file
> control of pristines is available.
> 
> We should consider making sure the API exposes these operations to
> keep/fetch/store pristines so that they could potentially be added to
> the UI of clients later. The 'svn' client would not necessarily ever
> want to expose this degree of control: it's likely too much to add to
> the user's cognitive load. It seems more something that certain scripts
> and clients built for automation tasks might benefit from, so might make
> sense just in APIs and bindings.

Agree that we should keep these in mind when designing the API.  As to
whether these belong in svn(1), in the API, in tools/, or in third-party
tools, we can cross that bridge when we come to it.

Cheers,

Daniel
