On Wed, Nov 23, 2022 at 9:53 AM Julian Foad <julianf...@apache.org> wrote:
> Nathan, I see you replied enthusiastically and mentioned "I have much to
> say on both of these [TODOs] but I won't go into detail yet...". It
> seems to me it could be helpful to get that started sooner rather than
> later, too, if those issues still need hashing out.


Thanks for the nudge.

Previously we got stuck trying to choose the user-facing name of this
feature and its command line switches.

Currently the CLI switch is --store-pristine={yes|no}.

I'm okay with this, but for completeness I'll mention that earlier in
the year there was a little bit of pushback because pristines, up
until now, have been an internal implementation detail that users
needn't concern themselves with. (Except that they double the storage
space...)

I've been trying to think of something better for months now, and
here's what I've come up with:

--optimize=storage
--optimize=network

Rationale:

* Self-documenting.

* Easy to explain: --optimize=storage saves storage space;
  --optimize=network reduces network accesses to the repository
  server.

* Users don't need to know about pristines. There aren't several levels
  of abstraction between the option name and why the user cares about
  it.

* Extensible. Maybe we can think of other ways to optimize for network
  bandwidth, for example.
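
To make the proposal concrete, here is a minimal sketch of how the two
proposed values could map onto the current switch. It is illustrative
only and is not Subversion's actual option handling: the function names,
the program name, and the mapping table are all hypothetical, and the
mapping assumes that --optimize=storage corresponds to
--store-pristine=no and --optimize=network to --store-pristine=yes.

```python
# Hypothetical sketch: not Subversion's actual option parsing.
import argparse

# Assumed mapping from the proposed --optimize values to the current
# --store-pristine values.
OPTIMIZE_TO_STORE_PRISTINE = {
    "storage": "no",   # skip local pristines to save disk space
    "network": "yes",  # keep local pristines to avoid fetching them
}


def build_parser():
    """Build a parser that accepts only the two proposed values."""
    parser = argparse.ArgumentParser(prog="svn-checkout-sketch")
    parser.add_argument(
        "--optimize",
        choices=sorted(OPTIMIZE_TO_STORE_PRISTINE),
        default="network",  # assumed default, matching today's behavior
    )
    return parser


def store_pristine_for(argv):
    """Return the --store-pristine value implied by the given arguments."""
    args = build_parser().parse_args(argv)
    return OPTIMIZE_TO_STORE_PRISTINE[args.optimize]
```

Under these assumptions, passing ["--optimize=storage"] yields "no" and
passing no arguments yields "yes", so the user never has to see the word
"pristine" on the command line.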

The docs can give more user-facing explanation, including tradeoffs,
which SVN operations are affected, and example scenarios to help users
choose. It should be much easier to write -- and read -- than what we
currently have at the draft release notes [1].

As for example scenarios, while the original premise was to save space
on large files that don't change often, i525pod is also great in other
situations, such as checking out a large source tree on a ramdrive
(limited space), or on the same machine as the repo, or on a storage-
limited embedded device. (I've tried i525pod in all 3 of these
scenarios!)

Downsides:

* Admittedly, --optimize=network isn't the best name in all scenarios.
  Notably, this is a misnomer when the repository server is on the same
  machine as the working copy, but that might not matter because it's
  the default. (And I might suggest trying --optimize=storage in that
  scenario.)

* If we ever want to do other cool things with pristines, such as an
  option to keep more locally cached history, these names won't be
  right for that.

* These option names haven't helped me come up with a better name for
  the feature itself.

There is an advantage to using --store-pristine={yes|no}: We don't need
to rename the feature because Pristines On Demand and the CLI options
are named similarly.

The disadvantage of --store-pristine={yes|no} is that the feature is
more burdensome for us to explain and for others to learn about,
especially from a non-technical standpoint. How would you explain this
feature in a press release, or in a short blurb (or dare I say, tweet)
about "What's new in Subversion 1.15?"

Some other possibilities that were discussed:

I'll mention these for completeness but note that if --optimize=x is
shot down, I'd rather use --store-pristine={yes|no} than any of these:

* Hydrate and dehydrate -- perhaps the terms that appear most in dev
  discussions. I don't recommend these in user-facing areas because
  they aren't self-documenting. Users can't deduce what these actually
  do for them. Users might mistakenly think that their working
  files would be hydrated or dehydrated in some way. Users would have
  to learn about pristines to know what is being hydrated or
  dehydrated, eliminating any useful abstraction.

* "Bare working copies" -- the draft release notes [1] use this term
  tentatively to explain that "bare" working copies save storage by not
  caching "BASE" files. Unfortunately, "bare" and "BASE" differ by only
  one letter (and capitalization) and I feel like the explanation is
  too complicated and doesn't bring us closer to a good result.

* Briefly discussed: "local BASE" or "remote BASE" -- but that's a
  misnomer because there's no such thing as "remote" BASE.

Well, you've been warned that I have much to say. :-)

Cheers,
Nathan
