Clarify whether svnadmin commands are safe to run on a live repo

2024-04-17 Thread Julian Foad
Yesterday I was pleased to notice that we had gained an "svnadmin 
build-repcache" command in 1.14, thanks to Denis Kovalchuk of VisualSVN.



https://subversion.apache.org/docs/release-notes/1.14#svnadmin-build-repcache
  https://svn.apache.org/r1875921

To an inexperienced svn admin, this feels like the kind of procedure 
that falls in the "repository repair" category, of procedures that are 
not necessarily safe to run unless one knows the intimate details of how 
it all works.


An administrator needs to know in particular whether they must stop the 
server before running this. It isn't stated in the help or the release 
notes. An explicit statement would help.


Looking at the code, I see it calls svn_repos_open3() which, according 
to its doc string, blocks until any exclusive lock is gone and then 
acquires a shared lock on the repo. That sounds to me like it's safe to 
run on a "live" repo.


Do I read that right?

Index: subversion/svnadmin/svnadmin.c
===
--- subversion/svnadmin/svnadmin.c  (revision 1915904)
+++ subversion/svnadmin/svnadmin.c  (working copy)
@@ -311,8 +311,12 @@
 "usage: svnadmin build-repcache REPOS_PATH [-r LOWER[:UPPER]]\n"
 "\n"), N_(
 "Add missing entries to the representation cache for the repository\n"
 "at REPOS_PATH. Process data in revisions LOWER through UPPER.\n"
 "If no revision arguments are given, process all revisions. If only\n"
 "LOWER revision argument is given, process only that single 
revision.\n"

+"\n"), N_(
+"This is safe to run on a repository that is in use. It holds a 
shared\n"

+"lock while it runs, blocking commits. It may hold the lock for a\n"
+"considerable time. It is safe to interrupt and run again.\n"
)},
{'r', 'q', 'M'} },

Would this wording be useful and correct?

Then I found myself wondering the same question about all the other 
svnadmin commands. Without tracing all the code, it "feels" to me like 
this is the common pattern used in svnadmin and so probably the same or 
similar statements apply to most or all of its subcommands. Does anyone 
want to check the facts, or express an opinion about whether we should 
update all the help strings or if we can add a single statement in 
"svnadmin help" that covers all of them?


- Julian


--
- Matrix me: @julian:foad.me.uk 
- Follow me: @jul...@fed.foad.me.uk 


Re: svn extension to access OAuth2-protected repositories

2024-04-04 Thread Julian Foad

Evan,

I for one think it would be great for Subversion to have more modern 
auth options. Thank you for presenting your proof of concept of such an 
extension.


As to your direct question, while it's possible someone might pick up 
the idea and run with it, in all honesty it is most likely you will have 
to push it along yourself. If you propose small changes and ask small 
specific questions, and show willingness to figure out the rest as you 
already have been doing, that way in general you are most likely to 
receive answers and help. You are off to a good start by presenting 
complete scripts and recipes that anyone could follow. That's a very 
good way to keep proceeding, and I would expect in due course you might 
at least get someone trying it and reporting back. I hope this provides 
you some encouragement.


- Julian

--
- Matrix me: @julian:foad.me.uk 
- Follow me: @jul...@fed.foad.me.uk 



Re: Removing the old move-tracking experiment

2024-02-20 Thread Julian Foad

Nathan Hartman wrote:

I am okay with this (conceptually [...]).

More specifically, I am okay with removing it from trunk, but may I 
suggest moving it to a branch, e.g., 'svnmover'? [...]


Thank you for your thoughts on this, Nathan.  I went ahead, removing it 
from trunk (r1915902) and making a branch 'move-tracking-3' (r1915903) 
with a BRANCH-README (1915904).  (The previous two branches in that 
series are earlier related work.)


- Julian

--
- Matrix me: @julian:foad.me.uk 
- Follow me: @jul...@fed.foad.me.uk 



Removing the old move-tracking experiment

2024-02-19 Thread Julian Foad
 subversion/tests/cmdline/svnmover_tests.py (deleted)
===
Index: subversion/tests/cmdline/svntest/actions.py
===
--- subversion/tests/cmdline/svntest/actions.py (revision 1915144)
+++ subversion/tests/cmdline/svntest/actions.py (working copy)
@@ -432,27 +432,6 @@
   return output


-def run_and_verify_svnmover(expected_stdout, expected_stderr,
-*varargs):
-  """Run svnmover command and check its output"""
-
-  expected_exit = 0
-  if expected_stderr is not None and expected_stderr != []:
-expected_exit = 1
-  return run_and_verify_svnmover2(expected_stdout, expected_stderr,
-  expected_exit, *varargs)
-
-def run_and_verify_svnmover2(expected_stdout, expected_stderr,
- expected_exit, *varargs):
-  """Run svnmover command and check its output and exit code."""
-
-  exit_code, out, err = main.run_svnmover(*varargs)
-  verify.verify_outputs("Unexpected output", out, err,
-expected_stdout, expected_stderr)
-  verify.verify_exit_code("Unexpected return code", exit_code, 
expected_exit)

-  return exit_code, out, err
-
-
 def run_and_verify_svnmucc(expected_stdout, expected_stderr,
*varargs):
   """Run svnmucc command and check its output"""
Index: subversion/tests/cmdline/svntest/main.py
===
--- subversion/tests/cmdline/svntest/main.py(revision 1915144)
+++ subversion/tests/cmdline/svntest/main.py(working copy)
@@ -209,7 +209,6 @@
 svnauthz_validate_binary = os.path.abspath(
 '../../../tools/server-side/svnauthz-validate' + _exe
 )
-svnmover_binary = 
os.path.abspath('../../../tools/dev/svnmover/svnmover' + _exe)


 # Where to find the libtool script created during build
 libtool_script = os.path.abspath('../../../libtool')
@@ -905,12 +904,6 @@
   as list of lines (including line terminators)."""
   return run_command(svnversion_binary, 1, False, *varargs)

-def run_svnmover(*varargs):
-  """Run svnmover with VARARGS, returns exit code as int; stdout, stderr as
-  list of lines (including line terminators)."""
-  return run_command(svnmover_binary, 1, False,
- *(_with_auth(_with_config_dir(varargs
-
 def run_svnmucc(*varargs):
   """Run svnmucc with VARARGS, returns exit code as int; stdout, stderr as
   list of lines (including line terminators).  Use binary mode for 
output."""

@@ -2500,7 +2493,6 @@
   global svnsync_binary
   global svndumpfilter_binary
   global svnversion_binary
-  global svnmover_binary
   global svnmucc_binary
   global svnauthz_binary
   global svnauthz_validate_binary
@@ -2610,7 +2602,6 @@
 svnauthz_binary = os.path.join(options.tools_bin, 'svnauthz' + _exe)
 svnauthz_validate_binary = os.path.join(options.tools_bin,
 'svnauthz-validate' + _exe)
-svnmover_binary = os.path.join(options.tools_bin, 'svnmover' + _exe)

   ##

Index: tools/dev/svnmover/linenoise/LICENSE (deleted)
===
Index: tools/dev/svnmover/linenoise/README.markdown (deleted)
===
Index: tools/dev/svnmover/linenoise/linenoise.c (deleted)
===
Index: tools/dev/svnmover/linenoise/linenoise.h (deleted)
===
Index: tools/dev/svnmover/merge3.c (deleted)
===
Index: tools/dev/svnmover/ra.c (deleted)
===
Index: tools/dev/svnmover/scanlog.c (deleted)
===
Index: tools/dev/svnmover/svnmover.c (deleted)
===
Index: tools/dev/svnmover/svnmover.h (deleted)
===
Index: tools/dev/svnmover/util.c (deleted)
===
]]]

--
- Julian Foad

--
- Matrix me: @julian:foad.me.uk <https://matrix.to/#/@julian:foad.me.uk>
- Follow me: @jul...@fed.foad.me.uk <https://fed.foad.me.uk/julian>


Re: Getting to first release of pristines-on-demand feature (#525).

2022-11-23 Thread Julian Foad
I'm glad to see you all picking up this project again. While working on
this at the beginning of the year I turned on the pristines-on-demand
mode in some of my own WCs such as my 'Documents' tree which includes
lots of scanned paper docs. It works nicely for cases like this, and
feels right, the pristine store being mostly unpopulated when the
working files are mostly unchanging.

I meant to check back with you during the year, how we should take it
forward. The recent summary in this thread sounds about right. My own
capacity to contribute is steadily decreasing. So, thank you, dev
community: it's good to see people working together to make it happen.
It would be pleasing to see this being brought to a satisfactory state
and released.

Nathan, I see you replied enthusiastically and mentioned "I have much to
say on both of these [TODOs] but I won't go into detail yet...". It
seems to me it could be helpful to get that started sooner rather than
later, too, if those issues still need hashing out.

- Julian



SVN-4832 Authz perf regression 1.9 -> 1.10

2022-06-08 Thread Julian Foad
I did some profiling for SVN-4832 "Authz perf regression 1.9 -> 1.10"
.

I updated the issue with my findings, including a sample of the debug
timings output from my debug code.

TL;DR: if an authz file specifies a large number of ACLs and a large
number of users, it is slow. Nearly all the time is used by
svn_authz__parse() calling update_user_rights() this many times:

  (total number of ACLs in the authz file) x (total number of "concrete
users" mentioned in the authz file).

This bottleneck apparently was not present in the previous
implementation (svn 1.9), but I understand other cases were slow then.
The optimization in svn 1.10 apparently optimizes for other cases but
makes things worse in these cases.

As far as I can see the "update_user_rights" code looks functionally
trivial and the problem is just the accumulated time over a huge number
of iterations of it.

I am looking to see if there is anything that can be done about speeding
it up, whether algorithmic or local optimization. If anyone can lend
insight or a hand, I know there are some users who would appreciate it.

It's in subversion/libsvn_repos/authz_parse.c: update_user_rights()
called right at the end of expand_acl_callback(), which is called right
at the end of svn_authz__parse(). In pseudo-code:

  for each of all ACLs:
for each of all users:
  update_user_rights()

- Julian



Re: Subversion 1.10.0 end-of-life

2022-05-07 Thread Julian Foad
Just to add my support: I am glad to see the community adapting the
release policy to suit the current circumstances.


Re: Initial patch for storing the pristines-on-demand setting in the working copy (Issue 4889)

2022-04-21 Thread Julian Foad
Evgeny Kotkov wrote:
> first-cut implementation that persists the pristines-on-demand setting

This is great! Thank you Evgeny.

> The patch currently allows doing an `svn checkout --store-pristines=no`,
> which is going to create a working copy that doesn't store the pristine
> copies of the files and fetches them on demand.  The setting is persisted
> in wc.db.
> 
> 
> The patch doesn't include the following:
> 
> 1) An update for the tests and the test suite to run the tests in both modes.

Ack. An option to choose the mode is needed.

> 2) An update for `svn info` to display the value of the new setting.

Ack. That sounds simple enough.

> 3) An ability to take the --store-pristines value from a user config, perhaps
>   on a per-URL basis.

Ack. That can be simply a yes/no global default for starters. That is a
lower priority (low risk, and not blocking the rest of this).

> While working on the patch, I have stumbled across a couple of issues:
> 
> A) `svn upgrade` without arguments fails for a working copy with
> latest format

It would be nice to change it to a no-op with a friendly message. (We
have already discussed reasons why we are keeping the old format as the 
default.)

> B) Shelving and pristines-on-demand

Indeed shelving is not updated to work with pristines-on-demand. As an
experimental feature probably returning a simple "feature not supported"
error when pristines-on-demand is enabled would be sufficient.

> C) Bumping the related API
> 
>  This part originates from B).  For example, v3 shelving uses the libsvn_wc
>  APIs, such as svn_wc_revert6().  [...] those calls are going to fail [...]
>  [...]
>  Perhaps, we could bump the APIs that currently rely on the text-bases to
>  always be available.  And we could then make their deprecated versions
>  fail (predictably) for working copies that don't store pristine contents.

Any ideas what should the new (bumped) versions do? Just what they do
now, fail if and when pristines are missing? Fetch pristines first? Take
a parameter (or option config struct) telling whether to attempt
hydrating? Something else?

- Julian



Re: Pristines-on-demand: printing progress notifications

2022-04-21 Thread Julian Foad
Daniel Shahaf wrote:
> I'm a bit hesitant about disabling notifications _entirely_ in cat-cmd.c
> and diff-cmd.c.  Disabling all notifications (as opposed to only
> hydration-related notifications which we focus on right now) seems like
> it could easily have unintended consequences.  Do we do that elsewhere?

Thanks for the review and discussion.

In "svn" we set the callback to null in a few other places, not many:

  * if (opt_state.quiet)  # all subcommands
  * if (opt_state.xml)  # all subcommands
  * svn propedit  # /* We do our own notifications */
  * svn changelist --quiet  # redundant with "opt_state.quiet"?
  * svn x-shelve/x-unshelve --quiet  # redundant again?

Maybe doing that for "diff" is suppressing too much, as you suggest.

I couldn't previously think of other notifications we might use in
"diff" but now I can imagine things such as "now processing an external
at path X" (now (I haven't checked) or in the future).

It would be best to preserve the possibility of any such notifications
during "diff".

The same applies to "cat", logically, the only difference being a
difference of expectation (we don't expect any other progress
notifications to be printed for "cat" anyway, while for "diff" we might).

Done: in "diff" and "cat", suppress just these particular notifications:

  https://svn.apache.org/r1900110


The rest of my reply (below) is just discussion.

> «diff» isn't unique in having parseable output. [...] I am leaning to
> the opinion that «svn diff» isn't a special case, and «diff» and those
> other commands should all use the same rules for their output.
> [...]
> What are those rules?  Good question.  stdout isn't a good place for the
> hydration notifications, since those are of interest to the (human)
> producer of the output but not to the consumer of the output; [...]
>  
> «svn cat», on the other hand, _is_ unique, in that it's the only command
> whose output format is not controlled by us. [...]

Let me first summarise the logic that led me to the current implementation.
Diff and cat are (presently) the only two data-outputting commands that
may hydrate; the rest are action commands. The output of action commands
is generally feedback to the operator of the command, and adding more
forms of notification to their output is a natural fit.

For the data-outputting commands, I preferred to avoid changing the
output as it would potentially have more consequences on its consumers
than we are willing to deal with. Obviously "cat" is particularly
sensitive, as you point out, and we agree it should print no feedback.

That leaves "diff", which as you point out is categorically similar to
other data outputting commands. A particularity of diff format is it
explicitly allows additional output which parsers should ignore. In that
respect it would be acceptable to add new notifications in its stdout,
even more so than for the other commands whose output we control. But
the issue remains that progress feedback logically belongs to the
operator of the command and not to the consumer of its output (although
the two are often the same). It is reasonably for a consumer of the
output to expect that the output will be repeatable for the same logical
WC state, and will not change depending on the caching state of
Subversion's metadata area. So adding feedback to data-outputting
commands, mixed in with their output, is not a great idea.

Ideally progress feedback would go neither to stdout nor to stderr but
to a third stream. Technically we could implement this easily as an
OS-level stream, but while it's the obvious approach in a GUI, I don't
think the idea of a separate feedback stream is widely used in
command-line environments so it would not be readily understandable and
accessible to users.

Similarly, we could add options to control whether and where such
feedback goes. But at this stage I don't think that added complexity is
warranted, at least not as an add-on handling only this specific case.
Maybe if we were to re-think the command line interface in its entirety,
say as a "2.0" project, that's where it could make sense.

For now it doesn't seem to me that any further changes are needed at
this time, other than the change agreed above to the conditional logic
for the feedback suppression.

- Julian



Re: Question on release announcement mail

2022-04-10 Thread Julian Foad
Mark Phippard wrote:
>Looking at past release announcements, they include a section on who
>signed the release that looks like this: [...]
>I am kind of at a loss for how to produce this information. [...]
There's a 'release.py' subcommand that writes this email for you, if I recall 
correctly.
- Julian


Re: Pristines-on-demand: fix disabled tests (#4891)

2022-04-07 Thread Julian Foad
Johan Corveleyn wrote:
>Should be fixed now, with r1899654.

Great! Thanks.

>I suppose not passing the option uses wc-format-version=1.15

No...

>automatically? Or what is the default? Is format 1.8 the same as 1.14
>and anything in between?

Default would be the format '31' which is 1.8 through 1.14.

- Julian


Re: Pristines-on-demand: OK to merge to trunk?

2022-04-07 Thread Julian Foad
Nathan Hartman wrote:
> The branch worked for me when I last tested it and I saw no glaring
> issues so I have no objections to merging it soon. That said, I would
> encourage, if at all feasible, that we try to do two things first:
> decouple the format 32 and pod525 feature, and decide what the
> user-facing feature and its CLI switches should be called so that our
> internal "plumbing" names won't stick forever as the user-visible
> terms... If we can only do one of these, it should probably be the naming.

For "decouple the format 32 and pod525 feature", see email thread
"Pristines-on-demand=enabled == format 32?" 
.

Naming
==

What naming do we want? I'm perhaps too close to the code. Some
potentially good ideas have been floated in the past, none very
obviously best but perhaps we can pick a good one from them. Some
suggestions were along the lines of:

  * "bare WC"
  * "blob-optimised WC"
  * "[no] local cache" (of text-bases)
  * "[no] trade space for speed"

(just off the top of my head).

The feature name hardly appears in the UI, so far. The places I can
think of are:

  * a progress notification: "Fetching text bases...",
  * perhaps a dedicated option name for checkout/upgrade, as being
discussed in the "decouple" thread mentioned above;
  * release notes;
  * user guide (where currently a place-holder 'POD525' is used for the
feature name).

- Julian



Re: Pristines-on-demand=enabled == format 32?

2022-04-07 Thread Julian Foad
Johan Corveleyn wrote:
> Ah, yes, I think that makes #4889 a blocker.

Well, I'm having a hard time deciding what exactly we need and why.

I previously said "it's pretty clear it needs to be uncoupled" but
actually just now I've dived into it for a couple of hours, coding and
thinking, and it's not at all clear what this means.

Is it mainly about UI level naming in "checkout" and "upgrade" -- in
other words, that we would prefer the user to say

  "svn checkout --enable-POD525"

instead of (or in addition to) "--compatible-version=1.15"? And we would
not need to support the combination of "=1.15" without "--enable-POD525"
(in other words the feature is still coupled to the format in the
implementation), but require it to be specified explicitly and error out
if it isn't, in order to set a precedent that that's the option you'll
also be needing to use in future versions?

('POD525' used here as a place-holder for the feature name that is to be 
decided.)

If it's that kind of UI-level naming issue, then we probably want to
also put corresponding entry in "svn info" saying "POD525 enabled?
yes/no". And anywhere else the idea is exposed in the UI, if anywhere.

And/or, is it that we want to put internal APIs in place that let the
higher level code layers ask "is POD525 enabled?" and not have to change
those callers when 1.16/new-wc-format-33 comes around? But I don't see a
strong need for that. We are not making a strong case for any of this to
be exposed in public APIs at this time at all and the internal API calls
can surely be updated as and when needed.

And/or, is it that we really need users to be able to create a format-32
(v=1.15) WC that doesn't have POD525 enabled? I am pretty sure that is
not the case.

Am I missing some other need? What seems to be the problem really?


> I tried to suggest a slightly more flexible per-WC-config [...]

I appreciate your suggestions. I think you are on the right track for
forward-thinking and architecturally sound design. It's something we can
take into account when naming the parts and designing the config options.

- Julian



Pristines-on-demand: OK to merge to trunk?

2022-04-07 Thread Julian Foad
TL;DR: are we OK to merge the pristines feature
('pristines-on-demand-on-mwf' branch) to trunk soon, like early next week?

As said in "A status review" [1] in the long thread "A two-part vision
for Subversion and large binary objects.", next steps are reviewing and
handling the outstanding issues, and proposing merge to trunk. I think
these can be done in parallel as I don't see any that would block a
merge to trunk. So here is the proposal to merge to trunk, and then
complete the remaining work on trunk.

It feels to me like there is general consensus that this feature is
taking a form that will be acceptable for a first release of it (while
not perfect), and consensus for proceeding to get it into trunk and
subsequently including it in the next release. I'm too close to it to
make an independent assessment. Can anybody else comment?

If no objections, I plan to merge to trunk early next week.

[1] on dev@, 2022-04-05, 
https://lists.apache.org/thread/lm98og8jqonffcs250q5y3ft5r5qlmk5



Re: A two-part vision for Subversion and large binary objects.

2022-04-07 Thread Julian Foad
Julian Foad wrote:
> Pristines (#525):
>  - #4888 authz denied during textbase sync
>(an edge case issue, not sure if it's a blocker)
>  - #4889 per-WC config
>(wanted)
>  - #4891 fix disabled tests
>(a few different edge cases; much of the analysis is posted in the issue)
> 
> Getting multi-wc-format ready for release (#4883):
>  - #4885 WC upgraded and not-upgraded notifications
>(still open for some nice-to-haves, but probably done enough for MVP)
>  - #4886 config for default WC version for checkout & upgrade
>()
>  - #4887 clarify/unify option names for compatible-version
>(perhaps change '--compatible-version' to '--wc-compatible-version'
> or '--min-compatible-client')
>  - API review; thread: "multi-wc-format review"
>(state is APIs are mostly private and a bit messy; not clear what,
> if anything, we would want to change)

Further updates:

#4888: demoted to non-blocker
#4889: blocker, in progress
#4891: blocker, in progress (I've processed a bunch of the sub-issues in it)
#4885: done enough (now non-blocker)
#4886: not sure (currently marked non-blocker)
#4887: not sure (currently marked blocker)
API review: not sure
Merge to trunk: new thread "Pristines-on-demand: OK to merge to trunk?"

In #4891 "fix disabled tests", the remaining sub-issues don't look like
show-stoppers. Likely we will soon demote it to non-blocker.

Issue #4889 "per-WC config" is the subject of Johan's new dev@ post
"Pristines-on-demand=enabled == format 32?". We already concurred that
it's wise to decouple "pristines-on-demand mode is enabled in this WC"
from "the WC format is (at least) 32 so can support that mode".
<https://subversion.apache.org/issue/4889>. This may be considered
higher priority than fixing the remaining tests. I previously drafted a
proof-of-concept for such a config setting. I'm going to spend two or
three hours and see if I can complete an acceptable minimal version of it.

This (#4889) conceptually also relates to #4886 "config for default WC
version for checkout & upgrade"; I am not yet sure if both are
separately necessary.

Two other issues Karl and I discussed were:

* regression tests:
  -> current status is devs need to run test suite both with and without
the new '--wc-format-version=1.15' knob;
  -> this adds another knob to the several existing knobs;
  -> the resulting exponential increase in test runs is a concern but
not a new problem in itself;
  -> we should make build bots run that combination.
  -> Filed as #4898 "Pristines-on-demand: make buildbots test it"

* simplified user documentation:
  -> not sure, maybe existing is sufficient initially (just needs to be
put where users can find it?);
  -> maybe someone else will be able to rewrite into a simpler, more
digestible form?



Re: Pristines-on-demand: fix disabled tests (#4891)

2022-04-07 Thread Julian Foad
Johan, that's a great start, thanks.

Now can you run the test suite with --wc-format-version=1.15? On Unixy
systems that is done by, for example:

$ make svnserveautocheck WC_FORMAT_VERSION=1.15 ...

Jun Omae wrote:
>> FAIL:  diff_tests.py 48: svn diff --diff-cmd provides the correct arguments
> 
> I've posted patch for the failure.
> See https://lists.apache.org/thread/2o0xtqfzy9xg8wzxscj2wb641p2kyo9c

Thank you, Jun Omae. Sorry, I missed that before. Committed now in r1899645.

- Julian



Re: Pristines-on-demand: fix disabled tests (#4891)

2022-04-06 Thread Julian Foad
Today I have done a quick review/fix/postpone cycle on the 'upgrade',
'update', and 'trans' tests. Details in
https://subversion.apache.org/issue/4891 .

- Julian



Re: Pristines-on-demand: fix disabled tests (#4891)

2022-04-06 Thread Julian Foad
Johan Corveleyn wrote:
>> A few of these are Windows-specific. I can't very well investigate those
>> myself. Who could volunteer to look at those? They are:
>> 
>> externals_tests.py ... ... ...:
>>   update_modify_file_external(),
>>   remap_file_external_with_prop_del(),
>>   file_external_recorded_info():
>> existing issue (Windows only)
>> 
>> These tests are commented with "# Existing issue: `src_stream` not
>> closed in externals.c:apply_textdelta()"
>> 
>> Does that mean it's a problem already on trunk? Or a hidden problem? If
>> someone could please take a look, that would be great. If not, could
>> anyone volunteer to test a possible fix if one of us unixers made a guess?
> 
> I have no idea what those tests are about, but I can probably help
> test things on Windows.
> I'll try to get the branch built on my Windows machine tonight, and
> then I can help try out various things.

Great! Thanks.

I found also this one is a Windows-only test:

update_tests.py 57 skip_access_denied(): access denied paths should
be skipped

(In principle I don't see why we shouldn't test a similar access-denied
condition on unix-like systems, though the error code would be
different. Maybe the answer is the semantics would be so different it
would fail at a different place in the code path and so be a different
test case. I don't plan to pursue this.)

I think the current state is this test will report XFAIL/WIMP with
pristines-on-demand enabled (currently meaning WC format 32) and XPASS 
otherwise.

This test was introduced in r1143071 with the log message "Make svn
update handle some access denied scenarios with a proper skip. This
makes it less likely that the database will be closed after updating with
working queue items left."

I would guess that gracefully handling the case where the working file
on disk has this particular kind of OS lock isn't of primary concern.

- Julian



Re: Pristines-on-demand=enabled == format 32?

2022-04-06 Thread Julian Foad
Johan Corveleyn wrote:
> I think this was asked several times before, but I can't find the
> thread: is the pristines-on-demand behavior still unconditionally tied
> to format 32? Or is it that format 32 makes it _possible_ to enable
> pristines-on-demand?

Currently it's tied to f32, but it's pretty clear it needs to be
uncoupled. The issue is:

https://subversion.apache.org/issue/4889 "Pristines-on-demand: per-WC config"

In principle we could address this later, only when a further new
version is released with a further new format and a further new feature
that users need to have available without enabling pristines-on-demand;
but it seems more responsible to uncouple it before we release it.

I think that means #4889 is the other blocker issue. (Does anyone see
any way around it?)

- Julian



Pristines-on-demand: fix disabled tests (#4891)

2022-04-06 Thread Julian Foad
One of the main things blocking pristines-on-demand now is a few edge
case test failures.

It would be great if we could get some further eyes on these.

On the branch, these failures have been converted to XFAIL by adding
'@Wimp' decorators.

Some of them seem to fail even when run against the old WC format,
presumably due to changes that had to be made in libsvn_wc. I suppose
these are the most important to fix, to avoid regressions in existing
behaviour, whereas edge cases in the new pristines-on-demand mode can be
prioritised lower.

One ticket summarises all these test failures (except the auth-denied
case which is #4888):

  https://subversion.apache.org/issue/4891

Could anyone help review these test problems to see which might be
blockers and need fixing?


=== Windows-specific issues

A few of these are Windows-specific. I can't very well investigate those
myself. Who could volunteer to look at those? They are:

externals_tests.py ... ... ...:
  update_modify_file_external(),
  remap_file_external_with_prop_del(),
  file_external_recorded_info():
existing issue (Windows only)

These tests are commented with "# Existing issue: `src_stream` not
closed in externals.c:apply_textdelta()"

Does that mean it's a problem already on trunk? Or a hidden problem? If
someone could please take a look, that would be great. If not, could
anyone volunteer to test a possible fix if one of us unixers made a guess?


=== The other issues

basic_tests.py 8 basic_commit_corruption
basic_tests.py 9 basic_update_corruption
revert_tests.py 2 revert_reexpand_keyword
trans_tests.py 1 keywords_from_birth
trans_tests.py 3 eol_change_is_text_mod
update_tests.py 57 skip_access_denied
upgrade_tests.py 16 replaced_files
upgrade_tests.py 7 basic_upgrade_1_0

(I obtained the list by diffing the tests on trunk and the branch,
looking for addition of "@Wimp" and similar on the branch tests.)

Daniel, you already investigated and commented on some of them. Might
you have some time to look further? If so, may I propose you start at
the beginning of this list, as you have already added insightful
comments/tests/investigations about some of those.

I will now start at the end of that list and work backwards. I will
spend just a short time on each working out whether it's an easy fix,
fixing if so, and postponing if not. So that we meet somewhere in the
middle and don't overlap, I will post each time I work through a test or
small group of similar tests (such as the two 'trans' tests).


=== Running a test case

Quick reminder on running a test case with a specific WC format:

$ (SRC_DIR=...; cd obj-dir/subversion/tests/cmdline/ && \
python3 $SRC_DIR/subversion/tests/cmdline/basic_tests.py 8 \
--bin=$SRC_DIR/bin \
--config-file=$SRC_DIR/subversion/tests/tests.conf \
--wc-format-version=1.15 )

Remember to test also with --wc-format-version=1.8.

Thanks,

- Julian




Re: Pristines-on-demand: authz denied during textbase sync (#4888)

2022-04-06 Thread Julian Foad
Evgeny Kotkov wrote:
>> - revert the patch I applied, as it's papering over the problem in an
>> incomplete way and so possibly causes more confusion than it fixes.
>> 
>> - leave this issue open and come back to it later; it's an edge case not
>> part of common work flows.
> 
> +1.

Thanks! Reverted that patch in r1899616.

- Julian



Re: Pristines-on-demand: authz denied during textbase sync (#4888)

2022-04-06 Thread Julian Foad
> Filed as issue #4888, https://subversion.apache.org/issue/4888

I have just been looking back over this issue. Clearly there is more to
it than a quick fix. Summary, based on reviewing the email thread:

- FAIL: authz_tests.py 31 remove_access_after_commit

- Patched in : in "text base sync"
phase, ignore auth error while fetching any text base; continue with
trying to fetch the rest.

- While that change fixes that particular test case, it also seems to
have regressed the failure mode of «svn cat iota@BASE» when iota is
locally-modified and has no read access. On trunk that command displayed
the base text; on this branch that command now errors out.

- We started thinking about what other failure modes could now occur
because of failures (including unauthorised, redirect, and others) at
the hydrate phase. This is somewhat open ended; we don't have a simple
answer to how this all should be taken care of consistently.

I suggest:

- revert the patch I applied, as it's papering over the problem in an
incomplete way and so possibly causes more confusion than it fixes.

- leave this issue open and come back to it later; it's an edge case not
part of common work flows.



Re: A two-part vision for Subversion and large binary objects.

2022-04-05 Thread Julian Foad
A status review from my P.O.V.


Pristines (#525):

* issues filed (potential blockers):

  - #4888 authz denied during textbase sync
(an edge case issue, not sure if it's a blocker)
  - #4889 per-WC config
(wanted)
  - #4891 fix disabled tests
(a few different edge cases; much of the analysis is posted in the issue)

* next milestone: merge to trunk

  - Are any of the issues blockers for merge to trunk? I would suggest not.


Getting multi-wc-format ready for release (#4883):

* issues filed (potential blockers):

  - #4885 WC upgraded and not-upgraded notifications
(still open for some nice-to-haves, but probably done enough for MVP)
  - #4886 config for default WC version for checkout & upgrade
()
  - #4887 clarify/unify option names for compatible-version
(perhaps change '--compatible-version' to '--wc-compatible-version'
or '--min-compatible-client')

* not filed:

  - API review; thread: "multi-wc-format review"
(state is APIs are mostly private and a bit messy; not clear what,
if anything, we would want to change)


Release notes are drafted:




Testing:

Some testing by devs has been done of both multi-wc-format and
pristines-on-demand. It seems to be generally in good shape; no glaring
issues found.

Next steps I suggest:

  - propose merge to trunk
  - review the issues mentioned: fix or decide to postpone each

- Julian



Re: Should tarballs contain __pycache__ dirs?

2022-04-04 Thread Julian Foad
Yasuhito FUTATSUKI wrote:
>> ./build/__pycache__
>> ./build/generator/__pycache__
>> ./build/generator/swig/__pycache__
>> 
>> Are they supposed to be present in the tarball?
> 
> I don't think they are supposed.

Agreed.

> Those directries should be cleaned in fast-clean and clean-swig-py
> target in Makefile.in, like "*.pyc" files.
> 
> Here is a patch against trunk, but can't be applied against 1.14.x.

I haven't tested the patch but +1 in principle.

- Julian



Re: Website prep work and questions (mainly about the 1.15 release notes)

2022-04-04 Thread Julian Foad
Mark Phippard wrote:
> Julian Foad wrote:
>> Me too. It appears I need to update my configured keyserver. Then
>> maybe fetch keys and then maybe the checking will work. That's based
>> on, so far, finding that checking existing keys fails due to
>> unreachable key server, and then reading [...]

OK, unpicking what I was doing: I started from a script I wrote years
ago that goes through the steps of fetching, testing, signing etc. It
checks existing sigs before signing my sig. It was erroring out at the
checking your sig (which was the only sig in the releases' *.asc files,
at that time):

$ tools/dist/release.py --target=. check-sigs 1.10.8
INFO:root:Checking 1 sig(s) in ./subversion-1.10.8.tar.bz2.asc
BAD SIGNATURE: Key 1 in ./subversion-1.10.8.tar.bz2.asc
  key id: C4416167349A3BCB
ERROR!

So I first tried refreshing my local GPG keyring, and hit the obsolete
keyserver problem I mentioned. I fixed that by changing the keyserver
entry in my ~/.gnupg/gpg.conf to say "keyserver
hkp://keyserver.ubuntu.com". Then "gpg --refresh-keys" succesfully
fetched updates including your new key.

Then the checking got further but still errored out:

$ tools/dist/release.py --target=. check-sigs 1.10.8
INFO:root:Checking 1 sig(s) in ./subversion-1.10.8.tar.bz2.asc
INFO:root:Checking 1 sig(s) in ./subversion-1.10.8.tar.gz.asc
INFO:root:Checking 1 sig(s) in ./subversion-1.10.8.zip.asc
Traceback (most recent call last):
  File "/home/julianfoad/src/subversion-c/tools/dist/release.py", line
1916, in 
main()
  File "/home/julianfoad/src/subversion-c/tools/dist/release.py", line
1912, in main
args.func(args)
  File "/home/julianfoad/src/subversion-c/tools/dist/release.py", line
1458, in check_sigs
output = get_siginfo(args)
  File "/home/julianfoad/src/subversion-c/tools/dist/release.py", line
1420, in get_siginfo
formatter = PUBLIC_KEY_ALGORITHMS[keytype]
KeyError: 22

I took a quick look at the source but couldn't quickly figure it out. I
see now (quoted below) that your new key is a newer EC type; maybe that
is why. But keytype 22 isn't listed in the table referenced from a
source code comment and I don't know where to turn next. 

Then I realised that checking existing keys need not block me from
adding my own, and so I proceeded with that.

> [...]
> 1. The KEYS file is from the script that was shared.
> 2. I had to create a new GPG key. I noticed it gave me one of the
> newer elliptic curve keys. Maybe not all versions of OpenPGP can
> handle these?
> 3. I uploaded it to the MIT keyserver as per something I read in the
> ASF committer docs ...
> Actually looking at history I did this:  gpg --send-key
> EC25FCC105618D04ADB43429C4416167349A3BCB
> 4. I updated my fingerprint in ASF LDAP
> 
> Since I just created this key a couple weeks ago if it is better that
> I generate a new key, re-sign the release and upload new signatures
> just let me know what to do.

At this point I don't have reason to suspect your key is bad, more
likely the scripting and/or my setup needs some update. But I don't
expect to spend more effort on this unless it becomes a blocker for something.

Thanks,
- Julian



Re: Subversion 1.14.2 up for testing/signing

2022-04-03 Thread Julian Foad
Passed and signed.

Test environment: Unix (Ubuntu 21.10)

Tested:
  (serf, svn, local) x (fsfs)
  bindings: javahl, swig-pl, swig-rb

Not tested:
  bdb, swig-py, ctypes-python

Dependency versions (Ubuntu distribution-supplied packages):
  SQLite (from source at: /usr)
  libapr1 1.7.0-6ubuntu1
  libaprutil1 1.6.1-5ubuntu2
  zlib1g 1:1.2.11.dfsg-2ubuntu7.1
  zlib1g 1:1.2.11.dfsg-2ubuntu7
  openssl 1.1.1l-1ubuntu1.2
  openssl 1.1.1l-1ubuntu1
  apache2 2.4.48-3.1ubuntu3.3
  apache2 2.4.48-3.1ubuntu3
  libdb-dev 1:5.3.21~exp1ubuntu3
  perl 5.32.1-3ubuntu3
  ruby 1:2.7+2build1
  junit 3.8.2-9
  junit4 4.13.1-2
  junit4-doc 4.13.1-2
  junit-doc 3.8.2-9
  junit5 5.3.2-4
  junitparser 2.0.0-1
  swig 4.0.2-1ubuntu1
  libserf-dev 1.3.9-10

- Julian



Re: Subversion 1.10.8 up for testing.signing

2022-04-03 Thread Julian Foad
Passed and signed.

Test environment: Unix (Ubuntu 21.10)

Tested:
  (serf, svn, local) x (fsfs)
  bindings: javahl, swig-pl, swig-rb

Not tested:
  bdb, swig-py, ctypes-python

Dependency versions (Ubuntu distribution-supplied packages):
  SQLite (from source at: /usr)
  libapr1 1.7.0-6ubuntu1
  libaprutil1 1.6.1-5ubuntu2
  zlib1g 1:1.2.11.dfsg-2ubuntu7.1
  zlib1g 1:1.2.11.dfsg-2ubuntu7
  openssl 1.1.1l-1ubuntu1.2
  openssl 1.1.1l-1ubuntu1
  apache2 2.4.48-3.1ubuntu3.3
  apache2 2.4.48-3.1ubuntu3
  libdb-dev 1:5.3.21~exp1ubuntu3
  perl 5.32.1-3ubuntu3
  ruby 1:2.7+2build1
  junit 3.8.2-9
  junit4 4.13.1-2
  junit4-doc 4.13.1-2
  junit-doc 3.8.2-9
  junit5 5.3.2-4
  junitparser 2.0.0-1
  swig 4.0.2-1ubuntu1
  libserf-dev 1.3.9-10

- Julian



Re: Website prep work and questions (mainly about the 1.15 release notes)

2022-04-03 Thread Julian Foad
> I'm in the middle of the
> process of testing, however I have some trouble with the gpg keys [...]

Me too. It appears I need to update my configured keyserver. Then maybe fetch 
keys and then maybe the checking will work. That's based on, so far, finding 
that checking existing keys fails due to unreachable key server, and then 
reading 

- Julian


Re: Pristines-on-demand: printing progress notifications

2022-03-30 Thread Julian Foad
Karl Fogel wrote:
> I think printing these messages to stderr makes the most sense. 
> There are plenty of programs out there that parse the stdout of 
> 'svn'; we don't want to interfere with them.
> 
> As you point out, it's especially important for 'svn diff' and 
> 'svn cat' that stdout remain untainted.  Therefore, we can either 
> always print these messages to stderr (across all commands), or 
> not print them for 'svn diff' and 'svn cat' (but that would be an 
> odd inconsistency).
> 
>> Anybody want to recommend what we should do for 'cat' and 'diff'?
> 
> As per above: I think we should print the messages to stderr for 
> everything, including those two.

Printing progress notifications for data-output commands (diff, cat) to
stderr does however invite bikeshedding. Currently in our test suite we
assume any stderr output (from diff, cat) should be flagged as a test
failure. We can change that, but it indicates that some users may
consider it a failure too. We would need to agree and decide whether
that's going to be unconditional or if the user needs to be able to turn
it off for convenience and for backward compatibility.

Because this could be dragged out I'm filing it as a lower priority for
now. We can get back to it. (If someone feels able to resolve it, great.)

- Julian



Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
Finished and committed: https://svn.apache.org/r1899173 .

This looks and feels much better to me now.

Example:

$ svn revert -R subversion/
Fetching text bases ..done
Reverted 'subversion/svn/cat-cmd.c'
Reverted 'subversion/svn/diff-cmd.c'
Reverted [...]

- Julian



Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
Updated and cleaned-up patch 'hydrating-notifications-2.patch' attached,
for interest. Still TODO: update test expectations.

hydrating-notifications-2.patch
Description: Binary data


Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
Daniel Sahlberg wrote:
> How does svn cat handle any other informative or warning messages?

I don't think 'svn cat' prints any other informative or warning messages.

It occurs to me now that anything on stderr is generally assumed to
indicate an error, in tests in the test suite. We need to update test
expectations anyway (about 30 tests fail when I add the notification to
the other commands); this is just an additional point.


Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
For 'svn diff' especially, if we don't print the notifications, then we
miss out on informing the user during one of the times when it could be
particularly valuable to them. (They are waiting for diff output, which
previously in svn used to come quickly.)

It would be ugly to print these notifications in the stdout stream.
Diff output format rules do allow it to contain new informative lines
outside the diff hunks, but this kind of notification would not be
useful or nice for consumers of the diff.

Print progress notifications to stderr? That is what some other
command-line tools do. Some even have command-line flags to control
where it is sent. One precedent we have is warnings, e.g. in "svn
proplist nonexistent-target valid-target" a warning about
'nonexistent-target' goes to stderr while the primary output for
'valid-target' goes to stdout. I am not sure how I feel about printing
them to stderr. I could be persuaded either way (print them to stderr or
not at all).

Anybody want to recommend what we should do for 'cat' and 'diff'?

For the time being I am going to make 'svn' suppress hydrating
notifications for 'cat' and 'diff'. We will at least get the benefit for
the other subcommands.

This is only a concern for command-line clients. GUIs have other ways to
separate the progress notifications from the primary output. Therefore
this is all a question of what to do in the client layer, the 'svn'
executable; the library will continue to call the notification call-back
if the client supplies one.

- Julian



Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
I made a mistake in:

> 'svn cat' already suppresses these notifications.

CORRECTION:

'svn cat' currently prints these notifications.
'svn diff' currently prints these notifications.
--> TODO: stop printing them here.

It seems the only change needed here is to make both 'svn diff' AND 'svn
cat' suppress those notifications.



Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
Daniel Sahlberg wrote:
>> I'll put it on my todo list, but I can't promise when I find time to
>> to that.

I only meant to ask you to clarify what you meant: whether you are
reporting for TSVN 1.14 or a TSVN 1.15-dev build (based on the
pristines-on-demand branch or otherwise; I don't know if anyone has made
any such builds); and whether you meant it was reporting bytes-progress
specifically on hydrating or just on *some* data transfers.

* * * *

About '--quiet' and when to print these notifications (in the
command-line client 'svn').

Commands whose primary purpose is to output data probably should not
display the notifications.

(If they accept flags like '--quiet' or '--verbose', those are for
adjusting their primary information output, not progress notifications.
We historically do not have any debug-level verbosity controls on
commands like 'diff' and 'cat'. I don't think we need to add such
controls before or as part of adding these notifications to the other commands.)

'svn cat' already suppresses these notifications.
'svn diff' currently prints these notifications.
  --> TODO: stop printing them here.

(Those are the only data-output commands that may hydrate.)

The other commands probably should print the notifications by default
and suppress them when '--quiet' is added.

'svn revert --quiet' already suppresses those notifications.
'svn co/up/sw --quiet' already suppresses those notifications.
'svn resolve[d] --quiet' already suppresses those notifications.
Conflict resolver in 'svn merge/up/sw --quiet' already suppresses those 
notifications.

It seems the only change needed here is to make 'svn diff' suppress
those notifications.

- Julian



Re: Pristines-on-demand: printing progress notifications

2022-03-24 Thread Julian Foad
Marl Phippard wrote:
> Yes, and I agree --quiet should suppress. [...TSVN] hooked into our
> notifications to be providing an update on bytes transferred. [...] I
> just assumed they would work ... maybe they still do?

Daniel Sahlberg wrote:
>> Yes, I like this very much. The more feedback [...] the less
>> risk/chance of users believing something is wrong. --quiet also seems
>> like a good idea, then powerusers can limit the amount of feedback.

OK, seems I should proceed. Thanks for the comments.

>> I'm primarily using TortoiseSVN and I can confirm that it is
>> displaying the number of bytes trasfered, but I havn't digged through
>> the code to find where these notifications come from.

Daniel, can you confirm if you mean you are using a 1.14-based TSVN, on
a 1.15 working copy, and it is showing bytes-transferred feedback during
the hydrating? I had assumed it would need to be rebuilt and modified to
do that. If that part Just Works already, that's great news.

>> I would want to see the "hydrating" message also in TSVN, but I
>> assume it would not be too difficult to hook into these notifications
>> (if they are indeed a different kind).

Certainly TSVN would need a small update to make it display the new
notifications (hydrating start, hydrating file, hydrating end).

- Julian



Re: Pristines-on-demand: printing progress notifications

2022-03-23 Thread Julian Foad
The notifications are printed for any command. Let's please pretend I
didn't use 'update' as an example. I don't want us to mix up this
discussion with the discussion about how 'update' shouldn't hydrate.
Let's pretend I used 'diff' or 'revert' as the example, like this:

[[[
$ svn revert -R 1991/*.pdf 1994/
DBG: textbase.c:  75: ### open a new session for hydrating
  Hydrating text bases .
  Hydrated 1 text bases
Reverted '1991/one.pdf'
DBG: textbase.c:  75: ### open a new session for hydrating
  Hydrating text bases .
  Hydrated 1 text bases
Reverted '1991/two.pdf'
DBG: textbase.c:  75: ### open a new session for hydrating
  Hydrating text bases 
  Hydrated 4 text bases
Reverted '1994/a.pdf'
Reverted '1994/b.pdf'
Reverted '1994/c.pdf'
Reverted '1994/d.pdf'
]]]

Also, in the context of this patch I would remove the 'DBG' debug
prints. We can discuss optimising the opening of RA sessions in another thread.

- Julian



Pristines-on-demand: printing progress notifications

2022-03-23 Thread Julian Foad
I thought it would be useful to let users know why Subversion is pausing
for a long time where previously it would not have done. I think if I
were that user, I would want to know. So I tried the attached patch,
printing notifications when it is fetching text-bases.

The result looks like this.

[[[
$ command svn up 1991/*.pdf 1994/
Updating '1991/one.pdf':
DBG: textbase.c:  75: ### open a new session for hydrating
  Hydrating text bases .
  Hydrated 1 text bases
At revision 2231.
Updating '1991/two.pdf':
DBG: textbase.c:  75: ### open a new session for hydrating
  Hydrating text bases .
  Hydrated 1 text bases
At revision 2231.
Updating '1994':
DBG: textbase.c:  75: ### open a new session for hydrating
  Hydrating text bases 
  Hydrated 4 text bases
At revision 2231.
Summary of updates: [...]
]]]

The "Hydrated N text bases" are redundant: N equals the number of dots
printed in the previous line. This is just for demonstration. So are the
indentations and the DBG lines.

I thought maybe we would like to show this detail by default, and
suppress it when '--quiet' is passed. (Not implemented in this demo patch.)

I was also mildly surprised to see that the fetches are not necessarily
all grouped together at the start of a high-level operation. Clearly
it's doing the 'hydrate' once per explicit target. Actually I remember
seeing this in earlier review, then forgetting as it didn't seem
important. I don't know if we care. I suppose it is unavoidable that a
client's operations might sometimes be grouped like this. Even if the
client library API allows the client to pass all targets in one call, a
client in general may choose to call the library function once per target.

If I were the user, and particularly if I were in a use case where just
one huge file is involved, then I would also like to have feedback on
the (percentage) feedback through the file fetch. It doesn't look like
we have a good way to do that at the moment, but please correct me if
I'm wrong.

Do we like this?

If so, I can clean up the patch a bit and commit it.

- Julian


hydrating-notifications-1.patch
Description: Binary data


Re: multi-wc-format: upgrading externals

2022-03-23 Thread Julian Foad
Daniel Shahaf wrote:
> [...] Isn't this orthogonal to the
> multi-wc-format work?  To date it has always been possible to upgrade an
> external without upgrading its parent [...]  The
> fact that the client doing the upgrade has multi-wc-format support
> doesn't affect this logic.
> 
> This would argue towards leaving SVN-4890 open, but making it not block
> SVN-4883.

Good call. I changed this issue to not be a blocker for SVN-4883
(multi-wc-format).


Re: multi-wc-format: svn_wc__format_from_version()

2022-03-17 Thread Julian Foad
Daniel Shahaf wrote:
> It doesn't seem to be an old function; the docstring says (in a part I 
> snipped) it's new in 1.15.  However, I don't think that changes the answer.  
> Done in r1899004.

Oh, just older than my involvement then! Thanks.


- Julian


Re: multi-wc-format: svn_wc__format_from_version()

2022-03-17 Thread Julian Foad
That's an old function. "Characteristic" previously meant the only format 
supported by a given client version. We should change the word now. What should 
the function return now? The newest, I think: its callers are upgrade and 
checkout; essentially it is used to implement the --bikeshed=1.15 option.
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-17 Thread Julian Foad
Johan Corveleyn wrote:
>It's not specific to 'svn update' per se, but it's logical that it
>leads to this discussion, because it is a (commonly used) case where
>the pristine is not actually needed for the operation (if there is no
>actual incoming update to the concerned file). 'svn diff' and 'svn
>revert' cannot do their work without the pristine, but 'update without
>an actual incoming edit'?
>
>And even with an 'incoming edit on update (on top of local mod)' it
>might in theory be possible to delay the download of the full pristine
>until after conflict-resolution decision (but I imagine that's even
>more difficult to untangle).
>
>Also, why should 'svn update' be in the business of (silently)
>restoring "the branch's invariant" (even when it does not need the
>file), and not any other operation (like 'svn status -u' for example)?

Thanks for adding all this rationale. +1 to it all. 100% agreed.


- Julian


Re: multi-wc-format review

2022-03-17 Thread Julian Foad
Daniel Shahaf wrote:
> + The upgraded working copy will be compatible with Subversion 1.8 and
> + newer (this default may change ...

Sure, +1, a bit clearer.

Also see Nathan's option-naming proposal at the end of this message.

>> Which WC format did our hypothetical user want? (Rhetorical.) The
>> current implementation gives them the highest format compatible with the
>> requested version. But contrast that to our planned behaviour of 1.15
>> which is to select the older format (31) by default. To me, that is
>> already showing the cracks in this approach.
>  
> In the described scenario, the user wanted any format that 1.15 can
> write to.  They don't care which one, and accordingly we can leave it
> unspecified (an implementation detail) which would be selected.  This
> user doesn't constrain us.
>  
> Another user may expect --bikeshed=1.15 to select the _newest_ format
> 1.15 can write to.  This user is certain to exist; they're the reason
> we're writing this feature.
>  
> Yet another user may expect --bikeshed=1.15 to select the _oldest_
> format 1.15 can write to, you say in your next paragraph.  However,
> I don't see why any user would expect that. [...]

Yes, all fair. I think I now can accept that the current meaning
(highest format supported by the specified client version) is OK.

>> Consider [...] --wc-format=$(/opt/bar/bin/svn --version 
>> --show-item=wc-format-default)»
>  
> Clearer, yes, but does this serve a use-case?  Is there a user who would
> expect this to be possible?  As opposed to just "give me something 1.15
> can read and write"?

Maybe there is no need.

> Format numbers are sequential.  When upgrading from f30 to f40, there's
> no way to skip f35.  If we wanted that, we'd need some sort of
> capability-like mechanism.  Is that perhaps what you have in mind?  That
> a user might want to upgrade to 1.16 but not enable pristines-on-demand?
> If so, we'll need a way to enable/disable pristines-on-demand that isn't
> "format >= 32", as discussed previously.

I think we will need that, yes.

>> Potentially the user can select from "-default", "-min", "-max" variants
>> of that option.
>  
> KISS. [...]

Agreed; and I didn't mean we should expose that, just to use that idea
to help us think about the UI design.

> I suppose it's theoretically possible that there won't be a CLI
> incantation that will create f33; it will only be possible to create f32
> or f34.  Is this a problem?

No problem. (If, say, a hypothetical 1.16 would add support for two new
formats (f33,f34), I can't imagine any reason why we'd ever do that and
also need to provide users the ability to specify the non-highest one
(f33 in this example) directly.

>> My point is, using the running software version as a proxy for a WC
>> format introduces this ambiguity: [...]
>> This is why I think we should do at least one of:
>>  
>> - require the exact first-introduced version (1.8 or 1.15)
>  
> I still don't like the idea of requiring users to figure out somehow
> they should pass 1.8 when they want compatibility with 1.9.  That's
> a problem we should solve for them.

Maybe. I can see it both ways. Maybe best to allow the flexibility in
input ("=1.9") and make clear in the feedback (both immediate response
from the command, and "svn info" kind of feedback) that the format
chosen is compatible with "1.8".
  
>> - rename the option to use a less ambiguous language, to something like
>> '--wc-format-version=1.8' (meaning the version in which this format was
>> introduced) or '--wc-format=31'
>  
> Expecting a format number like this is basically asking the user to
> hardcode a magic number in their code, since there's no other way for
> them to learn the value "31".

OK.

[...]
>> > Why should we move any of that to include/private/? [except]
>> > SVN_WC__SUPPORTED_VERSION and SVN_WC__VERSION [...]
>>  
>> They are all a closely related family. The minimum format numbers for
>> old (no longer supported) features don't need to be used outside
>> libsvn_wc upgrade code, indeed. But the minimum format numbers for new
>> features that are within the range of supported formats DO from now on
>> need to be known by libsvn_client. A new one of them will be introduced
>> with format 32:
>>  
>>   #define SVN_WC__PRISTINES_ON_DEMAND_VERSION 32
>>  
>> We could split up the list... or keep it all together.
>  
> If it needs to be known by libsvn_client, then it should be in
> include/private/… unless there is some reason it should be public?

Indeed. I think "include/private" is right for now. Clients linking to
libsvn_client will also need to know something about formats... I'm not
sure whether to make these WC APIs public right away, so that any client
developers can get on with using them. If not, if we would be expected
to add alternative ways to access the info through libsvn_client public
APIs (that hide WC format number details and expose the info another way
(equivalent to the --bikeshed=1.15 UI and/or feature 

Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
>Also, unrelated: have we verified that all the temporary files we create
>are created in a crash-safe way?  I.e., that if libsvn_wc is SIGKILL'd
>partway through hydrating something, the something will be cleaned up by
>libsvn_wc at some point in the future?

I haven't reviewed for that. Could you perhaps record it somewhere more 
find-able? Not sure if it pertains to this issue thread or the whole i525.

- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
>> The request is to break the original design's invariant for this case.
>
> By only hydrating files that have been updated repository-side.  How
> will small, modified files that _haven't_ been remotely modified get
> hydrated, then?  The logic is the same for small and large files, IIUC.

They will get hydrated by any operation that needs them to be, such as diff and 
revert.

> Also, why is this specific to «svn update»? 

It's not specific to update. Update is a particular case that Karl cares about 
so I looked at doing "update" first. Implementing this approach in one 
subcommand at a time could be considered releasable incremental steps, because 
each one is a further optimisation.

> It seems to apply equally
> well to «svn diff» without further arguments, since the "large" files
> are presumed to be undiffable, 

That's not presumed. It's a rule of thumb that large files are often 
undiffable, but as yet there has been no plan to omit them from a requested 
diff by default. That could be a future sub-feature.

> If the issue does apply not only to 'update' but also to 'diff', that
> suggests we should look for a solution that applies to both of them
> (e.g., exclude "large" files from being recursed into by default, or
> make it so "large" files _never_ get hydrated).

That is a possible alternative to consider. Perhaps even a good one worth 
filing and discussing further.


- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
>Julian Foad wrote:
>> It might be *absolutely fine* for the real life users [...]
>
>So what are you saying?  That we should stop doing design discussions
>and go talk to users?  [...]

Sorry if that came across far too harsh. Looking back I see I phrased it as 
"What if...?" and your answer could be construed as a literal description of 
some technical consequences of such a work-around. What I meant was, what would 
those users feel about modifying their work flow with such a work-around? We 
don't know if any of those technical consequences would matter to them.

We're free to continue design discussions but I've limited time and need to 
focus. To me it appears we've moved far enough along this path of "some of our 
users want to do X" leading to "let's see how far we can implement an 
alternative" and now "let's consider the user's work-around options, and now 
"but the work-around has these consequences; mightn't that be a problem?". It 
seems to me we now know what are the two design directions, the original which 
is sub-optimal but near ready to use, and the alternative, now begun on its own 
"-issue4892" branch. I want to refrain from further speculation about how 
willing such a user would be to use the original design with work-arounds, and 
rather ask them first.


- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
> [...]I suspect I'm still missing something.

I suggest you re-read the issue 4892 use case: 
https://svn.apache.org/repos/asf/subversion/branches/pristines-on-demand-issue4892/notes/i525/i525-use-case-4892-minimal-update.txt

The request is to break the original design's invariant for this case.
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
> [...], ourselves.  Is HEAD of the branch good enough that devs
>with use-cases can start to try it in their real use-case wc's?  It
>won't be possible to downgrade f32 to f31, but [...]

I think so. Anyone trying it should take a quick look through the known bugs 
filed as dependent on #525, and be aware that with any WC implementation bugs 
there's the possibility of getting the WC metadata into a stuck or corrupt 
state. That said, it's good enough for testing.

>I'll also mention asciinema.  It's basically script(1) into a video
>hosted online.  It might be instructive for us to watch an asciinema
>session of someone trying this branch for the first time.  It's about as
>near as we can get to standing behind their shoulder, without actually
>sharing a machine with them and watching a shared tmux(1) session.

Good idea. Anyone willing to try it?

- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
>> Stick with the idea, for now, that we do need to handle that "restore"
>> part of update.
>
>Can we deprecate it?

People already argued for keeping it. No need to spend more time discussing 
that now, as I pointed out the effort required to make it work this new way 
(fetch at point of use, for want of a better term) doesn't look large.

>> [...] next a similar pattern applies to the "normal" part of the
>> update (everything it does after "restore"). Obviously we need the
>> normal part of update
>
>Yes, but for the "deltas" part of update we already mostly DTRT, don't we?
>
>- If the file is not modified, [...]
>
>- If the file is locally modified, then by design, we need to end up
>  with a pristine for it.  Right now we'll download BASE, and then
>  [...]  What am I missing?

You're missing the case where the file is locally modified, and is in the tree 
scope of the update request, but no update is found in the repo. Currently we 
download its base before executing the business logic of update, so before we 
know that we're not going to need the base to complete this update request.


- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
>This implies the wc won't be uniform revision.  This might break user
>expectations; might [...], 
I'm not sure how your clarification helps us progress. The point is:

It might be *absolutely fine* for the real life users in their real life 
situations, and that's what we need to find out.
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-16 Thread Julian Foad
Daniel Shahaf wrote:
>Julian Foad wrote:
>> exploration was enough to show that an initial release based on the
>> original approach has possibilities of being improved, incrementally, in
>> that way, as and when resources permit.
>> 
>> In other words I am not recommending choosing one approach and
>> abandoning the other, but starting with one and postponing the other as
>> possible future improvement work.
>
>Sorry, but could you spell out what are the "one approach" and "the
>other"?  Are you proposing to release the code as it is, fetching in
>advance, and saying you're confident it can in the future be taught to
>fetch during the operation, notwithstanding kotkov@'s points about
>RA-level timeouts?
Yes; while uncertain how much effort it might require to overcome the concerns 
such as RA-level timeouts.
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-15 Thread Julian Foad
Just an addendum, perhaps a more positive portrayal of the brief
exploration of the alternative design approach: my assessment is that
exploration was enough to show that an initial release based on the
original approach has possibilities of being improved, incrementally, in
that way, as and when resources permit.

In other words I am not recommending choosing one approach and
abandoning the other, but starting with one and postponing the other as
possible future improvement work.

- Julian



Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-15 Thread Julian Foad
The patch I sent is now committed in r1898948 as a new branch,
'pristines-on-demand-on-issue4892', for easier dev/test access.


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-14 Thread Julian Foad
Johan Corveleyn wrote:
>Well, as I said, I don't have huge binaries myself [...]. That's just me 
>speculating of course [...]
Hi, Johan. I really appreciate you taking the time to write all your thoughts. 
That kind of speculation has its place and was certainly useful earlier, but 
we're now at a point where it could be counter-productive. I think we now need 
to focus on real use cases and real testing.
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-14 Thread Julian Foad
Daniel Sahlberg wrote:
>[...] I will try to build a release for myself and use it for dev work.
Thank you Daniel.

I'm wondering if I (or we) need to do more to facilitate evaluation. I'm 
thinking of things like adding some feedback to tell the user what it's doing 
("fetching missing pristines now..."), maybe at an extra verbose level during 
this evaluation phase to help users understand it; finding out if any of the 
outstanding issues need fixing in order to be able to use it productively; 
maybe getting binaries built and distributed if that helps; maybe we can supply 
more succinct user documentation than what I wrote so far?
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-14 Thread Julian Foad
Johan Corveleyn wrote:
>Speaking from the peanut gallery, [...]
>If I would be a user with several huge binaries in the repo / WC, I
>imagine I would not be happy with this proposal. The reason is that I
>have always, forever, only done "svn update the-whole-wc". Updating
>individual subdirs is micro-managing. [...]
But do you locally modify those huge binaries?
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-14 Thread Julian Foad
Mark Phippard wrote:
> [...] I just wanted to bring some user perspective [...]
Thanks, Mark. Understood.

I also want to clarify that this is my pragmatic side speaking. For anyone who 
doesn't remember me saying this before, I'll repeat that my purist side would 
love to see us do the alternative, optimal, design, and work through the 
consequent protocol issues. That seems obviously to me more "Right", but is 
only useful if we could afford to follow it through to completion, and we don't 
have a good estimate of the effort required for that.

- Julian

- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-14 Thread Julian Foad
Dear dev community, and especially Karl and Mark:

A plea to test the current design/implementation.

I wonder if we are missing some perspective.

We are worried that the current design won't be acceptable because it
has poor behaviour in a particular use case.

The use case involved running "svn update" at the root of the WC. (It
didn't explicitly say that. More precisely, it implied the update target
tree contains the huge locally modified file.)

Using this new feature necessarily requires some adjustments to user
expectations and work flow.

What if we ask the user to limit their "svn update" to target the
particular files/paths that they need to update, keeping their huge
locally modified file out of its scope? Examples:

svn update readme.txt
svn update small-docs/
# BUT NOT: svn update the-whole-wc/

Then we side-step the issue. It only fetches pristines for modified
files that are within the tree scope of the specified targets. (This is
how it works already, not a proposal.)

OK that's not optimal but it might be sufficient.

(Of course there are further concerns, such as what happens if the user
starts an update at the WC root, then cancels it as it's taking too
long: can we gracefully recover? Fine, we can look at those concerns.)

I can go ahead with further work on changing the design if required, but
I am concerned that might not be the best use of resources. Also I don't
know how to evaluate the balance of Evgeny's concerns about protocol
level complexity of the alternative design, against the concerns about
the present design. In other words pursuing that alternative seems
riskier, while accepting the known down-sides of the current design is
sub-optimal but seems less risky.

Should we first test the current design and see if we can work with it,
before going full steam ahead into changing the design?

The current design/implementation (on branch
'pristines-on-demand-on-mwf') is in a working state. There are open
issues that still need to be resolved, but it's complete enough to be
ready for this level of testing.

- Julian



Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-13 Thread Julian Foad
"Restore" isn't involved in our use cases. The only reason I mentioned the 
"restore" functionality in the first place is because my proof-of-concept patch 
deliberately leaves that broken because it's non-core functionality, that also 
will need modifying to make it work in the new way if we proceed that way. I 
don't expect it to be particularly difficult to fix; no more difficult than 
modifying the main update functionality. I wish I hadn't mentioned the 
possibility that we might choose to ignore this minor sub feature, it's just 
distracting us from discussing the main functionality.

Understanding the current issue about pristines and updates can be achieved 
while completely ignoring "restore".
- Julian


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-11 Thread Julian Foad
Mark Phippard wrote:
> Is there a way to describe it in a way that a really experienced SVN
> "user" not "developer" would understand? Set aside the pristinelsss WC
> feature. What is the scenario in previous versions of SVN where this
> behavior is kicking in?

Hi, Mark. I have been mainly aiming my writing to those taking a
detailed look and reviewing the code, while at the same time I don't
want to leave anyone here, even on the periphery of the development
community, in the dark. I'll give it one more shot in this subthread.

The current approach adds a "textbase sync" phase before every
operation, which is crudely, pessimistically, fetches pristines for all
locally modified files within the tree scope of the operation's target
paths, if they are (knowingly) absent from the pristine store. This
approach then uses invokes the old business logic of the operation, with
little modifications inside it, in particular no callback to fetch
pristines on demand after this point.

The "restore missing files" is an odd thing that "update" does outside
of its core purpose: when any versioned file in its scope is missing
from disk, it puts an unmodified copy of the file back on disk there, by
copying (and translating keywords/eol-style) from its pristine version.
It's something like an implicit revert for files that were deleted from
disk without using "svn delete". Why so, I don't know, and I don't like
it, but it's there, so let's go on to look at how it interacts with
absent pristines.

If we have pristines absent on purpose (as we do in this branch), and we
simply disable the "textbase sync" a.k.a. "hydration" that runs at the
start of an update in our current approach, then that "restore" part of
the update errors out (unable to read the pristine) when it encounters a
file missing on disk.

It's probably a rare case in most users' work flows but it's something
that exists we have to deal with one way or another, even if by
declaring it no longer supported and changing test expectations accordingly.

Stick with the idea, for now, that we do need to handle that "restore"
part of update. The alternative approach is to add a callback to the
lower level "get the pristine" function that it uses. Then it would be
able to fetch what it needs when it needs it and not fetch anything it
doesn't need. Evgeny cautions us about that alternative approach, but
*in principle* if we could get the protocol level behaviour absolutely
right, that would (I think) surely be better.

That "restore" happens, within update, before the server tells us
whether any update of that particular file is actually present on the
repository. (Perhaps it could be moved to afterwards; I haven't
investigated that possibility.)

Then, never mind whether we care about supporting that "restore" thing;
because next a similar pattern applies to the "normal" part of the
update (everything it does after "restore"). Obviously we need the
normal part of update

> [...] From my laymans point of view, we have a database in
> the WC that says what we have. I assumed we largely would be using
> that information when talking to the server about what it needs to
> send us to do an update.

Well, yes and no in the current two-phase (hydration, then operation) approach.

> So I am just not getting why the server needs
> to send us a file that WC already has.

It doesn't ever send us a file that the WC already has. The issue we're
concerned about is it sending a pristine that is (knowingly) absent from
the pristine store, but for a file whose pristine is not going to be
looked at during the current update. It might be needed by some future
operation but the current approach fetched it pre-emptively (by design)
but for this use case we would rather not do that.

> I know your answer will be the "restore situation" [...]

That's not the essential part, no; the main part of "update" is
obviously the essential part, and it has similar characteristics and
options, and isn't optional.

- Julian



Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-11 Thread Julian Foad



On Mar 11 2022, at 5:07 pm, Mark Phippard  wrote:

> On Fri, Mar 11, 2022 at 10:24 AM Evgeny Kotkov
>  wrote:
>> 
>> Julian Foad  writes:
>> 
>> > Conclusions:
>> > 
>> >
>> > It is certainly possible that we could modify "update" and the other
>> > "online" operations, at least, and the previously "offline" operations
>> > too if we want, to make them fetch pristines at point-of-use in
>> this way.
>> >
>> > Such modifications are not trivial. There is the need to run additional
>> > RA requests in between the existing ones, perhaps needing an additional
>> > RA session to be established in parallel, or taking care with inserting
>> > RA requests into an existing session.
>> 
>> I think that this part has a lot of protocol constraints and hidden 
>> complexity.
>> And things could probably get even more complex for merge and diff.
>> 
>> Consider a bulk update report over HTTP, which is just a single
>> response that
>> has to be consumed in a streamy fashion.
> 
> Feel free to ignore this question because it is not going to help us
> get to a solution, but there is something about all of this I do not
> understand.
> 
> Today, in normal SVN, if I do svn update my assumption is the client
> and server negotiate a fairly efficient reply. The server only sends
> the client the data it needs to update.  [...]

Correct.

> This is where the question comes in ... why does not having the
> pristines change this? The WC still knows what files it has and what
> revisions. Isn't this what drives the process? I just do not
> understand what has changed that forces the server to send the client
> the version of a file that the client already knows it has.

My best recent explanation I can offer was in my reply about 5 messages
ago, at 10:36 UTC, with subsections "Restore is an aberration" and
"Merge". Not sure what exactly you're missing; maybe not seeing the
overall difference between the two approaches that are currently being compared?

- Julian



Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-11 Thread Julian Foad
Thank you, Evgeny. That is exactly the kind of discussion we need, and
you were able to provide far more detailed insights than I was. That
should help us decide how to proceed.

As for your thoughts about the current approach for MVP, I tend to agree
that your approach is likely to be useful for a lot of people's use
cases. Unfortunately it has turned out that Karl has one of the use
cases that doesn't match those assumptions, where it doesn't work so well.

Karl, would it make sense now for you to search and compare use cases
that concern your group, and see if cases that don't work well with the
current design (such as the particular case we are focusing on right
now, r1898846 'notes/i525/i525-use-case-4892-minimal-update.txt') are of
majority or minority concern overall?

One thing we could do is to more formally document use cases in order
for potential users to spot which ones match their cases.

One thing we could do is release (preview) versions of both approaches and
get folks to evaluate them.

We need to decide how to spend the next effort here.

- Julian



Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-11 Thread Julian Foad
With a dive into the main "update" code, I was able to make "update"
fetch pristines at the point of use, and so minimally the ones it really
needs... I think.

So far I have only got it running with "restore" functionality disabled,
and run the test suite. I get the (ten) expected fails from the lack of
"restore", and some more. But not too many more. Just one in fact:

FAIL:  externals_tests.py 68: check file external recorded info

So this is looking promising.

Patch attached.

That's probably all for today.

- Julian


i525-i4892-update-minimal-hydrate-1.patch
Description: Binary data


Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-11 Thread Julian Foad
A quick dive in the "restore" code path led me to:

  - commit a small refactoring (r1898847 on trunk) to deduplicate the
code, which should be useful if we need to do anything like adding
callbacks to it;
  - observe that if we disable "restore", 10 tests fail (4 update tests
and 6 others).

It may be that the effort of adjusting test expectations and other
fall-outs (like documentation updates) might not be smaller than the
effort of implementing the callback.

I will leave off looking further at the "restore" for now, and do some
further investigation in the main update code path.

- Julian



Re: A two-part vision for Subversion and large binary objects.

2022-03-11 Thread Julian Foad
Here is an approach that does *not* satisfy both sides of this argument:

[[[
svn propset "svn:no-pristines" "*" doc/

cat >> ~/.subversion/config <<-EOF
[auto-props]
src/**/*.exe = "svn:no-pristines = *"
EOF
]]]

and we make standard Subversion control its pristine storage based on
looking for the versioned property "svn:no-pristines".

Here is an approach that *does* satisfy both sides of this argument. It
is indirect.

First step, we introduce one level of indirection, so that client
behaviour knobs are not attached directly to the versioned data:

[[[
svn propset "danielsahlberg:no-pristines" "*" doc/

cat >> ~/.subversion/config <<-EOF
[auto-props]
src/**/*.exe = "danielsahlberg:no-pristines = *"
[working-copy]
omit-pristines-where-this-prop-is-set = "danielsahlberg:no-pristines"
EOF
]]]

and we make standard Subversion control its pristine storage in
accordance with the config option "omit-pristines-where-this-prop-is-set".

Second step, we provide server-side configuration for automating those
config settings, as follows:

[[[
svn propset --revprop -r0 \
"svn:server-dictated-config:config:auto-props:src/**/*.exe" \
"danielsahlberg:no-pristines = *"
svn propset --revprop -r0 \
"svn:server-dictated-config:config:working-copy:omit-pristines-where-this-prop-is-set"
 \
"danielsahlberg:no-pristines"
]]]

and we make standard Subversion read config options from the
repository's r0 revprops, and use those as default values for local
config options.

This is not a concrete proposal, just trying to make a clear explanation.

- Julian



Re: Issue #525/#4892: on only fetching the pristines we really need

2022-03-11 Thread Julian Foad
Thinking about next steps. It seems worth investigating how feasible is
making at least "update" hydrate pristines at point-of-use 
(optimally/minimally).

"Restore" is an aberration
--

The first thought I had after sending that long post is that the
"restore" functionality of update is, to my mind, a historical
aberration. It's not consistent with anything: it's an intentional
revert of one of the user's local modifications, and we explicitly avoid
doing anything like that except in this one case. I don't know its
history except I believe I learnt it was copied from CVS. It would not
surprise me if in fact it was originally introduced to work around some
flaw rather as an intentional user feature. (Anyone know... Karl?)

So, it would not be totally unreasonable to declare that we
intentionally omit that functionality when in pristines-on-demand mode,
at least for the first cut. There's one simplification.

"Merge"
---

First, an addendum to what I wrote under "Deltas: working file is
*unmodified*": I described this step leaving the new pristine in the
pristine store. But we don't want to leave it in the pristine store
afterwards, because by definition this is a case where the file is
unmodified. Options, with their pros and cons, are as I described under
the "restore" case: either to avoid the store at that point or to let it
be stored for a moment and have the clean-up remove it toward the end of
the update operation.

Now, is there any shortcut we can make on the 3-way merge aspect of update?

The use case described in [1] is now (r1898846) committed on the branch
as 'notes/i525/i525-use-case-4892-minimal-update.txt'. In that use case,
the update brings in changes to small files that are locally modified;
but it does not bring in any changes to the huge files.

We want to avoid fetching the pristine of the (huge) locally modified
files which receive no update; while the update is still expected to
merge changes into the (small) locally modified files. (I am saying
"huge" and "small" to help us visualise the case; in our current short
term plans we don't expect the update operation to change its behaviour
depending on file size.)

Before I speculate and write more on this, I will take a deeper swim in
the source code and see what more I can learn or hack together.

- Julian


[1] https://lists.apache.org/thread/t7y09576tz5xcqhwzqys3t0vfbdpg861 on
dev@ from Julian Foad at 2022-03-04T20:52:38Z.



Re: http URLs should be updated to https

2022-03-11 Thread Julian Foad
Julian Foad wrote:
> +1. Can you send a patch?

By the way, the reason I ask if you would be willing, rather than "just
quickly doing it" myself, is even a small "obvious" fix like this tends
to require more than it initially looks like: checking if it's already
done in head of trunk, finding other similar places, deciding if any of
them shouldn't be updated for whatever reason, running the test suite,
adjusting test results to match...


Re: http URLs should be updated to https

2022-03-11 Thread Julian Foad
+1. Can you send a patch?
- Julian


Re: A two-part vision for Subversion and large binary objects.

2022-03-11 Thread Julian Foad
Daniel Sahlberg wrote:
> I'm taking an opposite position with regards on where this should be
> administred. [...] I would prefer a multi-level approach where the
> repository (through svn:foo properties) could suggest pristine-less WC

I understand completely your case, but the solution you need is a way to
configure your client's behaviour remotely, and that is not necessarily
best done by Subversion versioned properties. Do you see the
distinction? Rather, what you need is for client configuration to be
managed centrally and obeyed by your clients. The server and clients
involved *could* be your Subversion repository server and Subversion
clients, but could alternatively be some other mechanism. You just need
some mechanism that works and is easy enough to deploy.


Issue #525/#4892: on only fetching the pristines we really need

2022-03-10 Thread Julian Foad
This is an investigation into changing the "pristines-on-demand"
approach to follow a principle that each operation would only fetch the
pristines it really needs.

I have begun a "user guide" ( notes/i525/i525-user-guide.md ), with the
aim of explaining the principle of operation of the current approach,
along with its expectations and limitations. Note well that the current
approach is based on a *different* principle from "each operation only
fetches the pristines it really needs".

As a reminder, the present design is based on a fetching paradigm that
is up-front and pessimistic: before any operation that *might* need
pristines, it ensures it fetched sufficient (but perhaps more than
necessary) pristines. After that fetching phase (see
'svn_client__textbase_sync'), it then runs the original operation code
path, assured that the operation will run correctly in its existing
form, without needing to be modified to support fetching via a deep
(point of use) callback.


Online vs offline operations


I want to draw a distinction, which may or may not help here, between
operations that were already "online" (required contacting the repo) and
those that were previously "offline" (local only).

The previously "online" operations include "update" of course, and
"switch" and "checkout --force" (both being sisters of update), and
"merge", and the forms of "diff" that compare base to repository.

Any online operation is going to connect to the repository anyway, in
its normal (previous) operation. When the current design deems that such
an operation needs to hydrate the pristines before it starts, this
"need" is more of a "uses in its current implementation". In principle
we could change its implementation to move the fetching of pristines
down the call stack to the point where it actually needs them, and so
ensure optimal fetching – in the sense of fetching only those it really
needs, and only when it really needs them.

This change would cause an increase in network traffic whenever a needed
pristine is missing; but only an increase. Because these operations are
already online, it would not cause any substantial qualitative
difference to the user experience or to the high level client software's
need to handle repository connection and authentication.

Now contrast this with the previously "offline" operations.

If we change a previously "offline" operation (local diff, revert, etc.)
to fetch only the pristines it actually needs, by pushing the fetch
callbacks down the call stack to the point of use, that would lead to a
qualitatively different user experience and high level client software
usage pattern. (Previously discussed. In short: the callback and need
for authentication, which may require user input, may come at any point
after the operation has started, where for example a GUI tool may be in
the middle of displaying a series of file diffs.) I do not know how much
of an issue that might be, but some people have expressed concern.

Perhaps a useful compromise could be:

  - for the "online" operations only, fetch at the point of use
(optimal: only fetching the pristines they actually need); and
  - retain the pessimistic up-front sync paradigm for the "offline"
operations (so avoiding the callback awkwardness for them).

That's just for consideration, not a strong recommendation.

Now, let us take a look at "update" in particular, because it came up as
a problem in a primary use case that prompted me to file issue #4892.


Why and how does "update" currently require pristines?
--

Note that update involves TWO pristines for each file: the old one
that corresponds to the old base revision before the update, and the new
one that corresponds to the new base revision after the update.

Update currently uses pristines in two distinct ways:

  - [deltas] The update code reports the needed update in terms of a
delta against the (old) base revision, on the assumption that the client
has a pristine copy of the base revision. The repository duly sends such
a delta. The WC layer then attempts to apply the delta it receives, and
at that point attempts to open and read the old pristine, in order to
apply the delta to create the new pristine.

  - [restore] The update code also looks for files that are missing on
disk (if the 'restore_files' option is passed, which it usually is), and
restores them by reading and translating their pristines. It restores
files on the reporting side (in svn_wc_crawl_revisions5), before
reporting the state of each file.

What would it take to modify "update" to fetch at point of use?
---

For the Deltas:
---

The relevant sub-case is a file with local modifications. (For an
unmodified file it can reconstruct the pristine on the fly.)

If the working file has local modifications, then after the base is
updated, there is a 3-way merge to 

Re: [jira] [Commented] (SVN-4884) multi-wc-format: user visibility of WC version (info --show-item)

2022-03-10 Thread Julian Foad
Daniel Shahaf wrote:
> Daniel Shahaf commented on SVN-4884:
> 
> Done in r1898739.

(Added the WC format and version to `svn info`'s other output modes.)

Great! Thanks, Daniel.

By the way, I see you cherry-picked a commit from trunk as the commit
before this fix. No problem, but this branch tracks trunk so we can do a
catch-up merge as often as needed.

$ svn merge ^/subversion/trunk
$ svn ci -m "On the 'pristines-on-demand-on-mwf': sync with trunk@."

- Julian



Re: multi-wc-format review

2022-03-10 Thread Julian Foad
Nathan Hartman wrote:
> Suggestion: The user provides the earliest SVN version with which they
> want compatibility, and SVN picks the latest WC format version that is
> compatible with it. [...]

And Daniel Sahlberg wrote:
>> Regarding how to handle --wc-format-version=1.9, I'm leaning towards
>> "use the newest format that was known by 1.9". That would enable a
>> use case where I check the version of "the other svn client" (maybe
>> an IDE integrated client), find that it is X.Y and then just use that
>> version number instead of having to find out when the wc format of
>> X.Y was actually introduced.

Daniel and Nathan: That's what the current implementation does. It
sounds rational by itself, but I pointed out some problems. That rule is
not immediately obvious when, for example, a naïve user sees
'--compatible-version=1.15' in a script. And (more so because) it is not
consistent with the format that svn 1.15 chooses as its default format
for creating working copies. By default it chooses 1.8 format (f31), not
1.15 format (f32). Your views on that would be good to hear.

- Julian



Re: Issue 525 pristines-on-demand: user guide

2022-03-10 Thread Julian Foad
Karl Fogel wrote:
> Might it be better to put that file on the branch, [...]

Yes, sure; I'm not really sure why I didn't.

There, r1898819, in 'notes/i525/i525-user-guide.md'.

- Julian


Re: multi-wc-format review

2022-03-09 Thread Julian Foad
On Mar 8 2022, Daniel Shahaf wrote:
>>   By default Subversion will upgrade the working copy to a version
>>   compatible with Subversion 1.8 and newer.
>  
> Are we assuming that future minor versions (1.16, 1.17, etc.) will all
> continue to be able to read/write f31?  This seems to be implied by
> the language [...]

It seems desirable at this time, now that we have the mechanism to do
so, that we would continue supporting that format for a good long while;
but of course I would not promise indefinite support.

What I was thinking is we can use language like that for 1.15, and then
in a future version if and when we drop support, we can at the same time
change the language (and if necessary the format reporting APIs) to
report more specific lists of support.

Specifically, I felt it was OK to use language like "and newer" in a
given release of the tool, without being construed as a promise about
versions newer than this version. If anyone thinks anything here could
seriously mislead, point it out and let's change it.


Daniel Shahaf wrote:
> Julian Foad wrote on Thu, Mar 03, 2022 at 10:53:13 +:
>> [...] it seems clear to me now that we need to expose
>> [WC format numbers]. [...]
>  
> Would you elaborate? [...]
> What are the awkward semnatics?  What are the inconsistencies?  What
> questions would API users be able to answer for themselves if we hand
> them format numbers, that they can't easily answer with trunk@HEAD?

Some examples below.

>> If we're going to have version numbers as the 'compatible version' UI
>> option, perhaps we should eliminate these issues by requiring to
>> specify the exact first-introduced version, MAJOR.MINOR format (with
>> no .PATCH, no -TAG).
>  
> _Prima facie_ I would -0 this, because it should be possible to do
> «/opt/foo/bin/svn upgrade --compatible-version=$(/opt/bar/bin/svn
> --version --quiet)» even when bar is v1.9.

I suppose in that example the intention is "I want it to be compatible
with version , without me having to learn what formats 
supports." The current option parsing seems to have a design intent that
reflects this usage. And it is a reasonable use case at first sight.

Suppose '/opt/v1.19/bin/svn --version' outputs:
* WC format 31, compatible with Subversion v1.8 and newer
* WC format 32, compatible with Subversion v1.15 and newer

Which WC format did our hypothetical user want? (Rhetorical.) The
current implementation gives them the highest format compatible with the
requested version. But contrast that to our planned behaviour of 1.15
which is to select the older format (31) by default. To me, that is
already showing the cracks in this approach.

Consider if we replace your example with (hypothetical UI):

«/opt/foo/bin/svn upgrade --compatible-version=$(/opt/bar/bin/svn
--version --show-item=wc-compatible-version-default)»

which I intend should select compatible-version=1.8. That is getting a
bit clearer, less ambiguous, isn't it? Then a step clearer:

«/opt/foo/bin/svn upgrade --wc-format=$(/opt/bar/bin/svn --version 
--show-item=wc-format-default)»

Potentially the user can select from "-default", "-min", "-max" variants
of that option.

(To be clear, I am not proposing we should support such a UI, just
making a point about clearer semantics of any UI that deals with formats.)

My point is, using the running software version as a proxy for a WC
format introduces this ambiguity: is it going to mean the minimum
version it supports, its default, or its highest supported format? The
language "compatible version" is not clear. One version supports N
formats, and one format is supported by M versions, in general. In
saying 'compatible version 1.15' it's not necessarily clear to the user
how the quoted software version relates to one of the formats that
version supports.

This is why I think we should do at least one of:

- require the exact first-introduced version (1.8 or 1.15)
- rename the option to use a less ambiguous language, to something like
'--wc-format-version=1.8' (meaning the version in which this format was
introduced) or '--wc-format=31'

I think those both make it clearer for the user. We need especially to
be aware of users encountering this option for the first time, and not
deeply knowing what they are doing. A user wanting a format compatible
with Subversion minor version , without considering that there may
be more than one compatible format to choose from, may be misled if we
silently pick one and don't make that clear.

>> In exposing WC format numbers in the APIs, certainly we should treat
>> them as opaque identifiers with no intrinsic meaning to their numeric
>> value. We could of course also introduce non-numeric identifiers for
>> them, for avoidance of 'magic constants' in source code.
>  
> I'm not sure I understand what you have in mind 

Re: Issue 525 pristines-on-demand: user guide

2022-03-09 Thread Julian Foad
Daniel Sahlberg wrote:
> [...] an "iff". I guess this might be the "if and only if" meaning [...]

Yes it is. OK, can change.

> [...] mismatched quotes.

Thanks. Both fixed now in my local copy.

More substantive reviews are welcome too :-)

- Julian



Issue 525 pristines-on-demand: user guide

2022-03-09 Thread Julian Foad
I have started a detailed user guide for the current design.

Attached to issue: 
Markdown rendered: 

It is initially intended to help ourselves understand exactly what the
current design is and does, especially in the light of possible changes
to this design.

Secondly of course the final version is intended to be for users. It
might currently be a bit too detailed for a final user guide, because of
initially being aimed at the dev community.

The next thing I would like to add to it is a worked example or two.

- Julian



Re: svn commit: r1898525 - in /subversion/trunk/subversion: include/svn_client.h libsvn_client/upgrade.c libsvn_wc/wc.h svn/help-cmd.c svn/info-cmd.c svn/svn.c

2022-03-09 Thread Julian Foad
> Recommend to remove RESULT_POOL too, since it is unused and the return
> type is a pointer-to-const.

You're welcome to if you feel stronger than I do but it's off the bottom
of my priority list :-)

- Julian



Re: A two-part vision for Subversion and large binary objects.

2022-03-04 Thread Julian Foad
> Mark Phippard wrote:
>> [...] For an update, I think it is unexpected and undesirable. [...]

I had a talk with Karl about this, and now I understand the concern much better.

(Karl, please correct anything I misrepresent.)

He shares the view that it would be unacceptable for 'svn update' to
fetch pristines of files that have become locally modified since the
previous fetch opportunity, but that are not actually being updated by
this update.

In his use cases a developer locally modifies some large files. The
developer also modifies some small files (such as 'readme' files
describing the large files). The developer doesn't need to diff or
revert the large files, and so chooses the checkout mode which doesn't
keep the pristines initially.

Before committing, the developer runs 'update', expecting to fetch any
remote changes to the small files (and not large files, not in this
case), expecting it to be quick, and then the developer continues work
and eventually commits.

The time taken to fetch the pristines of the large, modified files would
be long (for example, ten minutes). Taking a long time for the commit is
acceptable because the commit is the end of the work flow (and the
developer can go away or move on to something else while it proceeds).
The concern is that taking a long time at the update stage would be too 
disruptive.

It wouldn't be a problem for an operation that really needs the
pristines taking a long time. (Revert, for example.) The perception is
that update doesn't really need them. That is, while it obviously needs
in principle to fetch the new pristines of the files that need updating
to a new version from the server (or fetch a delta and so be able to
generate the pristine), it doesn't, in principle, need pristines of
files that it isn't going to update. In this use case, it isn't going to
update the large, locally modified files. And fetching their pristines
wouldn't massively benefit the commit either, because they are poorly
diffable kinds of files. So it is wasted time.

If the implementation currently requires these pristines, that would
seem to be an implementation detail and we would seek to change that.

So my task now is to investigate for any way we can eliminate or
optimise the unnecessary fetching, at least in this specific case.

Filed as https://subversion.apache.org/issue/4892 .

I will investigate this issue next week.

- Julian



Re: multi-wc-format review

2022-03-03 Thread Julian Foad
I can think of a number of further API clean-ups possible, related to
multi-WC-format support.


Commentary:

At first we were trying to keep knowledge of WC format numbers internal
to libsvn_wc. However it seems clear to me now that we need to expose
them. The rest of the system does need to know about different formats.
Compatible-version numbers are not such good identifiers because a range
of version numbers maps to each single format (even though there is one
version number where the format was first introduced). To be more
specific, our UI currently accepts
'--wc-compatible-version=1.14.0-alpha' to specify the format introduced
in 1.8; that mapping is not one-to-one, leading to slightly awkward
input and output semantics and conversion APIs, quite possibly already
having inconsistencies. If we're going to have version numbers as the
'compatible version' UI option, perhaps we should eliminate these issues
by requiring to specify the exact first-introduced version, MAJOR.MINOR
format (with no .PATCH, no -TAG).

In exposing WC format numbers in the APIs, certainly we should treat
them as opaque identifiers with no intrinsic meaning to their numeric
value. We could of course also introduce non-numeric identifiers for
them, for avoidance of 'magic constants' in source code.

So it now seems to me the right path is to expose WC format numbers, and
WC-format-related APIs, deliberately.

I am not sure how best to distribute all these APIs between libsvn_wc
and libsvn_client, and how public to make each of them. Most of them
probably ought to be in libsvn_wc because they are meaningful there and
libsvn_wc itself and small apps built on it (tests at least) may need
them without wanting to pull in libsvn_client. Of course typical clients
mainly want to deal with libsvn_client, and to know about what that
client library supports, so would initially expect to find the public
APIs they need there, but it is no hardship for them to use (public)
libsvn_wc APIs for their WC-specific needs.


Some possible API clean-ups:

* Move SVN_WC__VERSION and friends from wc.h, at least to the
inter-library scope of 'include/private/svn_wc_private.h', so they are
properly accessible to libsvn_client and the other libraries. (Avoiding
the need for "#include ../libsvn_wc/wc.h".) The format-related
definitions there are:

- version history of all the WC formats
- SVN_WC__VERSION, SVN_WC__SUPPORTED_VERSION, SVN_WC__DEFAULT_VERSION
- SVN_WC__*: minimum format for certain features
- svn_wc__version_string_from_format()
- SVN_WC__ERR_IS_NOT_CURRENT_WC()

Of these, 'SVN_WC__SUPPORTED_VERSION' and 'SVN_WC__DEFAULT_VERSION' are new.

(Also, rename 'SVN_WC__VERSION' to include the words 'maximum/latest
supported'; and rename 'SUPPORTED_VERSION' to include the word
'minimum/oldest'; and group those "SVN_WC__*: minimum format ..."
definitions with their own common prefix.)

* The definitions of oldest/default/newest supported WC format could be
more consistently named and less duplicated. Currently there are:

SVN_WC__VERSION
== svn_wc__max_supported_format()
~= svn_client_latest_wc_version()

SVN_WC__SUPPORTED_VERSION
== svn_wc__min_supported_format()
~= svn_client_oldest_wc_version()

SVN_WC__DEFAULT_VERSION
~= svn_client_default_wc_version()

svn_wc__is_supported_format()
overlaps with svn_client_get_wc_versions_supported()

Of these, all are new except 'SVN_WC__VERSION'.


I don't want to do any unnecessary churn, but maybe we should do some of
this, especially any that is in the public API name space.

Thoughts? Volunteers? :-)

- Julian



Re: svn commit: r1898524 - in /subversion/trunk/subversion: svn/help-cmd.c tests/cmdline/getopt_tests_data/svn--version--verbose_stdout tests/cmdline/getopt_tests_data/svn--version_stdout

2022-03-02 Thread Julian Foad
Julian Foad wrote:
> Daniel Sahlberg wrote:
>> Should the following lines also be removed [...]
> Oh yes. Thank you. Please do if you have a moment, otherwise I'll get
> to it some time this week.

I got to it now. r1898535.

I note that we might want to replace that with a matcher for variations
of the new style, but I saw there didn't seem to be a trivial way to do
so at the moment, and it can wait, so I didn't.

Thank you very much for reviewing these commits. Please keep it coming!

- Julian



Re: svn commit: r1898524 - in /subversion/trunk/subversion: svn/help-cmd.c tests/cmdline/getopt_tests_data/svn--version--verbose_stdout tests/cmdline/getopt_tests_data/svn--version_stdout

2022-03-02 Thread Julian Foad
Daniel Sahlberg wrote:
>Should the following lines also be removed [...]
Oh yes. Thank you. Please do if you have a moment, otherwise I'll get to it 
some time this week.
- Julian


Re: A two-part vision for Subversion and large binary objects.

2022-03-02 Thread Julian Foad
Mark Phippard wrote:
> That comment specifically talks about diff. [...] For an update, I
> think it is unexpected and undesirable. [...]

You're right, the comment I pointed to doesn't do anything to justify
why 'update' should fetch it. And I agree it would be better if it
didn't. Maybe we will be able to optimise that case. I don't know.

- Julian



Re: multi-wc-format review

2022-03-02 Thread Julian Foad
On Feb 25 2022, Daniel Shahaf wrote:
>> [...] we are now deliberately choosing compatibility as the default,
>  
> OK.  However, in this case we should document this explicitly, since the
> book explicitly documents that «svn upgrade» would upgrade to the "most
> recent metadata format supported by your version of Subversion"
> (https://svnbook.red-bean.com/nightly/en/svn.ref.svn.c.upgrade.html).
>  
> I've grepped the book for «upgrade<» and I believe the only other place
> there we need to update is
> https://svnbook.red-bean.com/nightly/en/svn.ref.svnadmin.c.upgrade.html,
> to highlight the distinction.
>  
> A release notes entry would be higher priority than updating the book.

Documented in the help text of 'checkout' and 'upgrade', in r1898527,
like this for 'upgrade' and similar for 'checkout':

[[[
  By default Subversion will upgrade the working copy to a version
  compatible with Subversion 1.8 and newer. To upgrade to a different
  version, use an option such as '--compatible-version=1.15'.
  The versions available are the same as in the 'checkout' command.
  Use 'svn --version' to see the compatible versions supported.
]]]

- Julian



Re: multi-wc-format: upgrading externals

2022-03-02 Thread Julian Foad
Daniel Shahaf wrote:
> multi-wc-format/BRANCH-README mentioned this:
>  
>> [*] New externals working copies must inherit the format from their
>>parent working copy, because [...]
>  
> Upgrading a parent working copy upgrades external wc's too.  However,
> upgrading an external succeeds.  Judging by the quoted remark, should
> «svn upgrade --compatible-version=$N /path/to/external» error out unless
> the external's parent working copy is already at version $V?

It isn't clear to me whether allowing it or disallowing it is more "right".

Can anyone else chime in?

In the meantime, I filed your question as 
https://subversion.apache.org/issue/4890

- Julian



Re: svn commit: r1898378 - in /subversion/trunk/subversion: include/ libsvn_client/ libsvn_wc/ svn/ tests/cmdline/getopt_tests_data/

2022-03-02 Thread Julian Foad
> Output will be, I suggest, like this:
> 
> * WC format 31, compatible with Subversion v1.8 and newer
> * WC format 32, compatible with Subversion v1.15 and newer

r1898524.


Re: Creating future-version wc's

2022-03-02 Thread Julian Foad
Julian Foad wrote:
> Daniel Shahaf wrote:
>> I expected this to error out hard, since we can't create 1.16-format

> Agreed, that seems better. Will do.

r1898523.

- Julian



Re: Creating future-version wc's

2022-03-01 Thread Julian Foad
Daniel Shahaf wrote:
> [...] % svn co file://$PWD/r --compatible-version=1.16 
> [...] svn: warning: W27: Creating working copy version 1.15 instead
> [...]
> I expected this to error out hard, since we can't create 1.16-format
> wc's.  (If pristines-on-demand is merged, 1.15 will use f32 but 1.16 may
> use f33.)  Compare:
> 
> [[[
> % svnadmin create --compatible-version=1.16 r
> ./subversion/svnadmin/svnadmin.c:3213: (apr_err=SVN_ERR_UNSUPPORTED_FEATURE)
> svnadmin: E27: Cannot guarantee compatibility beyond the current
> running version (1.15.0)
> zsh: exit 1 svnadmin create --compatible-version=1.16 r
> ]]]

Agreed, that seems better. Will do.

- Julian



Re: svn commit: r1898389 - in /subversion/trunk/subversion: include/svn_client.h libsvn_client/upgrade.c svn/info-cmd.c svn/svn.c

2022-03-01 Thread Julian Foad
Daniel Shahaf wrote:
>> +  case SVN_WC__WC_NG_VERSION: return _1_7;
> 
> SVN_WC__WC_NG_VERSION is declared in wc.h, which this file isn't allowed
> to use.

Ack. Similar cases mentioned elsewhere; I'll fix the private include
later as part of API public/private changes.

> Why does this function special-case f12 over all other format numbers

I don't know. I copied it from the existing
'svn_wc__version_string_from_format()'. It seems to me it is legacy (for
developers around 1.7 era) and should be removed from both. I'll remove
it from this one at least.

r1898510.

>> +{ SVN__STATIC_STRING("wc-compatible-version"),
>> +  info_item_wc_compatible_version },
> 
> The trailing comma is not C89-compatible and used to break the Windows
> build.

We use a trailing comma in arrays in Subversion. I think it can stay.

- Julian



Re: A two-part vision for Subversion and large binary objects.

2022-03-01 Thread Julian Foad
Summary of status of #525
=

Currently on the 'pristines-on-demand-on-mwf' branch.

Dev tasks in progress or outstanding:
--

* Multi-WC-format dependency (https://subversion.apache.org/issue/4883):

- is merged to trunk and reviewed;

- some outstanding items from review like UI tweaks, API
private-vs-public choices (in dev@ emails, and #4884 through #4887);

- some docs probably needed (release notes, help text).

* Per-WC config (https://subversion.apache.org/issue/4889):

- Quotes from #4889: "not strictly needed for MVP... MIGHT be good
to add the low level flagging mechanism now... not clear which would be
more effort."

- Last thing I wrote on 2022-02-16 [1]: "I do think we need an
explicit option to enable the feature by name, not just a WC version
number. I haven't yet worked out whether it must also be possible to
upgrade to 1.15 format without enabling the feature, and thus need to
store the feature-enable flag in the WC somewhere separate from the
format version number. For future developments of other wc features,
that will be needed; I just haven't finalised yet if it's essential for
MVP. Might be, in order to not cause compatibility issues for those
future scenarios."

* Issues arising in existing regression tests (#4888 and others):

- authz denied during textbase sync (#4888) -- in progress [2]

- about 12 other tests that were disabled or modified -- I have
started investigating and patching; need some further attention.


Community tasks outstanding:


* initiate a merge to trunk

* decide on a name for the feature


- Julian

[1] dev@ thread "A two-part vision for Subversion and large binary objects."
[2] dev@ thread "Pristines-on-demand: authz denied during textbase sync"



Re: A two-part vision for Subversion and large binary objects.

2022-03-01 Thread Julian Foad
On Feb 18 2022, Mark Phippard wrote:
>> [It fetches and stores pristines of modified files;] it doesn't mean
>> "store no pristines" in that WC.
> 
> I am curious what Karl thinks given that he is living this scenario
> today and wants the feature. I would think that having update create
> pristines for any modified file taints its usefulness. That said, it
> is probably still better than what they have today and if the user is
> on a fast network and disk space is not too big of an issue it might
> not matter too much. I personally think this is the biggest issue to
> solve though, more so than selectively choosing pristines for
> different files. I think the feature just really does not "work as
> advertised" if it is going to behave this way.

Hello, Mark. Maybe Karl will yet answer, but I didn't want to leave this
hanging any longer.

This design was anticipated as far back as a 2006-06-09 comment on #525
by Oswald Buddenhagen [1], where it is described as one of the
possibilities among variations and alternatives. I'm not saying that
justifies choosing it as the best solution, just that it's not arriving
now from off the radar.

We've already discussed how there are certainly scenarios where it won't
be greatly helpful as well as scenarios where it will, and several
people seem to think there are enough of the latter.

Maybe, don't knock it till you've tried it?

As for "as advertised", then surely the take-away message is we need to
be careful how we describe it; never call it "without pristines".

- Julian


[1] 
https://issues.apache.org/jira/browse/SVN-525?focusedCommentId=14911121=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14911121




Re: svn commit: r1898378 - in /subversion/trunk/subversion: include/ libsvn_client/ libsvn_wc/ svn/ tests/cmdline/getopt_tests_data/

2022-02-28 Thread Julian Foad
Thanks for the reviews.

Daniel Shahaf wrote:
> Missing @since. [...]

r1898461.

>> +#include "../libsvn_wc/wc.h"
> 
> I don't think libsvn_client is allowed to use this header.

I'll fix that later.


>> +* compatible with Subversion v1.8 to v1.15 (WC format 31)
>> +* compatible with Subversion v1.15 (WC format 32)
> 
> Bikeshed, but the bullet point parses as a sentence fragment.  Suggest
> instead:
> 
> * format 31, compatible with Subversion v1.8 to v1.15

I had wanted to present the (minimum) compatible-version as the primary
info and the WC format number as secondary. However, it does indeed work
better that way. Also the "to 1.15" is implicit in being supported by
1.15 (unless we were to have disjoint support ranges, but the reader
isn't likely to assume that).

Continuing that way, I can then simplify that "get supported versions
list" API, first removing the "max supported version" from that struct,
and then by using another function to convert from wc-format number to
min-supported version, we can get rid of that struct entirely. Much
cleaner. Return just a list of wc-format ints. I'll do that soon.

Output will be, I suggest, like this:

* WC format 31, compatible with Subversion v1.8 and newer
* WC format 32, compatible with Subversion v1.15 and newer

- Julian



Re: [PATCH] Sketch of per-user/per-wc config for pristines-mode (#4889)

2022-02-25 Thread Julian Foad
Filed as #4889, https://subversion.apache.org/issue/4889,
"Pristines-on-demand: per-WC config"



Re: Pristines-on-demand: authz denied during textbase sync (#4888)

2022-02-25 Thread Julian Foad
Filed as issue #4888, https://subversion.apache.org/issue/4888

I will come back to it soonish.



Issue #4883: Multi-WC-format

2022-02-24 Thread Julian Foad
This feature is now filed as 

https://subversion.apache.org/issue/4883 "multi-wc-format"


Re. the four enhancement suggestions I listed earlier in this thread:

> - The "svn upgrade" command may grow an option to only show the
> current and target working copy versions.

Added as a comment to #4885 "multi-wc-format: WC upgraded and
not-upgraded notifications".


> - Let the default "compatible version" be specified in user config.

Filed as:

https://subversion.apache.org/issue/4886 "multi-wc-format: config for
default WC version for checkout & upgrade"


> - Clarify/unify the option name (--compatible-version/--wc-format-version/...)

Filed as:

https://subversion.apache.org/issue/4887 "multi-wc-format: clarify/unify
option names for compatible-version"


> - Clarify the supported versions display (in svn --version)

Improved in https://svn.apache.org/r1898378, along with necessary API changes:

  Old display in 'svn --version':

  | Supported working copy (WC) versions: from 1.8 to 1.15

  New display in 'svn --version':

  | Supported working copy (WC) formats:
  |
  | * compatible with Subversion v1.8 to v1.15 (WC format 31)
  | * compatible with Subversion v1.15 (WC format 32)


- Julian



Re: multi-wc-format review

2022-02-24 Thread Julian Foad
Daniel,

Thank you for reviewing multi-wc-format.

I think it would be good for us to be able to refer to the issue
​tracker. I'll file a 'multi-wc-format' issue and we can add sub-issues
​for changes we need or want.

https://subversion.apache.org/issue/4883 "multi-wc-format"


Daniel Shahaf wrote:
> 1. How can a user ask a working copy what range of minor versions it
> supports?  Cf. "Compatible With Version:" in `svnadmin info` output.
> [...]

I agree with all of that, and it seems like something we should do
​before releasing the feature. Filed, quoting your comments, as:

https://subversion.apache.org/issue/4884 "multi-wc-format: user
​visibility of WC version"


> 2. The comment in trunk just above STMT_UPGRADE_TO_32 is just an
> ellipsis.  It should be fleshed out.  Would it be correct to change it
> to say "Format 32 is identical to format 31"? [...]

Yes. Done in r1898364.


> 3. «svn upgrade» of an f31 working copy does nothing and prints nothing.
> I see this is deliberate behaviour (by the docstring of 
> SVN_WC__DEFAULT_VERSION).
> 
> 3.1. Should «svn upgrade» upgrade to the newest supported format?
> That's what «svnadmin upgrade» does and what «svn upgrade» has done to
> date, so users may expect this.

Given how Subversion has matured, and how we are aware of the
​difficulties that WC version upgrades cause for users of multiple
​clients, we are now deliberately choosing compatibility as the default,
​now that we have the ability to do so.

> 3.2. Failing that, should «svn upgrade» print an info message to the
> effect of "I upgraded your wc to f31 but I can also upgrade it to f32 [...]?

Good idea. Filed as:

https://subversion.apache.org/issue/4885 "multi-wc-format: WC upgraded
​and not-upgraded notifications"

with a short reply.


> 4. Since wc-queries-test.c special-cases STMT_UPGRADE_TO_32,
> wc-metadata.sql should grow comments pointing to the places outside
> wc-metadata.sql itself that need to be changed when a creating a new
> format.

Done in r1898365.


> 5. We should probably keep a list somewhere of what set of «make check
> WC_FORMAT_VERSION=%s» values is needed in order to cover all
> currently-supported codepaths.

A good idea in principle. Ideally, we should probably keep a
​machine-readable list of all test parameters, along with which subset we
​recommend for different purposes such as pre-commit checks, backport
​proposal checks, release candidate checks.

I do not see a good place to put this right now. The closest thing we
​have already might be 'subversion/tests/README' or
​'tools/dev/unix-build' but neither of these enumerates the test parameters.


> Sorry for not getting to this before the merge.

That's no problem.

- Julian


Re: Pristines-on-demand: authz denied during textbase sync

2022-02-19 Thread Julian Foad
Thanks, Daniel. I'll take a look at this some time in the next few days unless 
someone beats me to it.

Daniel Shahaf wrote:
> [...] What race condition is that?  The change to authz happened before the
>sync, not between the sync and the main operation.

Yes, it's not directly related to this; it's just that seeing this problem set 
me to wondering whether another class of race condition could occur between the 
sync and the main operation. User modifies a file in that window and... an 
operation like diff would error out in a harmless way, but maybe some 
operations might break in a non-trivial way, e.g. leaving the wc locked or 
worse? I suppose such scenarios must be equivalent to what can already happen 
(in svn 1.14) if a pristine is deleted without svn's knowledge.
- Julian


Re: A two-part vision for Subversion and large binary objects.

2022-02-18 Thread Julian Foad
Karl Fogel wrote: 
> Is the above happening in MVP? 

Yes. I was describing what Evgeny created last year in the 
'pristines-on-demand' branch.

> I ask because my understanding of 
> MVP was that it's not doing this opportunistic fetching/discarding 
> of bases, but rather that it's a simple per-WC setting that means 
> "store no pristines in this WC" and that's that.

MVP is what Evgeny created, but selectively turned on, or not, per WC. When 
turned on, it doesn't mean "store no pristines" in that WC.

I'm thinking two things would make the explanation more accessible.

1. Docs: release notes is a good start (thanks Nathan) but somewhere more (svn 
help?) too.

2. Feedback: svn should print progress notifications (maybe gated on 
'--verbose'), that make clear when and what it is doing with the pristines.

Number 2 would make a great volunteer contribution if anyone's willing to dip 
in to the branch code. It's just a matter of extending our existing notifier 
callback and making the hydrate/dehydrate call back to it.
- Julian


Re: Pristines-on-demand: authz denied during textbase sync

2022-02-18 Thread Julian Foad
I committed the proposed fix in r1898184:
[[[
On the 'pristines-on-demand-on-mwf' branch: fix authz_tests.py 31.

This test runs an 'update' after read access permission has been removed
from a repository path. The text base sync code errored out when it
tried to download this text base.

We fix it by making the text base sync code ignore authorization errors,
leaving any such base unfetched and continuing to try the others.
...
* subversion/libsvn_wc/wc_db_textbase.c
  (svn_wc__db_textbase_sync): If read access is unauthorized, ignore the
error and continue even though we failed to fetch the textbase.
]]]

- Julian


Re: Multi-WC-format branch: preparing for merge to trunk

2022-02-18 Thread Julian Foad
Merged to trunk now. Testers, enhancers welcome. More below.

Please try it out if you haven't.

The basics:

wc format 31:
  - compatible with svn 1.8 to 1.14 and now 1.15-dev
wc format 32:
  - compatible with svn 1.15-dev only,
  - as yet introduces no schema changes (we expect those to come when we
merge 'pristines-on-demand-on-mwf' branch to trunk)

There is no supported downgrade, so be careful with testing this. In
this development state you might not want to upgrade an important wc to
compatible-version=1.15. Check out a disposable WC instead.

(In emergency you might get away with:
sqlite3 .svn/wc.db 'pragma user_version=31;'
)

Commands affected:

svn checkout  # WC is compatible with 1.8 to 1.14
svn checkout --compatible-version=1.15  # format bump
svn upgrade  # upgrade an old (svn<=1.7) WC to f31 (1.8~1.14)
svn upgrade --compatible-version=1.15  # upgrade to 1.15


Anyone would be very welcome to take a crack at any suggested
enhancements, including but not limited to those I listed earlier:

>  - The "svn upgrade" command may grow an option to only show the current
>and target working copy versions. (Perhaps `svn upgrade --dry-run`?)
> 
>  - Let the default "compatible version" be specified in the user
> config. (That is, the default version for 'checkout' and 'upgrade',
> which are the places where a version can be specified.)
> 
>  - Clarify/unify the option name. svn commands also have other 
>compatibility concerns besides the WC format.
>Current option names:
>  --compatible-version  # for 'svnadmin': repository version
>  --compatible-version  # for 'svn': WC version
>  --wc-format-version  # for the tests: WC version
>Suggestions for WC version everywhere ('svn' and tests):
>  --wc-compatible-version
>  --wc-compatible-format
> 
>  - Clarify the supported versions display:
>  -Supported working copy (WC) versions: from 1.8 to 1.15
>  +Supported working copy (WC) versions:
>  +  --compatible-version=1.8 (supported by svn 1.8 to 1.15; default)
>  +  --compatible-version=1.15 (supported by svn 1.15)
> 
> We might want some of these, or other changes, before release.

- Julian



Re: A two-part vision for Subversion and large binary objects.

2022-02-18 Thread Julian Foad
Mark Phippard wrote:
>> Update starts by hydrating. That means it WILL download any missing
>> pristines of modified files, regardless whether any newer revision
>> will be found.
> 
> Does the possibility exist to optimize this at all? [...]

To understand, we need to recap that this design is based around a
simple invariant: whenever a file is seen to be locally modified, at the
next convenient opportunity we will download its base; and when seen to
be not-modified we will discard its base. It is not a
fetch-at-point-of-access design.

This design uses a "wrapper" structure, localising the hydration and
dehydration steps at the top level of subcommands, outside the command
logic, not inside the command logic. It does not know which text bases
the command will need, and instead works on the principle that it will
fetch all that might be needed. Evgeny and others suggested that
localizing the network access to the start of each subcommand was better
(for client software and users) than introducing the possibility for
network access to be required at arbitrary points inside the command logic.

The rationale is that in the design use cases, fetching is acceptably
cheap both in network speed and in the availability of sufficient
storage space for text bases for the files that are locally modified.

Possibilities for optimisation may exist but the kind of optimisation
you are aksing about would require a different design, one in which the
fetcher knows what is really needed by the current operation, so one
where the fetch is pushed down in the logic to nearer the point of
access. There is limited scope for it in this design.

Personally I would like us to explore the fetch-at-point-of-access
design alternative. While introducing one kind of complexity (network
access requested at arbitrary points during a command) I feel it would
reduce the kind of complexity we're discussing here (unnecessary fetches
and consequent attempts at optimisation). But that's not what we're
exploring right now.

I asked about this in this thread a few weeks ago; you could see there
for further discussion. (I tried to dig up a link but having trouble
finding myself in the archives.)

- Julian



  1   2   3   4   5   6   7   8   9   10   >