On 27 October 2015 at 04:52, Johan Corveleyn <jcor...@gmail.com> wrote: > Bert, > > As the OP of this mail-thread, which spun out of the discovery of a > loss of information by 'dump' in 1.9 [1], I'd like to add some things. > > I found out about this problem during the Berlin hackathon, when I > tested various dumped/loaded repositories. The loss of information is > real, and is IMO significant (we're losing a, possibly intended, > relationship between a log message and a particular path [2]). > > On Mon, Oct 26, 2015 at 6:16 PM, Bert Huijben <b...@qqmail.nl> wrote: >> >> >>> -----Original Message----- >>> From: Evgeny Kotkov [mailto:evgeny.kot...@visualsvn.com] >>> Sent: maandag 26 oktober 2015 17:45 >>> To: Bert Huijben <b...@qqmail.nl>; Stefan Fuhrmann >>> <stefan.fuhrm...@wandisco.com> >>> Cc: Johan Corveleyn <jcor...@gmail.com>; Julian Foad >>> <julianf...@btopenworld.com>; dev <dev@subversion.apache.org> >>> Subject: Re: No-op changes no longer dumped by 'svnadmin dump' in 1.9 >> >> >>> This means that after r1572363 and r1573111, svn_ra_get_file_revs2() and >>> svn_repos_get_file_revs2() were skipping some of the "interesting" >>> revisions, >>> according to the FS API defining the concept. Moreover, this behavior could >>> be inconsistent even within a single function like svn_ra_get_file_revs2() >>> that calls svn_ra_get_log2() for old servers, as get-log notices revisions >>> with empty deltas. >>> >>> I think that it's another example of where r1572363 and r1573111 introduce >>> an >>> inconsistent and unwanted behavior change. >> >> And 1.9.x assumes that the old behavior is a bug... and in many cases I >> agree. > > Did the old behavior have serious bugs that were visible to users? > Evgeny seemed to think not [3], and no-one said otherwise. And even > so: is it okay to introduce new bugs while fixing old ones? IMO a bug > in 'dump' is a big deal, because it changes your repository / history. > Moreover, people who do a dump/load might not notice the change until > years later, after they have piled up tons more history on top of it. > > Maybe there is some doubt whether this is a bug or a feature, but > while we're in doubt I think the safest option is the 0-option: keep > the old and known behavior (or rollback to it), which didn't lose this > information. > >> This is exactly where the document Julian wrote comes in. > > As I said earlier in this thread, I think that document [4] is a great > effort. But if you read it carefully, you'll see it does not > contradict having no-op changes in the repository history, and > exposing them for instance through 'svn log'. If we're supporting > those (and we have until 1.8), we must be able to dump them. > > See specifically the final section titled "ARBITRARY DIFF VS. > SINGLE-REVISION CHANGE ", where Julian argues: > > [[[ > As best I understand it, the idea of recording a no-op-change is meaningful > and relatively straightforward to define at the level of a single state > transition. We think of a commit as such a transition, and it is, but as > mentioned above it's not in general the exact same transition that the > client described. > > Attempting to derive a notion of 'no-op-change' that applies to a > difference taken between an arbitrary pair of points in the version > history, on the other hand, is not at all straightforward, and we do not > have a concept of its meaning in relation to merging and so on. > > Now, the "svn log -v" output clearly applies to a single commit, a single > state transition, and thus we find the indication of no-op changes there to > be somewhat satisfactory. The code that generates this output, on the other > hand, uses APIs that compare arbitrary points in history, such as > > svn_fs_contents_changed(root1:path1, root2:path2) > svn_fs_props_changed > > Comparing arbitrary points in history is an operation that, throughout > pretty much all of the version control system, is used really only when we > want and need to know about real changes. Hence the definition of a new > pair of APIs, > > svn_fs_contents_different > svn_fs_props_different > > to specifically provide that meaning. > > What purpose remains for the original _changed() APIs, then? At first it > wasn't clear there was any real purpose, but if we want "svn log" etc. to > continue as before, then we need something like them. Except for this > purpose we don't need APIs that compare two arbitrary states; we need APIs > that compare two successive states, because this 'touched' concept only > makes sense in this context. > ]]] > > Whether we go for a complete redesign of the APIs or not, the above > text nicely explains some different ways of looking at this, and gives > "no-op changes" a place in our feature set. > >> If we wanted 1.9.x to behave in all ways identical to 1.8.x, we wouldn't >> have created 1.9. We would have never released something different than >> the old thing. Stefan spend quite some time in improving things, and >> upto now most users agreed that this was an improvement. (The time to >> speak up was during the release candidates) >> >> >> Every new feature or bugfix changes behavior. >> Just 'thinking that this is another inconsistent behavior change' doesn't >> make a new argument on why this behavior change should be backported to >> 1.9.x. >> > > In my opinion the changed behavior of dump is a bug, not just a > behavior change. Unfortunately, I only found the bug after release. > But even if you don't think it's a bug, it was definitely an > unexpected side-effect of the refactoring done by stefan2. > > Stefan proposed another way of fixing this, different from Evgeny's > patch, but both agreed that the dump behavior was a bug and that it > should be fixed. Julian too agreed that the change (the new code) > should be reverted [5]. > >> >> I don't think reporting something as changed, when it is clearly not >> changed is a good thing. >> >> We should decide when we want to see something as 'changed' and what >> definition of 'changed' should be used where. >> >> Just going back to 1.8 is not the way to approach this. >> That just changes one 'somehow broken implementation' (in one definition) >> with a 'somehow broken implementation' (with a different definition). >> We should define what behavior we really want (where)... and document why >> we want that behavior there. Until then I don't think we should backport >> anything. >> >> Both the 1.8.x behavior and the 1.9.x behavior are released... Going back >> to 1.8.x is not going to fix everybody's usecases. > > Okay, well, I agree we should eventually go for a clear specification > and design, and then implement that. But in the meantime we have a > real bug in 1.9 [1] which can cause damage (or in any case "doubtful > changes") to repositories (when an admin performs a dump+load). I > would prefer that we try to fix the dump bug and backport it as soon > as possible (getting us back to a good working state), and then take > time to work out the long term solution. > > +1 on all above.
-- Ivan Zhakov