Re: SHA-1 collision in repository?

2018-02-26 Thread Branko Čibej
On 26.02.2018 22:41, Myria wrote:
> -bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where
> hash='db11617ef1454332336e00abc311d44bc698f3b3'"
> db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680
>
> The line from the grep -a command containing that hash is below.  They
> all match.
> text: 604440 34 134255 136680 c9f4fabc4d093612fece03c339401058
> db11617ef1454332336e00abc311d44bc698f3b3 604439-cyqm/_13
>
>
> In other news, unknown whether related to the current problem, my
> attempt to clone the repository to my local computer is failing:
>
> D:\>svnsync sync file:///d:/svnclone
> Transmitting file data
> .svnsync:
> E16: SHA1 of reps '227170 153 193 57465
> bb52be764a04d511ebb06e1889910dcf
> e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' and '-1 0
> 193 57465 bb52be764a04d511ebb06e1889910dcf
> e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' matches
> (e6291ab119036eb783d0136afccdb3b445867364) but contents differ
> svnsync: E160004: Filesystem is corrupt
> svnsync: E200014: Checksum mismatch while reading representation:
>    expected:  bb52be764a04d511ebb06e1889910dcf
>      actual:  80a10d37de91cadc604ba30e379651b3
>
> This is odd, because revision 227185 (the revision it's trying to
> commit) verifies fine on the originating server:
>
> -bash-4.1$ sudo svnadmin verify -r227170 /srv/subversion/repositories/meow
> * Verifying repository metadata ...
> * Verifying metadata at revision 227170 ...
> * Verified revision 227170.
> -bash-4.1$ sudo svnadmin verify -r227185 /srv/subversion/repositories/meow
> * Verifying repository metadata ...
> * Verified revision 227185.


It is a very, *very* bad idea to perform any operations on the
repository as root! You should not have to do that.

Please check file ownership and permission throughout the repository;
none of the files should be owned by root.
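
A quick way to run that check is sketched below. The helper name files_owned_by is made up for illustration, and the repository path in the example is simply the one from the quoted commands.

```shell
#!/bin/sh
# List everything under a repository directory that is owned by a given user.
# files_owned_by is a hypothetical helper name; it only wraps find's
# standard -user test (which accepts a user name or a numeric ID).
files_owned_by() {
    repo=$1
    owner=$2
    find "$repo" -user "$owner" -print
}

# Example (path taken from the commands quoted above):
#   files_owned_by /srv/subversion/repositories/meow root
```

Any path this prints would need its ownership changed back to the account the Subversion server runs as.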

-- Brane


> On Fri, Feb 23, 2018 at 5:42 PM, Philip Martin  
> wrote:
>> Philip Martin  writes:
>>
>>> There are a couple of options:
>>>
>>>   A) disable rep-caching by editing fsfs.conf inside the repository
>>>
>>>   B) reset the mapping by deleting/renaming the file db/rep-cache.db
>>>  inside the repository (but please rename rather than delete if you
>>>  want to help us identify the corruption)
>>>
>>> Doing either of these should allow the commit to succeed.
>> To verify the corruption start with the rep-cache:
>>
>>   sqlite3 db/rep-cache.db "select * from rep_cache where 
>> hash='db11617ef1454332336e00abc311d44bc698f3b3'"
>>
>> That should give you five numbers: the hash, the revision (604440), the
>> offset, the size and the expanded size.
>>
>> Then examine the revision file for r604440.  It could be unpacked:
>>
>>   grep -a "^text: 604440.*/_" db/revs/604/604440
>>
>> or packed:
>>
>>   grep -a "^text: 604440.*/_" db/revs/604.pack/pack
>>
>> One of the lines from grep should contain the hash and that line should
>> start:
>>
>>   text: 604440
>>
>> followed by three more numbers then hashes and other stuff.  The three
>> numbers are the offset, size and expanded size and should match the
>> values from the rep-cache but I suspect the rep-cache has the wrong
>> offset.
>>
>> --
>> Philip



Re: Show textual diff in a moved/copied file - how?

2018-02-26 Thread Alexey Neyman

On 02/26/2018 01:49 AM, Stefan Sperling wrote:

>> And, I find it quite counter-intuitive. I would expect --notice-ancestry at
>> least to take ancestral relationship between these files into account;
>
> (I don't have time to look at the code right now, so I'm speculating a bit.)
> You're diffing *directories*, not files. There are separate client-side
> handlers for directory and file diffs which might not always have the same
> information available. E.g. it may not be feasible to trace back the
> copy history of every child when diffing two directories.

I am not familiar enough with the code to say why 'svn diff' behaves this way,
but it does look like it's contradicting the description in 'svn help diff':


  --notice-ancestry    : diff unrelated nodes as delete and add

Since 'svn diff' does not take the opposite option, '--ignore-ancestry', 
I'd say one would assume that 'svn diff' should diff *related* nodes 
textually, not *as delete and add*. Tracing each child may take some 
additional time, right, but between "fast and wrong" and "slow and 
correct" behaviors, I'd choose the latter :)

>>> Since you know all paths and revisions involved, you could also run:
>>>    svn diff ^/foobar@1 ^/barfoo@2
>> Well, either of these approaches is not very convenient when there is a
>> dozen moves & modifications in a single revision.
>
> Agreed. At least the file diffs allow you to 'zoom in', but it would
> be much better if there was a way to get the diff you want to see
> with just one command.

If backwards compatibility of 'svn diff' behavior, or the performance 
impact of tracing every child, is a concern - is it possible to have 
'svn diff' do such history tracing if enabled by some new option?


Although, I cannot come up with a better name than 'svn diff 
--properly-diff-related-nodes'.
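
Until something like that exists, the per-copy commands could be generated from the changed-path list instead of typed by hand. The following is only a rough sketch: copy_diff_commands is a made-up name, it assumes the "A /dst (from /src:N)" lines printed by 'svn log -v', and it does not handle paths containing spaces.

```shell
#!/bin/sh
# Turn the changed-path lines of `svn log -v -c REV` into one
# `svn diff OLD@OLDREV NEW@NEWREV` command per copied path.
# copy_diff_commands is a hypothetical helper; paths with spaces are not handled.
copy_diff_commands() {
    # $1: revision being inspected; stdin: output of `svn log -v -c REV`
    awk -v rev="$1" '
        # Copies look like:   A /dst (from /src:N)
        NF >= 4 && $1 ~ /^[AR]$/ && $(NF-1) == "(from" {
            src = $NF
            sub(/\)$/, "", src)          # strip the trailing ")"
            n = src
            sub(/^.*:/, "", n)           # copy-from revision
            sub(/:[0-9]+$/, "", src)     # copy-from path
            printf "svn diff ^%s@%s ^%s@%s\n", src, n, $2, rev
        }'
}

# Example (needs a working copy or URL context for the ^/ syntax):
#   svn log -v -c 2 | copy_diff_commands 2
```

The output is a list of commands to review and run, so it still is not a single-command answer, and for a copied directory each command diffs the two trees as a whole.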

>> Besides, the former (just passing the path) does not seem to work in all
>> cases. In the real repository, I have two revisions that did the same thing:
>> moved a directory and modified some files in the moved directory. The trick
>> with passing the path to the file works for one of them, but not for the
>> other - and I am at a loss why SVN treats these two differently. Here's
>> where diff does not display the proper diff even when supplied with the path
>> to the file:

[... snip ...]

> I can't explain this one. It might be worth filing an issue about
> this problem in case you can come up with a standalone recipe to
> reproduce it.

I found what triggers this behavior. This happens when the source of the 
copy is not the revision immediately preceding the revision being diffed.


Here's the script for reproduction:

---8<---
#!/bin/bash

r=`pwd`/foo-svn
url=file://$r
wc=`pwd`/foo-wc
rm -rf $r $wc
svnadmin create $r
svn co $url $wc
cd $wc
echo "Initial content" > foo
svn add foo
svn ci -m "Initial import"

# Source revision to be used in copy later
srev=`svnlook youngest $r`

if [ "$INSERT_EXTRA_REVISION" = "yes" ]; then
    svn mkdir somedir
    svn ci -m "Extra revision"
fi

svn cp $url/foo@$srev bar
echo "Added line" >> bar
svn ci -m "Copy + modify"

cmrev=`svnlook youngest $r`
svn diff -c $cmrev $url/bar@$cmrev
---8<---

And here is the output from the script:

---8<---
$ ./test.sh
...
Index: foo
===
--- foo    (.../foo)    (revision 1)
+++ foo    (.../bar)    (revision 2)
@@ -1 +1,2 @@
 Initial content
+Added line
$ INSERT_EXTRA_REVISION=yes ./test.sh
...
Index: bar
===
--- bar    (nonexistent)
+++ bar    (revision 3)
@@ -0,0 +1,2 @@
+Initial content
+Added line
---8<---

Why is the behavior different in these cases? Isn't that 
counter-intuitive as well that the diff's output depends on the source 
revision of the copy?
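
To find out which copies in an existing revision fall into the second case (copy source older than the immediately preceding revision), the changed-path list can be filtered. This is a sketch only: nonadjacent_copies is a made-up name, and it relies on the "(from /src:N)" suffix that 'svn log -v' prints for copies.

```shell
#!/bin/sh
# Report copies whose copy-from revision is not REV-1.
# nonadjacent_copies is a hypothetical helper; paths with spaces are not handled.
nonadjacent_copies() {
    # $1: revision being diffed; stdin: output of `svn log -v -c REV`
    awk -v rev="$1" '
        NF >= 4 && $(NF-1) == "(from" {
            n = $NF
            sub(/\)$/, "", n)      # strip the trailing ")"
            sub(/^.*:/, "", n)     # keep only the copy-from revision
            if (n + 0 != rev - 1)
                printf "%s copied from r%s, not r%d\n", $2, n, rev - 1
        }'
}

# Example:
#   svn log -v -c 3 "$url" | nonadjacent_copies 3
```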


Regards,
Alexey.


Re: SHA-1 collision in repository?

2018-02-26 Thread Myria
-bash-4.1$ sqlite3 rep-cache.db "select * from rep_cache where
hash='db11617ef1454332336e00abc311d44bc698f3b3'"
db11617ef1454332336e00abc311d44bc698f3b3|604440|34|134255|136680

The line from the grep -a command containing that hash is below.  They
all match.
text: 604440 34 134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 604439-cyqm/_13
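
The same comparison can be scripted rather than done by eye. A minimal sketch, assuming the field order described in the quoted instructions below (rep-cache: hash, revision, offset, size, expanded size; "text:" line: revision, offset, size, expanded size, then two hashes of which the second is the SHA-1). The name compare_rep is made up for illustration.

```shell
#!/bin/sh
# Compare one rep-cache row with one "text:" line from a revision file.
# compare_rep is a hypothetical helper; the field positions are assumptions
# taken from the output shown in this thread.
compare_rep() {
    # $1: rep-cache row, e.g. "hash|revision|offset|size|expanded_size"
    # $2: "text:" line, already joined onto a single line
    cache=$(printf '%s\n' "$1" | awk -F'|' '{ print $2, $3, $4, $5, $1 }')
    revfile=$(printf '%s\n' "$2" | awk '$1 == "text:" { print $2, $3, $4, $5, $7 }')
    if [ -n "$revfile" ] && [ "$cache" = "$revfile" ]; then
        echo "match"
    else
        echo "MISMATCH"
        echo "  rep-cache: $cache"
        echo "  rev file:  $revfile"
        return 1
    fi
}
```

Fed the two results above, it compares revision, offset, size, expanded size and SHA-1 in one step and prints both sides when they disagree.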


In other news, unknown whether related to the current problem, my
attempt to clone the repository to my local computer is failing:

D:\>svnsync sync file:///d:/svnclone
Transmitting file data
.svnsync:
E16: SHA1 of reps '227170 153 193 57465
bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' and '-1 0
193 57465 bb52be764a04d511ebb06e1889910dcf
e6291ab119036eb783d0136afccdb3b445867364 227184-4vap/_4o' matches
(e6291ab119036eb783d0136afccdb3b445867364) but contents differ
svnsync: E160004: Filesystem is corrupt
svnsync: E200014: Checksum mismatch while reading representation:
   expected:  bb52be764a04d511ebb06e1889910dcf
     actual:  80a10d37de91cadc604ba30e379651b3

This is odd, because revision 227185 (the revision it's trying to
commit) verifies fine on the originating server:

-bash-4.1$ sudo svnadmin verify -r227170 /srv/subversion/repositories/meow
* Verifying repository metadata ...
* Verifying metadata at revision 227170 ...
* Verified revision 227170.
-bash-4.1$ sudo svnadmin verify -r227185 /srv/subversion/repositories/meow
* Verifying repository metadata ...
* Verified revision 227185.

On Fri, Feb 23, 2018 at 5:42 PM, Philip Martin  wrote:
> Philip Martin  writes:
>
>> There are a couple of options:
>>
>>   A) disable rep-caching by editing fsfs.conf inside the repository
>>
>>   B) reset the mapping by deleting/renaming the file db/rep-cache.db
>>  inside the repository (but please rename rather than delete if you
>>  want to help us identify the corruption)
>>
>> Doing either of these should allow the commit to succeed.
>
> To verify the corruption start with the rep-cache:
>
>   sqlite3 db/rep-cache.db "select * from rep_cache where 
> hash='db11617ef1454332336e00abc311d44bc698f3b3'"
>
> That should give you five numbers: the hash, the revision (604440), the
> offset, the size and the expanded size.
>
> Then examine the revision file for r604440.  It could be unpacked:
>
>   grep -a "^text: 604440.*/_" db/revs/604/604440
>
> or packed:
>
>   grep -a "^text: 604440.*/_" db/revs/604.pack/pack
>
> One of the lines from grep should contain the hash and that line should
> start:
>
>   text: 604440
>
> followed by three more numbers then hashes and other stuff.  The three
> numbers are the offset, size and expanded size and should match the
> values from the rep-cache but I suspect the rep-cache has the wrong
> offset.
>
> --
> Philip
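
The choice between the packed and unpacked path in the quoted instructions can be sketched as a small helper. rev_file is a made-up name, and the shard size of 1000 revisions per directory is an assumption that happens to fit the paths quoted above (r604440 in shard 604); a repository configured differently would need another value.

```shell
#!/bin/sh
# Print the file that should hold a given revision: the shard's pack file
# if it exists, otherwise the individual revision file.
# rev_file is a hypothetical helper; the default shard size is an assumption.
rev_file() {
    repo=$1
    rev=$2
    shard_size=${3:-1000}
    shard=$((rev / shard_size))
    if [ -f "$repo/db/revs/$shard.pack/pack" ]; then
        echo "$repo/db/revs/$shard.pack/pack"
    else
        echo "$repo/db/revs/$shard/$rev"
    fi
}

# Example, combined with the grep from the quoted instructions:
#   grep -a "^text: 604440.*/_" "$(rev_file /srv/subversion/repositories/meow 604440)"
```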


Re: Show textual diff in a moved/copied file - how?

2018-02-26 Thread Johan Corveleyn
On Mon, Feb 26, 2018 at 10:49 AM, Stefan Sperling  wrote:
> On Mon, Feb 26, 2018 at 12:43:42AM -0800, Alexey Neyman wrote:
>> On 02/26/2018 12:18 AM, Stefan Sperling wrote:
>> > On Sun, Feb 25, 2018 at 11:38:03PM -0800, Alexey Neyman wrote:
>> > > Hi all,
>> > >
>> > > I am trying to dig for some changes in a file that was moved a few times 
>> > > and
>> > > 'svn diff' shows full "remove old location and add new location as if it
>> > > were a new file" diffs, which are not helpful. Is there a way to make the
>> > > diff show the changes as compared against the origin of the copy? I tried
>> > > --notice-ancestry, does not help.
>> > Diff output changes depending on whether you pass a path to the
>> > file itself or to a parent of the file. Try: svn diff -c 2 barfoo
>> > I found this in the diff_renamed_file() test in diff_tests.py,
>> > see there for more examples.
>> > https://svn.apache.org/repos/asf/subversion/trunk/subversion/tests/cmdline/diff_tests.py
>> You don't expect the end-user to read the test cases in the product to get
>> these subtleties, do you? :)
>
> No, I don't. But subtle details such as this are often not documented.
> In documentation there is always a trade-off between what the system
> is actually doing in detail and what the reader really needs to know.
>
> The test cases are an accurate source of reference when it comes to
> details of expected behaviour like this because they encode what's
> actually intended.
>
>> And, I find it quite counter-intuitive. I would expect --notice-ancestry at
>> least to take ancestral relationship between these files into account;
>
> (I don't have time to look at the code right now, so I'm speculating a bit.)
> You're diffing *directories*, not files. There are separate client-side
> handlers for directory and file diffs which might not always have the same
> information available. E.g. it may not be feasible to trace back the
> copy history of every child when diffing two directories.
>
>> > Since you know all paths and revisions involved, you could also run:
>> >    svn diff ^/foobar@1 ^/barfoo@2
>> Well, either of these approaches is not very convenient when there is a
>> dozen moves & modifications in a single revision.
>
> Agreed. At least the file diffs allow you to 'zoom in', but it would
> be much better if there was a way to get the diff you want to see
> with just one command.
>
>> Besides, the former (just passing the path) does not seem to work in all
>> cases. In the real repository, I have two revisions that did the same thing:
>> moved a directory and modified some files in the moved directory. The trick
>> with passing the path to the file works for one of them, but not for the
>> other - and I am at a loss why SVN treats these two differently. Here's
>> where diff does not display the proper diff even when supplied with the path
>> to the file:
>>
>> # The relevant fragment of a revision
>> $ svn log -c 36 -v file://`pwd`/XX-svn
>>A /trunk/XX/src/bin/more (from /vendor/:29)
>>M /trunk/XX/src/bin/more/more.c
>> # Passing the path to the directory that was copied: does not work
>> $ svn di -c 36 file://`pwd`/XX-svn/trunk/XX/src/bin/more | grep -A 4
>> 'Index: more.c'
>> Index: more.c
>> ===
>> --- more.c  (nonexistent)
>> +++ more.c  (revision 36)
>> @@ -0,0 +1,1894 @@
>> # Passing the path to the specific file: does not work
>> $ svn di -c 36 file://`pwd`/XX-svn/trunk/XX/src/bin/more/more.c |
>> grep -A 4 'Index: more.c'
>> Index: more.c
>> ===
>> --- more.c  (nonexistent)
>> +++ more.c  (revision 36)
>> @@ -0,0 +1,1894 @@
>> # Manual, file-by-file: works, but doesn't scale to revisions with lots of
>> modifications
>> $ svn di 
>> file://`pwd`/los178-svn{/vendor//more.c@29,/trunk/X/src/bin/more/more.c@36}
>> | grep -A 4 'Index: more.c'
>> Index: more.c
>> ===
>> --- more.c  (.../vendor/BSD/more/4.3Tahoe/more.c)   (revision 29)
>> +++ more.c  (.../trunk/los178/src/bin/more/more.c)  (revision 36)
>> @@ -1,3 +1,11 @@
>
> I can't explain this one. It might be worth filing an issue about
> this problem in case you can come up with a standalone recipe to
> reproduce it.

I remembered we had a similar discussion (also on the different
behaviour of 'svn diff' vs. 'svnlook diff') on dev@ some years ago.
It's a long thread with lots of info in it. I don't have time to
refocus / summarize this now, so I'm just dropping this link here from
where I think the thread starts to become interesting:

https://svn.haxx.se/dev/archive-2013-06/0621.shtml

It also refers to an older post where I highlighted the difference
between 'svnlook diff' has --diff-copy-from' and 'svn diff
--show-copies-as-adds' (which sounds like the reverse option, so 'svn
diff' sounds like 

Re: Show textual diff in a moved/copied file - how?

2018-02-26 Thread Stefan Sperling
On Mon, Feb 26, 2018 at 12:43:42AM -0800, Alexey Neyman wrote:
> On 02/26/2018 12:18 AM, Stefan Sperling wrote:
> > On Sun, Feb 25, 2018 at 11:38:03PM -0800, Alexey Neyman wrote:
> > > Hi all,
> > > 
> > > I am trying to dig for some changes in a file that was moved a few times 
> > > and
> > > 'svn diff' shows full "remove old location and add new location as if it
> > > were a new file" diffs, which are not helpful. Is there a way to make the
> > > diff show the changes as compared against the origin of the copy? I tried
> > > --notice-ancestry, does not help.
> > Diff output changes depending on whether you pass a path to the
> > file itself or to a parent of the file. Try: svn diff -c 2 barfoo
> > I found this in the diff_renamed_file() test in diff_tests.py,
> > see there for more examples.
> > https://svn.apache.org/repos/asf/subversion/trunk/subversion/tests/cmdline/diff_tests.py
> You don't expect the end-user to read the test cases in the product to get
> these subtleties, do you? :)

No, I don't. But subtle details such as this are often not documented.
In documentation there is always a trade-off between what the system
is actually doing in detail and what the reader really needs to know.

The test cases are an accurate source of reference when it comes to
details of expected behaviour like this because they encode what's
actually intended.

> And, I find it quite counter-intuitive. I would expect --notice-ancestry at
> least to take ancestral relationship between these files into account;

(I don't have time to look at the code right now, so I'm speculating a bit.)
You're diffing *directories*, not files. There are separate client-side
handlers for directory and file diffs which might not always have the same
information available. E.g. it may not be feasible to trace back the
copy history of every child when diffing two directories.

> > Since you know all paths and revisions involved, you could also run:
> >    svn diff ^/foobar@1 ^/barfoo@2
> Well, either of these approaches is not very convenient when there is a
> dozen moves & modifications in a single revision.

Agreed. At least the file diffs allow you to 'zoom in', but it would
be much better if there was a way to get the diff you want to see
with just one command.

> Besides, the former (just passing the path) does not seem to work in all
> cases. In the real repository, I have two revisions that did the same thing:
> moved a directory and modified some files in the moved directory. The trick
> with passing the path to the file works for one of them, but not for the
> other - and I am at a loss why SVN treats these two differently. Here's
> where diff does not display the proper diff even when supplied with the path
> to the file:
> 
> # The relevant fragment of a revision
> $ svn log -c 36 -v file://`pwd`/XX-svn
>    A /trunk/XX/src/bin/more (from /vendor/:29)
>    M /trunk/XX/src/bin/more/more.c
> # Passing the path to the directory that was copied: does not work
> $ svn di -c 36 file://`pwd`/XX-svn/trunk/XX/src/bin/more | grep -A 4
> 'Index: more.c'
> Index: more.c
> ===
> --- more.c  (nonexistent)
> +++ more.c  (revision 36)
> @@ -0,0 +1,1894 @@
> # Passing the path to the specific file: does not work
> $ svn di -c 36 file://`pwd`/XX-svn/trunk/XX/src/bin/more/more.c |
> grep -A 4 'Index: more.c'
> Index: more.c
> ===
> --- more.c  (nonexistent)
> +++ more.c  (revision 36)
> @@ -0,0 +1,1894 @@
> # Manual, file-by-file: works, but doesn't scale to revisions with lots of
> modifications
> $ svn di 
> file://`pwd`/los178-svn{/vendor//more.c@29,/trunk/X/src/bin/more/more.c@36}
> | grep -A 4 'Index: more.c'
> Index: more.c
> ===
> --- more.c  (.../vendor/BSD/more/4.3Tahoe/more.c)   (revision 29)
> +++ more.c  (.../trunk/los178/src/bin/more/more.c)  (revision 36)
> @@ -1,3 +1,11 @@

I can't explain this one. It might be worth filing an issue about
this problem in case you can come up with a standalone recipe to
reproduce it.


Re: Show textual diff in a moved/copied file - how?

2018-02-26 Thread Alexey Neyman

On 02/26/2018 12:18 AM, Stefan Sperling wrote:

> On Sun, Feb 25, 2018 at 11:38:03PM -0800, Alexey Neyman wrote:
>> Hi all,
>>
>> I am trying to dig for some changes in a file that was moved a few times and
>> 'svn diff' shows full "remove old location and add new location as if it
>> were a new file" diffs, which are not helpful. Is there a way to make the
>> diff show the changes as compared against the origin of the copy? I tried
>> --notice-ancestry, does not help.
>
> Diff output changes depending on whether you pass a path to the
> file itself or to a parent of the file. Try: svn diff -c 2 barfoo
> I found this in the diff_renamed_file() test in diff_tests.py,
> see there for more examples.
> https://svn.apache.org/repos/asf/subversion/trunk/subversion/tests/cmdline/diff_tests.py

You don't expect the end-user to read the test cases in the product to 
get these subtleties, do you? :)


And, I find it quite counter-intuitive. I would expect --notice-ancestry 
at least to take ancestral relationship between these files into 
account; the currently shown diff is the same as if 'barfoo' were not 
copied but was created from scratch.


> Since you know all paths and revisions involved, you could also run:
>    svn diff ^/foobar@1 ^/barfoo@2

Well, either of these approaches is not very convenient when there is a 
dozen moves & modifications in a single revision.


Besides, the former (just passing the path) does not seem to work in all 
cases. In the real repository, I have two revisions that did the same 
thing: moved a directory and modified some files in the moved directory. 
The trick with passing the path to the file works for one of them, but 
not for the other - and I am at a loss why SVN treats these two 
differently. Here's where diff does not display the proper diff even 
when supplied with the path to the file:


# The relevant fragment of a revision
$ svn log -c 36 -v file://`pwd`/XX-svn
   A /trunk/XX/src/bin/more (from /vendor/:29)
   M /trunk/XX/src/bin/more/more.c
# Passing the path to the directory that was copied: does not work
$ svn di -c 36 file://`pwd`/XX-svn/trunk/XX/src/bin/more | grep 
-A 4 'Index: more.c'

Index: more.c
===
--- more.c  (nonexistent)
+++ more.c  (revision 36)
@@ -0,0 +1,1894 @@
# Passing the path to the specific file: does not work
$ svn di -c 36 file://`pwd`/XX-svn/trunk/XX/src/bin/more/more.c 
| grep -A 4 'Index: more.c'

Index: more.c
===
--- more.c  (nonexistent)
+++ more.c  (revision 36)
@@ -0,0 +1,1894 @@
# Manual, file-by-file: works, but doesn't scale to revisions with lots 
of modifications
$ svn di 
file://`pwd`/los178-svn{/vendor//more.c@29,/trunk/X/src/bin/more/more.c@36} 
| grep -A 4 'Index: more.c'

Index: more.c
===
--- more.c  (.../vendor/BSD/more/4.3Tahoe/more.c)   (revision 29)
+++ more.c  (.../trunk/los178/src/bin/more/more.c)  (revision 36)
@@ -1,3 +1,11 @@


Regards,
Alexey.




>> I have a vague recollection that 'svn diff' used to show the changes in such
>> copied files before - but I tried the small reproduction script below and it
>> shows the same, both with 1.7.22/1.8.17/1.9.7/trunk:
>>
>> ---8<---
>> #!/bin/bash
>>
>> rm -rf /tmp/foo-{svn,wc}
>> svnadmin create /tmp/foo-svn
>> svn co file:///tmp/foo-svn foo-wc
>> cd foo-wc
>> echo foo > foobar
>> svn add foobar
>> svn ci -m "1"
>> svn mv foobar barfoo
>> echo bar >> barfoo
>> svn ci -m "2"
>> svn up
>> svn diff -c 2
>> svn --version
>> ---8<---
>>
>>
>> Diff output:
>>
>> ---8<---
>> Index: foobar
>> ===
>> --- foobar    (revision 1)
>> +++ foobar    (nonexistent)
>> @@ -1 +0,0 @@
>> -foo
>> Index: barfoo
>> ===
>> --- barfoo    (nonexistent)
>> +++ barfoo    (revision 2)
>> @@ -0,0 +1,2 @@
>> +foo
>> +bar
>> ---8<---
>>
>> Regards,
>> Alexey.





Re: Show textual diff in a moved/copied file - how?

2018-02-26 Thread Stefan Sperling
On Sun, Feb 25, 2018 at 11:38:03PM -0800, Alexey Neyman wrote:
> Hi all,
> 
> I am trying to dig for some changes in a file that was moved a few times and
> 'svn diff' shows full "remove old location and add new location as if it
> were a new file" diffs, which are not helpful. Is there a way to make the
> diff show the changes as compared against the origin of the copy? I tried
> --notice-ancestry, does not help.

Diff output changes depending on whether you pass a path to the
file itself or to a parent of the file. Try: svn diff -c 2 barfoo
I found this in the diff_renamed_file() test in diff_tests.py,
see there for more examples.
https://svn.apache.org/repos/asf/subversion/trunk/subversion/tests/cmdline/diff_tests.py

Since you know all paths and revisions involved, you could also run:
  svn diff ^/foobar@1 ^/barfoo@2

> I have a vague recollection that 'svn diff' used to show the changes in such
> copied files before - but I tried the small reproduction script below and it
> shows the same, both with 1.7.22/1.8.17/1.9.7/trunk:
> 
> ---8<---
> #!/bin/bash
> 
> rm -rf /tmp/foo-{svn,wc}
> svnadmin create /tmp/foo-svn
> svn co file:///tmp/foo-svn foo-wc
> cd foo-wc
> echo foo > foobar
> svn add foobar
> svn ci -m "1"
> svn mv foobar barfoo
> echo bar >> barfoo
> svn ci -m "2"
> svn up
> svn diff -c 2
> svn --version
> ---8<---
> 
> 
> Diff output:
> 
> ---8<---
> Index: foobar
> ===
> --- foobar    (revision 1)
> +++ foobar    (nonexistent)
> @@ -1 +0,0 @@
> -foo
> Index: barfoo
> ===
> --- barfoo    (nonexistent)
> +++ barfoo    (revision 2)
> @@ -0,0 +1,2 @@
> +foo
> +bar
> ---8<---
> 
> Regards,
> Alexey.
>