Re: SHA-1 collision in repository?
Branko Čibej writes:

> On 22.02.2018 21:30, Myria wrote:
>> When we try to commit a very specific version of a very specific
>> binary file, we get a SHA-1 collision error from the Subversion
>> repository:
>>
>> D:\confidential>svn commit secret.bin -m "Testing broken commit"
>> Sending        secret.bin
>> Transmitting file data .svn: E16: Commit failed (details follow):
>> svn: E16: SHA1 of reps '604440 34 134255 136680
>> c9f4fabc4d093612fece03c339401058
>> db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
>> 134255 136680 c9f4fabc4d093612fece03c339401058
>> db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
>> (db11617ef1454332336e00abc311d44bc698f3b3) but contents differ
>>
>>
>> What can cause this?
>
> The simplest explanation would be a corruption of the existing
> representation on disk. Note that both the MD5 and the SHA1 checksums
> appear to match, as do the sizes; which makes it even more likely that
> it's the same file but the copy in the repository is somehow corrupted.

That pattern, all of MD5, SHA1 and size matching, is exactly what
happens if a SHA1 collision is committed using an old version of
Subversion where the rep-cache does not detect collisions. The first
part of the collision would have been committed in r604440 and the
second part in r605556.

If that is the case, and a SHA1 collision did occur, then:

  svnadmin verify -r604440 path/to/repository

will succeed while:

  svnadmin verify -r605556 path/to/repository

will fail with an MD5 checksum error. If this is what you see then
unfortunately the colliding r605556 content has been elided and the
r605556 revision is corrupt.

You should be able to retrieve the first part of the collision from
r604440, it will be one of the files given by:

  svn log -v -r604440

The second part in r605556 is missing :-( but it will be one of the
files given by:

  svn log -v -r605556

However your failing commit would also be a SHA1 collision with the
r604440 content (it might be identical to the missing content in
r605556).

--
Philip
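
To make the "MD5, SHA1 and size all match" observation concrete, here is a small Python sketch that splits the two rep descriptors from the error message into fields and lists which ones agree. The field names (revision, offset, size, expanded_size, md5, sha1, uniquifier) are labels assumed for illustration from the order in which the values appear; they are not taken from this thread, and the script only compares the strings printed by svn, it does not read the repository.

```python
from typing import NamedTuple


class Rep(NamedTuple):
    # Field names are illustrative assumptions, chosen from the value order.
    revision: int
    offset: int
    size: int
    expanded_size: int
    md5: str
    sha1: str
    uniquifier: str


def parse_rep(text: str) -> Rep:
    """Split one quoted rep descriptor from the error message into fields."""
    rev, offset, size, expanded, md5, sha1, uniq = text.split()
    return Rep(int(rev), int(offset), int(size), int(expanded), md5, sha1, uniq)


def matching_fields(a: Rep, b: Rep) -> list:
    """Return the names of the fields that are identical in both descriptors."""
    return [name for name in Rep._fields if getattr(a, name) == getattr(b, name)]


# The two descriptors exactly as quoted in the error message above.
existing = parse_rep(
    "604440 34 134255 136680 c9f4fabc4d093612fece03c339401058 "
    "db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8"
)
incoming = parse_rep(
    "-1 0 134255 136680 c9f4fabc4d093612fece03c339401058 "
    "db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8"
)

same = matching_fields(existing, incoming)
print("identical fields:", ", ".join(same))
```

Only the first two fields differ, which fits both explanations offered in this thread: on-disk corruption of the same content, or a rep-cache entry that points at different content with the same checksums. The svnadmin verify runs above are what tells the two apart.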
Re: SHA-1 collision in repository?
I would get more advice from people here before you invest that time.
I'm a relative amateur and would listen to people with more experience
than myself.

--Matt

On Thu, Feb 22, 2018 at 2:29 PM, Myria wrote:

> That was one document we ran into when searching, yes.
>
> We can do an svnsync, but this will take about a week to run--the
> repository is 43 GB with 600,000 commits. I guess we'll start it now.

--
"Today, vegetables... Tomorrow, the world!"
Re: SHA-1 collision in repository?
That was one document we ran into when searching, yes.

We can do an svnsync, but this will take about a week to run--the
repository is 43 GB with 600,000 commits. I guess we'll start it now.

On Thu, Feb 22, 2018 at 2:04 PM, Matt Simmons wrote:
> Hi Melissa,
>
> That definitely is interesting.
>
> I assume you have read
> http://blogs.collab.net/subversion/subversion-sha1-collision-problem-statement-prevention-remediation-options
>
> If you do an svnsync to another location and attempt the commit there, does
> the problem replicate itself?
>
> --Matt
Re: SHA-1 collision in repository?
On 22.02.2018 21:30, Myria wrote:
> When we try to commit a very specific version of a very specific
> binary file, we get a SHA-1 collision error from the Subversion
> repository:
>
> D:\confidential>svn commit secret.bin -m "Testing broken commit"
> Sending        secret.bin
> Transmitting file data .svn: E16: Commit failed (details follow):
> svn: E16: SHA1 of reps '604440 34 134255 136680
> c9f4fabc4d093612fece03c339401058
> db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
> 134255 136680 c9f4fabc4d093612fece03c339401058
> db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
> (db11617ef1454332336e00abc311d44bc698f3b3) but contents differ
>
>
> What can cause this?

The simplest explanation would be a corruption of the existing
representation on disk. Note that both the MD5 and the SHA1 checksums
appear to match, as do the sizes; which makes it even more likely that
it's the same file but the copy in the repository is somehow corrupted.

-- Brane
Re: SHA-1 collision in repository?
Hi Melissa,

That definitely is interesting.

I assume you have read
http://blogs.collab.net/subversion/subversion-sha1-collision-problem-statement-prevention-remediation-options

If you do an svnsync to another location and attempt the commit there,
does the problem replicate itself?

--Matt

On Thu, Feb 22, 2018 at 12:30 PM, Myria wrote:

> When we try to commit a very specific version of a very specific
> binary file, we get a SHA-1 collision error from the Subversion
> repository:
>
> D:\confidential>svn commit secret.bin -m "Testing broken commit"
> Sending        secret.bin
> Transmitting file data .svn: E16: Commit failed (details follow):
> svn: E16: SHA1 of reps '604440 34 134255 136680
> c9f4fabc4d093612fece03c339401058
> db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
> 134255 136680 c9f4fabc4d093612fece03c339401058
> db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
> (db11617ef1454332336e00abc311d44bc698f3b3) but contents differ
>
>
> What can cause this? This file is a binary pixel shader compiled from
> a build process. It's most certainly not Google's SHA-1 collision PDF
> files. We also scanned the repository to confirm that nobody has
> committed Google's collision files.
>
> Occam's Razor suggests that something is wrong with our repository or
> Subversion itself, rather than this being a true SHA-1 collision. In
> that case, what is wrong with our repository?
>
> If this really is a SHA-1 collision, it would be major cryptography
> news that someone randomly ran into a second collision without even
> trying. In that case, is there a method by which we could recover the
> two files that supposedly have the same SHA-1? The collision doesn't
> appear to be in the file itself, but in some sort of diff or revision
> output?
>
> Thanks,
>
> Melissa
>

--
"Today, vegetables... Tomorrow, the world!"
SHA-1 collision in repository?
When we try to commit a very specific version of a very specific
binary file, we get a SHA-1 collision error from the Subversion
repository:

D:\confidential>svn commit secret.bin -m "Testing broken commit"
Sending        secret.bin
Transmitting file data .svn: E16: Commit failed (details follow):
svn: E16: SHA1 of reps '604440 34 134255 136680
c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' and '-1 0
134255 136680 c9f4fabc4d093612fece03c339401058
db11617ef1454332336e00abc311d44bc698f3b3 605556-czmh/_8' matches
(db11617ef1454332336e00abc311d44bc698f3b3) but contents differ


What can cause this? This file is a binary pixel shader compiled from
a build process. It's most certainly not Google's SHA-1 collision PDF
files. We also scanned the repository to confirm that nobody has
committed Google's collision files.

Occam's Razor suggests that something is wrong with our repository or
Subversion itself, rather than this being a true SHA-1 collision. In
that case, what is wrong with our repository?

If this really is a SHA-1 collision, it would be major cryptography
news that someone randomly ran into a second collision without even
trying. In that case, is there a method by which we could recover the
two files that supposedly have the same SHA-1? The collision doesn't
appear to be in the file itself, but in some sort of diff or revision
output?

Thanks,

Melissa
Re: auto-props syntax in file vs. property
Hi Brane, thanks for the reply.

Then I understand why it's acting the way it is. It would have been
nicer with different separators for the two cases, but it is what it is
and I agree that it works.

The downside is that my initial fix (earlier in this thread) for
svn_apply_autoprops.py isn't correct, since I need to prune the second ;
before it calls propset. I need to try another fix then (unless someone
has fixed that in the repo already).

/Chris

On Thu, 2/22/18, Branko Čibej wrote:

 Subject: Re: auto-props syntax in file vs. property
 To: users@subversion.apache.org
 Date: Thursday, February 22, 2018, 2:10 PM

On 22.02.2018 13:52, Chris wrote:
> Re-awakening my previous thread about the auto-properties. I get really confused by where to use ;; and ; as a separator.
>
> I currently have this in the auto-props on the repo:
> *.txt = svn:mime-type=text/plain;;charset=iso-8859-1;svn:eol-style=LF
>
> And then if I add a text file:
> prompt> touch foo.txt; svn add foo.txt; svn pg svn:mime-type foo.txt
> A         foo.txt
> text/plain;charset=iso-8859-1

More completely: an 'svn proplist -v' would show the following
properties:

    svn:mime-type=text/plain;charset=iso-8859-1
    svn:eol-style=LF

> So the property itself is with just one semicolon in there despite the auto-prop having ;;
> Is this the correct behavior?

Yes of course. In the auto-props configuration, a single semicolon
separates individual properties. If you want a semicolon within a
property value, you have to write ;; in the auto-props configuration to
get the ; in the property value.

If instead you'd had this auto-props configuration:

    *.txt = svn:mime-type=text/plain;charset=iso-8859-1;svn:eol-style=LF

Then, when you added a file to Subversion, you'd get the following
properties set:

    svn:mime-type=text/plain
    charset=iso-8859-1
    svn:eol-style=LF

which is probably not what you want.

> While if I do the same thing manually, i.e.
>
> prompt> touch foo; svn add foo; svn propset svn:mime-type "text/plain;;charset=iso-8859-1" foo; svn pg svn:mime-type foo
> A         foo
> property 'svn:mime-type' set on 'foo'
> text/plain;;charset=iso-8859-1
>
> That is, I'm passing in the exact string that I have in my auto-props into propset for a file without .txt-suffix so I don't get the auto-properties. But as you see, the resulting property that I now have has double semi-colons.

On the command-line you can only set a single property value at a time,
so there's no need to escape the ';' delimiter. But the auto-props
configuration isn't a single property; it's several properties,
delimited with a single ';'.

> My guess is that the former is the intended behavior and I should not be passing in the ";;" into the manual command,

Yes.

> but I'm getting really confused here. It seems very error-prone that manual propset can't use the strings from the config file or auto-props without getting a different result.
>
> Which version is the correct one, or do both actually do the job?

Each does its job in its own context.

-- Brane
Re: auto-props syntax in file vs. property
On 22.02.2018 13:52, Chris wrote:
> Re-awakening my previous thread about the auto-properties. I get really
> confused by where to use ;; and ; as a separator.
>
> I currently have this in the auto-props on the repo:
> *.txt = svn:mime-type=text/plain;;charset=iso-8859-1;svn:eol-style=LF
>
> And then if I add a text file:
> prompt> touch foo.txt; svn add foo.txt; svn pg svn:mime-type foo.txt
> A         foo.txt
> text/plain;charset=iso-8859-1

More completely: an 'svn proplist -v' would show the following
properties:

    svn:mime-type=text/plain;charset=iso-8859-1
    svn:eol-style=LF

> So the property itself is with just one semicolon in there despite the
> auto-prop having ;;
> Is this the correct behavior?

Yes of course. In the auto-props configuration, a single semicolon
separates individual properties. If you want a semicolon within a
property value, you have to write ;; in the auto-props configuration to
get the ; in the property value.

If instead you'd had this auto-props configuration:

    *.txt = svn:mime-type=text/plain;charset=iso-8859-1;svn:eol-style=LF

Then, when you added a file to Subversion, you'd get the following
properties set:

    svn:mime-type=text/plain
    charset=iso-8859-1
    svn:eol-style=LF

which is probably not what you want.

> While if I do the same thing manually, i.e.
>
> prompt> touch foo; svn add foo; svn propset svn:mime-type
> "text/plain;;charset=iso-8859-1" foo; svn pg svn:mime-type foo
> A         foo
> property 'svn:mime-type' set on 'foo'
> text/plain;;charset=iso-8859-1
>
> That is, I'm passing in the exact string that I have in my auto-props into
> propset for a file without .txt-suffix so I don't get the auto-properties.
> But as you see, the resulting property that I now have has double
> semi-colons.

On the command-line you can only set a single property value at a time,
so there's no need to escape the ';' delimiter. But the auto-props
configuration isn't a single property; it's several properties,
delimited with a single ';'.

> My guess is that the former is the intended behavior and I should not be
> passing in the ";;" into the manual command,

Yes.

> but I'm getting really confused here. It seems very error-prone that manual
> propset can't use the strings from the config file or auto-props without
> getting a different result.
>
> Which version is the correct one, or do both actually do the job?

Each does its job in its own context.

-- Brane
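
The difference between the two contexts can be sketched in a few lines of Python. The helper name to_autoprops_value is hypothetical, not part of Subversion or svn_apply_autoprops.py. It takes the property values exactly as you would pass them to 'svn propset' and builds the right-hand side of an auto-props line, doubling every ';' inside a value and joining the properties with a single ';'.

```python
def to_autoprops_value(props: dict) -> str:
    """Build the right-hand side of an auto-props line.

    `props` maps property names to the values as they should end up on
    the file (the same strings you would give to 'svn propset').
    A ';' inside a value is written as ';;'; properties are joined by ';'.
    """
    return ";".join(
        "{}={}".format(name, value.replace(";", ";;"))
        for name, value in props.items()
    )


# The values below are the ones shown by 'svn proplist -v' in this thread.
wanted = {
    "svn:mime-type": "text/plain;charset=iso-8859-1",
    "svn:eol-style": "LF",
}

print("*.txt = " + to_autoprops_value(wanted))
```

Read the other way round: a string copied from the auto-props configuration has to have its ';;' collapsed to ';' before it is handed to propset, which is why the manual command above stored the doubled semicolon literally.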
Re: auto-props syntax in file vs. property
Re-awakening my previous thread about the auto-properties. I get really
confused by where to use ;; and ; as a separator.

I currently have this in the auto-props on the repo:

*.txt = svn:mime-type=text/plain;;charset=iso-8859-1;svn:eol-style=LF

And then if I add a text file:

prompt> touch foo.txt; svn add foo.txt; svn pg svn:mime-type foo.txt
A         foo.txt
text/plain;charset=iso-8859-1

So the property itself is with just one semicolon in there despite the
auto-prop having ;;
Is this the correct behavior?

While if I do the same thing manually, i.e.

prompt> touch foo; svn add foo; svn propset svn:mime-type "text/plain;;charset=iso-8859-1" foo; svn pg svn:mime-type foo
A         foo
property 'svn:mime-type' set on 'foo'
text/plain;;charset=iso-8859-1

That is, I'm passing in the exact string that I have in my auto-props
into propset for a file without .txt-suffix so I don't get the
auto-properties. But as you see, the resulting property that I now have
has double semi-colons.

My guess is that the former is the intended behavior and I should not be
passing in the ";;" into the manual command, but I'm getting really
confused here. It seems very error-prone that manual propset can't use
the strings from the config file or auto-props without getting a
different result.

Which version is the correct one, or do both actually do the job?

BR
Chris

On Wed, 1/10/18, Daniel Shahaf wrote:

 Subject: Re: auto-props syntax in file vs. property
 To: "Chris", users@subversion.apache.org
 Date: Wednesday, January 10, 2018, 8:51 PM

Chris wrote on Wed, 10 Jan 2018 08:26 +:
> I think the fix to svn_apply_autoprops.py should be something like below
> (/subversion/trunk/contrib/client-side/svn_apply_autoprops.py)
> If anyone with commit rights wants to fix it on the repo, feel free to
> use the below, or improve it as necessary (my python knowledge is non-
> existing)
>
> Index: svn_apply_autoprops.py
> ===
> --- svn_apply_autoprops.py (revision 103617)
> +++ svn_apply_autoprops.py (revision 103618)
> @@ -101,7 +101,11 @@
> # leading and trailing whitespce from the propery names and
> # values.
> props_list = []
> - for prop in props.split(';'):
> + # Since ;; is a separator within one property, we need to do
> + # regex and use both negative lookahead and lookbehind to avoid
> + # ever matching a more than one semicolon in the split
> + semicolonpattern = re.compile("(?
> + for prop in re.split(semicolonpattern, props):

That's clever, but it will misparse sequences of three or more
semicolons in a row, such as

    *.foo = key=val;;with;;semicolons;;;anotherkey=anotherval

Daniel
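
Daniel's counter-example can be reproduced with a short Python sketch. Two things in it are assumptions: the regular expression in the quoted patch is truncated in this archive, so the lookaround pattern below is only a guess at what it looked like, and the scanning function assumes that escapes are resolved left to right (a ';;' pair becomes a literal ';', and any remaining single ';' ends the property). Neither function is the code that is in the Subversion repository.

```python
import re

# Assumed reconstruction of the lookaround idea: a ';' that has no ';'
# directly before or after it.
LONE_SEMICOLON = re.compile(r"(?<!;);(?!;)")


def split_with_lookaround(props: str) -> list:
    """Split on isolated ';' only, then collapse ';;' to ';'."""
    return [part.replace(";;", ";") for part in LONE_SEMICOLON.split(props)]


def split_by_scanning(props: str) -> list:
    """Walk the string left to right: ';;' is a literal ';', ';' separates."""
    parts, current, i = [], [], 0
    while i < len(props):
        if props.startswith(";;", i):
            current.append(";")  # escaped semicolon stays in the value
            i += 2
        elif props[i] == ";":
            parts.append("".join(current))  # single semicolon ends a property
            current = []
            i += 1
        else:
            current.append(props[i])
            i += 1
    parts.append("".join(current))
    return parts


example = "key=val;;with;;semicolons;;;anotherkey=anotherval"
print(split_with_lookaround(example))
print(split_by_scanning(example))
```

In a run of three semicolons every ';' has a neighbouring ';', so the lookaround pattern finds no separator and the two properties stay glued together. The scanner consumes the first two as an escaped ';' and treats the third as the separator, giving one value that ends in ';' followed by the second property.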