On 1/4/14, 7:20 AM, James McCoy wrote:
> Given the repository svn://scm.gforge.inria.fr/svn/mpfr/misc/vl-tests:
>
> $ svn co $repo
> A vl-tests2/mpfrtests.data
> A vl-tests2/mpfrtests.sh
> A vl-tests2/release-3.1.2-p4
> A vl-tests2/release-3.1.0-p8
> A vl-tests2/vfy-data
> A vl-tests2/ReadMe
> Checked out revision 8727.
> $ cd vl-tests
> $ svn diff -r 8276 ReadMe
> Index: ReadMe
> ===================================================================
> Cannot display: file marked as a binary type.
> svn: E000022: Valid UTF-8 data
> (hex: 73 76 6e 3a 6d 69 6d 65 2d 74 79 70 65 20 3d 20 28 48 67)
> followed by invalid UTF-8 sequence
> (hex: fb 20 0a 7f)
>
>
> The actual values reported vary among runs of "svn diff", sometimes
> writing garbage characters to the terminal. It appears to only occur
> with files that have the svn:mime-type property set.
>
> This works correctly with 1.7.13 and 1.8.5.

I've reproduced this on OS X, so it's not Debian-specific.

This is a client-side implementation bug; there is no corruption in the
repository. The client is incorrectly calculating the mime-type (it ends
up being garbage). Since the garbage mime-type doesn't match a text type,
the file is treated as binary, which triggers an error. The error message
tries to print the mime-type, which, being garbage, isn't valid UTF-8
data; hence all the hex output.

Users can partially work around this in 1.7.14 by using --force (which
bypasses the mime-type error). However, it will show a spurious property
difference that isn't really there (which seems to be a different bug).

What's really going on here, for anyone who cares about the details, is
that the major rework of the diff code we added for issues #4153 and
#4421 incorrectly assumes that the value of the property in the property
hash is a C string, when in fact it is an svn_string_t.

I should have a fix here soon.