Re: SHA-1 collision in repository?

2018-03-05 Thread Myria
Final email for the night >.<

What's clobbering the expanded_size is this in build_rep_list:

  /* The value as stored in the data struct.
     0 is either for unknown length or actually zero length. */
  *expanded_size = first_rep->expanded_size;

first_rep->expanded_size here is zero for the last call to this
function before the error.  In every other case before the error, the
two values are equal.

Then this code executes:

  if (*expanded_size == 0)
    if (rep_header->type == svn_fs_fs__rep_plain || first_rep->size != 4)
      *expanded_size = first_rep->size;

first_rep->size is 16384, and this is why rb->len becomes 16384,
leading to the error.

I don't know what all this code is doing, but that's the proximate
cause of the failure.
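
To make the effect of those two statements concrete, here is a minimal C sketch of the same logic outside of Subversion. The names `rep_t`, `rep_type_t`, and `resolve_expanded_size` are hypothetical stand-ins, not the real FSFS definitions, and the sketch only models the two fields mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the FSFS types named above; the real
   structs carry many more fields. */
typedef enum { REP_PLAIN, REP_DELTA } rep_type_t;

typedef struct {
  uint64_t size;           /* size as stored on disk */
  uint64_t expanded_size;  /* fulltext size; 0 = unknown or really empty */
} rep_t;

/* Mirrors the two quoted statements: copy the stored value, then fall
   back to the on-disk size when the stored value is 0. */
static void
resolve_expanded_size(uint64_t *expanded_size,
                      const rep_t *first_rep,
                      rep_type_t type)
{
  assert(expanded_size != NULL && first_rep != NULL);

  *expanded_size = first_rep->expanded_size;

  if (*expanded_size == 0)
    if (type == REP_PLAIN || first_rep->size != 4)
      *expanded_size = first_rep->size;
}
```

If the caller passes the address of a length that was previously 57465, and the rep has expanded_size 0 and size 16384, the length comes back as 16384 for either rep type. That matches what the debugger shows for rb->len.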

Melissa


On Mon, Mar 5, 2018 at 7:41 PM, Myria  wrote:
> When Subversion gets to this part of rep_read_contents, rb->len is
> 16384.  It thinks it is then done reading the entire file, and can
> compare the checksum, but it's not done with the file yet.
>
> rb->rep.expanded_size is correct at the error point, 57465.
> rep_read_get_baton sets rb->len to rb->rep.expanded_size, so I don't
> know why the value changed by the time rep_read_contents got its paws
> on the baton.  I saw that rb->len might be getting clobbered by
> rep_read_contents's call to build_rep_list, which has the following
> line of code:
>
> *expanded_size = first_rep->expanded_size;
>
> expanded_size is &rb->len.  I haven't had a chance to debug this area
> yet, so it might be fine.
>
> I verified with sqlite3 that the rep-cache.db has the correct size (57465):
>
> $ sqlite3 /mnt/d/svnclone/db/rep-cache.db "select * from rep_cache
> where hash='e6291ab119036eb783d0136afccdb3b445867364'"
> e6291ab119036eb783d0136afccdb3b445867364|227170|153|193|57465
>
>
> On Mon, Mar 5, 2018 at 6:56 PM, Myria  wrote:
>> GMail keeps doing reply instead of reply all.  I'm having to manually
>> add the users list back now.
>>
>> Below is the thread I sent.
>>
>>
>> -- Forwarded message --
>> From: Myria 
>> Date: Mon, Mar 5, 2018 at 6:37 PM
>> Subject: Re: SHA-1 collision in repository?
>> To: Philip Martin 
>>
>>
>> I now know where the checksum error happens, but not why.
>>
>> svn: E200014: Checksum mismatch while reading representation:
>>    expected:  bb52be764a04d511ebb06e1889910dcf
>>      actual:  80a10d37de91cadc604ba30e379651b3
>>
>> It's calculating the MD5 of only the first 16 KB of the input file and
>> comparing against the MD5 of the entire file.  The 16 KB number seems
>> to be SVN__STREAM_CHUNK_SIZE.
>>
>> bb52be764a04d511ebb06e1889910dcf is the MD5 of the entire file.
>> 80a10d37de91cadc604ba30e379651b3 is the MD5 of the first 16384 bytes.
>>
>>
>> On Mon, Mar 5, 2018 at 5:23 PM, Myria  wrote:
>>> I managed to compile a subversion command line client with debugging
>>> information and optimizations disabled, and can reproduce the problem
>>> with GDB attached.
>>>
>>> Here is a backtrace at the time at which the error occurs.  A few line
>>> numbers in stream.c will be wrong by a few lines due to a few printf's
>>> I added.
>>>
>>> #0  svn_checksum_mismatch_err (expected=0x7ffdcf00,
>>> actual=0x7a0700a0, scratch_pool=0x7a070028,
>>> fmt=0x7c259ac0 "Checksum mismatch while reading
>>> representation") at subversion/libsvn_subr/checksum.c:638
>>> #1  0x7c2123de in rep_read_contents (baton=0x7a1f6190,
>>> buf=0x7a1f66a8 "// "..., len=0x7ffdcf88)
>>> at subversion/libsvn_fs_fs/cached_data.c:2062
>>> #2  0x7e5645fd in svn_stream_read_full (stream=0x7a1f6470,
>>> buffer=0x7a1f66a8 "// "..., len=0x7ffdcf88)
>>> at subversion/libsvn_subr/stream.c:193
>>> #3  0x7e5653f3 in svn_stream_contents_same2
>>> (same=0x7ffdd01c, stream1=0x7a1f6470,
>>> stream2=0x7a1f6650, pool=0x7a1e0028) at
>>> subversion/libsvn_subr/stream.c:589
>>> #4  0x7c247226 in get_shared_rep (old_rep=0x7ffdd188,
>>> fs=0x7f601030, rep=0x7a0e20b8,
>>> file=0x7a1e0390, offset=0, reps_hash=0x0,
>>> result_pool=0x7f5e0028, scratch_pool=0x7a1e0028)
>>> at subversion/libsvn_fs_fs/transaction.c:2280
>>> #5  0x7c247734 in rep_write_contents_close
>>> (baton=0x7a232ff0) at subversion/libsvn_fs_fs/transaction.c:2370
>>> #6  0x7e56492b in svn_stream_close (stream=0x7a233140) at
>>> subversion/libsvn_subr/stream.c:274
>>> #7  0x7e841001 in apply_window (window=0x0,
>>> baton=0x7a1000a0) at subversion/libsvn_delta/text_delta.c:732
>>> #8  0x7c2520d2 in window_consumer (window=0x0,
>>> baton=0x7f5f1ab8) at subversion/libsvn_fs_fs/tree.c:2935
>>> #9  0x7e8405ef in svn_txdelta_run (source=0x7f5f1a18,
>>> target=0x7f5f1298,
>>> handler=0x7c25209f ,
>>> handler_baton=0x7f5f1ab8, checksum_kind=svn_checksum_md5,
>>> 

Re: SHA-1 collision in repository?

2018-03-05 Thread Myria
When Subversion gets to this part of rep_read_contents, rb->len is
16384.  It thinks it is then done reading the entire file, and can
compare the checksum, but it's not done with the file yet.

rb->rep.expanded_size is correct at the error point, 57465.
rep_read_get_baton sets rb->len to rb->rep.expanded_size, so I don't
know why the value changed by the time rep_read_contents got its paws
on the baton.  I saw that rb->len might be getting clobbered by
rep_read_contents's call to build_rep_list, which has the following
line of code:

*expanded_size = first_rep->expanded_size;

expanded_size is &rb->len.  I haven't had a chance to debug this area
yet, so it might be fine.

I verified with sqlite3 that the rep-cache.db has the correct size (57465):

$ sqlite3 /mnt/d/svnclone/db/rep-cache.db "select * from rep_cache
where hash='e6291ab119036eb783d0136afccdb3b445867364'"
e6291ab119036eb783d0136afccdb3b445867364|227170|153|193|57465
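
For anyone who wants to check such a row programmatically, here is a small C sketch that splits the pipe-delimited output into fields. It assumes the columns come back in the order hash, revision, offset, size, expanded_size, which is my understanding of the rep_cache table but is not confirmed by the output itself. `rep_cache_row_t` and `parse_rep_cache_row` are made-up names for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed column order: hash | revision | offset | size | expanded_size. */
typedef struct {
  char hash[41];            /* 40 hex digits of SHA-1 plus NUL */
  long long revision;
  long long offset;
  long long size;
  long long expanded_size;
} rep_cache_row_t;

/* Returns 1 if LINE holds a 40-digit hex hash followed by four integers,
   all separated by '|'; returns 0 otherwise. */
static int
parse_rep_cache_row(const char *line, rep_cache_row_t *row)
{
  int n;

  assert(line != NULL && row != NULL);

  n = sscanf(line, "%40[0-9a-f]|%lld|%lld|%lld|%lld",
             row->hash, &row->revision, &row->offset,
             &row->size, &row->expanded_size);

  return n == 5 && strlen(row->hash) == 40;
}
```

Under that column assumption, the row above parses to revision 227170, offset 153, size 193, and expanded_size 57465.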


On Mon, Mar 5, 2018 at 6:56 PM, Myria  wrote:
> GMail keeps doing reply instead of reply all.  I'm having to manually
> add the users list back now.
>
> Below is the thread I sent.
>
>
> -- Forwarded message --
> From: Myria 
> Date: Mon, Mar 5, 2018 at 6:37 PM
> Subject: Re: SHA-1 collision in repository?
> To: Philip Martin 
>
>
> I now know where the checksum error happens, but not why.
>
> svn: E200014: Checksum mismatch while reading representation:
>    expected:  bb52be764a04d511ebb06e1889910dcf
>      actual:  80a10d37de91cadc604ba30e379651b3
>
> It's calculating the MD5 of only the first 16 KB of the input file and
> comparing against the MD5 of the entire file.  The 16 KB number seems
> to be SVN__STREAM_CHUNK_SIZE.
>
> bb52be764a04d511ebb06e1889910dcf is the MD5 of the entire file.
> 80a10d37de91cadc604ba30e379651b3 is the MD5 of the first 16384 bytes.
>
>
> On Mon, Mar 5, 2018 at 5:23 PM, Myria  wrote:
>> I managed to compile a subversion command line client with debugging
>> information and optimizations disabled, and can reproduce the problem
>> with GDB attached.
>>
>> Here is a backtrace at the time at which the error occurs.  A few line
>> numbers in stream.c will be wrong by a few lines due to a few printf's
>> I added.
>>
>> #0  svn_checksum_mismatch_err (expected=0x7ffdcf00,
>> actual=0x7a0700a0, scratch_pool=0x7a070028,
>> fmt=0x7c259ac0 "Checksum mismatch while reading
>> representation") at subversion/libsvn_subr/checksum.c:638
>> #1  0x7c2123de in rep_read_contents (baton=0x7a1f6190,
>> buf=0x7a1f66a8 "// "..., len=0x7ffdcf88)
>> at subversion/libsvn_fs_fs/cached_data.c:2062
>> #2  0x7e5645fd in svn_stream_read_full (stream=0x7a1f6470,
>> buffer=0x7a1f66a8 "// "..., len=0x7ffdcf88)
>> at subversion/libsvn_subr/stream.c:193
>> #3  0x7e5653f3 in svn_stream_contents_same2
>> (same=0x7ffdd01c, stream1=0x7a1f6470,
>> stream2=0x7a1f6650, pool=0x7a1e0028) at
>> subversion/libsvn_subr/stream.c:589
>> #4  0x7c247226 in get_shared_rep (old_rep=0x7ffdd188,
>> fs=0x7f601030, rep=0x7a0e20b8,
>> file=0x7a1e0390, offset=0, reps_hash=0x0,
>> result_pool=0x7f5e0028, scratch_pool=0x7a1e0028)
>> at subversion/libsvn_fs_fs/transaction.c:2280
>> #5  0x7c247734 in rep_write_contents_close
>> (baton=0x7a232ff0) at subversion/libsvn_fs_fs/transaction.c:2370
>> #6  0x7e56492b in svn_stream_close (stream=0x7a233140) at
>> subversion/libsvn_subr/stream.c:274
>> #7  0x7e841001 in apply_window (window=0x0,
>> baton=0x7a1000a0) at subversion/libsvn_delta/text_delta.c:732
>> #8  0x7c2520d2 in window_consumer (window=0x0,
>> baton=0x7f5f1ab8) at subversion/libsvn_fs_fs/tree.c:2935
>> #9  0x7e8405ef in svn_txdelta_run (source=0x7f5f1a18,
>> target=0x7f5f1298,
>> handler=0x7c25209f ,
>> handler_baton=0x7f5f1ab8, checksum_kind=svn_checksum_md5,
>> checksum=0x7ffdd458, cancel_func=0x0, cancel_baton=0x0,
>> result_pool=0x7f5e0028,
>> scratch_pool=0x7f5e0028) at subversion/libsvn_delta/text_delta.c:454
>> #10 0x7ee98a57 in svn_wc__internal_transmit_text_deltas 
>> (tempfile=0x0,
>> new_text_base_md5_checksum=0x7ffdd5b0,
>> new_text_base_sha1_checksum=0x7ffdd5b8, db=0x7f6c17d8,
>> local_abspath=0x7f672d08
>> "/mnt/d/svntest/repository/directory/Redacted.cpp",
>> fulltext=0, editor=0x7f673700, file_baton=0x7f510110,
>> result_pool=0x7f6c0028,
>> scratch_pool=0x7f5e0028) at subversion/libsvn_wc/adm_crawler.c:1109
>> #11 0x7ee98d68 in svn_wc_transmit_text_deltas3
>> (new_text_base_md5_checksum=0x7ffdd5b0,
>> new_text_base_sha1_checksum=0x7ffdd5b8, wc_ctx=0x7f6c17c0,
>> local_abspath=0x7f672d08
>> "/mnt/d/svntest/repository/directory/Redacted.cpp",
>> fulltext=0, 

Fwd: SHA-1 collision in repository?

2018-03-05 Thread Myria
GMail keeps doing reply instead of reply all.  I'm having to manually
add the users list back now.

Below is the thread I sent.


-- Forwarded message --
From: Myria 
Date: Mon, Mar 5, 2018 at 6:37 PM
Subject: Re: SHA-1 collision in repository?
To: Philip Martin 


I now know where the checksum error happens, but not why.

svn: E200014: Checksum mismatch while reading representation:
   expected:  bb52be764a04d511ebb06e1889910dcf
     actual:  80a10d37de91cadc604ba30e379651b3

It's calculating the MD5 of only the first 16 KB of the input file and
comparing against the MD5 of the entire file.  The 16 KB number seems
to be SVN__STREAM_CHUNK_SIZE.

bb52be764a04d511ebb06e1889910dcf is the MD5 of the entire file.
80a10d37de91cadc604ba30e379651b3 is the MD5 of the first 16384 bytes.
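
The symptom can be modeled with a short C sketch: a reader that digests data chunk by chunk and treats "offset reached len" as end of data. This is not Subversion's code. `digest_until_len` is a hypothetical function, FNV-1a stands in for MD5 to keep the example small, and `CHUNK_SIZE` is assumed to match SVN__STREAM_CHUNK_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 16384u        /* assumed equal to SVN__STREAM_CHUNK_SIZE */
#define FNV1A_INIT 2166136261u   /* FNV-1a 32-bit offset basis */

/* FNV-1a, used only as a small stand-in for MD5. */
static uint32_t
fnv1a_update(uint32_t h, const unsigned char *p, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      h ^= p[i];
      h *= 16777619u;
    }
  return h;
}

/* Digest DATA in chunks until the offset reaches LEN (the size the
   reader believes the data has) or ACTUAL_SIZE, whichever comes first.
   Stores the number of bytes digested in *CONSUMED. */
static uint32_t
digest_until_len(const unsigned char *data, size_t actual_size,
                 size_t len, size_t *consumed)
{
  uint32_t h = FNV1A_INIT;
  size_t off = 0;

  assert(data != NULL && consumed != NULL);

  while (off < len && off < actual_size)
    {
      size_t n = CHUNK_SIZE;
      if (n > len - off)
        n = len - off;
      if (n > actual_size - off)
        n = actual_size - off;
      h = fnv1a_update(h, data + off, n);
      off += n;
    }

  *consumed = off;
  return h;
}
```

With a 57465-byte buffer and len wrongly set to 16384, the loop stops after one chunk, so the result is the digest of the first 16384 bytes only. Comparing that against a digest of the whole file fails in the same way as the error above.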


On Mon, Mar 5, 2018 at 5:23 PM, Myria  wrote:
> I managed to compile a subversion command line client with debugging
> information and optimizations disabled, and can reproduce the problem
> with GDB attached.
>
> Here is a backtrace at the time at which the error occurs.  A few line
> numbers in stream.c will be wrong by a few lines due to a few printf's
> I added.
>
> #0  svn_checksum_mismatch_err (expected=0x7ffdcf00,
> actual=0x7a0700a0, scratch_pool=0x7a070028,
> fmt=0x7c259ac0 "Checksum mismatch while reading
> representation") at subversion/libsvn_subr/checksum.c:638
> #1  0x7c2123de in rep_read_contents (baton=0x7a1f6190,
> buf=0x7a1f66a8 "// "..., len=0x7ffdcf88)
> at subversion/libsvn_fs_fs/cached_data.c:2062
> #2  0x7e5645fd in svn_stream_read_full (stream=0x7a1f6470,
> buffer=0x7a1f66a8 "// "..., len=0x7ffdcf88)
> at subversion/libsvn_subr/stream.c:193
> #3  0x7e5653f3 in svn_stream_contents_same2
> (same=0x7ffdd01c, stream1=0x7a1f6470,
> stream2=0x7a1f6650, pool=0x7a1e0028) at
> subversion/libsvn_subr/stream.c:589
> #4  0x7c247226 in get_shared_rep (old_rep=0x7ffdd188,
> fs=0x7f601030, rep=0x7a0e20b8,
> file=0x7a1e0390, offset=0, reps_hash=0x0,
> result_pool=0x7f5e0028, scratch_pool=0x7a1e0028)
> at subversion/libsvn_fs_fs/transaction.c:2280
> #5  0x7c247734 in rep_write_contents_close
> (baton=0x7a232ff0) at subversion/libsvn_fs_fs/transaction.c:2370
> #6  0x7e56492b in svn_stream_close (stream=0x7a233140) at
> subversion/libsvn_subr/stream.c:274
> #7  0x7e841001 in apply_window (window=0x0,
> baton=0x7a1000a0) at subversion/libsvn_delta/text_delta.c:732
> #8  0x7c2520d2 in window_consumer (window=0x0,
> baton=0x7f5f1ab8) at subversion/libsvn_fs_fs/tree.c:2935
> #9  0x7e8405ef in svn_txdelta_run (source=0x7f5f1a18,
> target=0x7f5f1298,
> handler=0x7c25209f ,
> handler_baton=0x7f5f1ab8, checksum_kind=svn_checksum_md5,
> checksum=0x7ffdd458, cancel_func=0x0, cancel_baton=0x0,
> result_pool=0x7f5e0028,
> scratch_pool=0x7f5e0028) at subversion/libsvn_delta/text_delta.c:454
> #10 0x7ee98a57 in svn_wc__internal_transmit_text_deltas (tempfile=0x0,
> new_text_base_md5_checksum=0x7ffdd5b0,
> new_text_base_sha1_checksum=0x7ffdd5b8, db=0x7f6c17d8,
> local_abspath=0x7f672d08
> "/mnt/d/svntest/repository/directory/Redacted.cpp",
> fulltext=0, editor=0x7f673700, file_baton=0x7f510110,
> result_pool=0x7f6c0028,
> scratch_pool=0x7f5e0028) at subversion/libsvn_wc/adm_crawler.c:1109
> #11 0x7ee98d68 in svn_wc_transmit_text_deltas3
> (new_text_base_md5_checksum=0x7ffdd5b0,
> new_text_base_sha1_checksum=0x7ffdd5b8, wc_ctx=0x7f6c17c0,
> local_abspath=0x7f672d08
> "/mnt/d/svntest/repository/directory/Redacted.cpp",
> fulltext=0, editor=0x7f673700, file_baton=0x7f510110,
> result_pool=0x7f6c0028,
> scratch_pool=0x7f5e0028) at subversion/libsvn_wc/adm_crawler.c:1199
> #12 0x7f18eb12 in svn_client__do_commit (
> base_url=0x7f6142c0 "file:///mnt/d/svntest/repository/directory",
> commit_items=0x7f672c48, editor=0x7f673700,
> edit_baton=0x7f6300a0,
> notify_path_prefix=0x7f672900 "/mnt/d/svntest/repository",
> sha1_checksums=0x7ffdd750,
> ctx=0x7f6c16f0, result_pool=0x7f6c0028, 
> scratch_pool=0x7f650028)
> at subversion/libsvn_client/commit_util.c:1920
> #13 0x7f18a5f9 in svn_client_commit6 (targets=0x7f670a18,
> depth=svn_depth_infinity, keep_locks=0,
> keep_changelists=0, commit_as_operations=1,
> include_file_externals=0, include_dir_externals=0,
> changelists=0x7f6c0780, revprop_table=0x0,
> commit_callback=0x42c6a0 ,
> commit_baton=0x0, ctx=0x7f6c16f0, pool=0x7f6c0028) at
> subversion/libsvn_client/commit.c:901
> #14 0x0040b744 in svn_cl__commit (os=0x7f6c0520,
> baton=0x7ffddc60,