Thank you, Melissa, for tracking this down! Your effort is very much appreciated.
Cheers, Stefan On Wed, Mar 07, 2018 at 11:18:43PM +0000, Philip Martin wrote: > [Moving this to the dev@s.a.o list.] > > Well done! It looks like you have identified a serious bug here. The > function svn_fs_fs__get_contents_from_file() that was recently added to > 1.9 for the SHA1 collision detection so the code is new and it is also > different from that on trunk. > > Your proposed fix looks like it might be correct, although rb->rep is > just rep so it could be a bit simpler. One other consequence of this > bug is that the checksum calculated in rep_read_contents is not > finalized. > > Myria <myriac...@gmail.com> writes: > > > During rep_write_contents_close, there is a call to get_shared_rep. > > get_shared_rep calls svn_fs_fs__get_contents_from_file, which has the > > code pasted below. > > > > > > /* Build the representation list (delta chain). */ > > if (rh->type == svn_fs_fs__rep_plain) > > { > > rb->rs_list = apr_array_make(pool, 0, sizeof(rep_state_t *)); > > rb->src_state = rs; > > } > > else if (rh->type == svn_fs_fs__rep_self_delta) > > { > > rb->rs_list = apr_array_make(pool, 1, sizeof(rep_state_t *)); > > APR_ARRAY_PUSH(rb->rs_list, rep_state_t *) = rs; > > rb->src_state = NULL; > > } > > else > > { > > representation_t next_rep = { 0 }; > > > > /* skip "SVNx" diff marker */ > > rs->current = 4; > > > > /* REP's base rep is inside a proper revision. > > * It can be reconstructed in the usual way. */ > > next_rep.revision = rh->base_revision; > > next_rep.item_index = rh->base_item_index; > > next_rep.size = rh->base_length; > > svn_fs_fs__id_txn_reset(&next_rep.txn_id); > > > > SVN_ERR(build_rep_list(&rb->rs_list, &rb->base_window, > > &rb->src_state, &rb->len, rb->fs, &next_rep, > > rb->filehandle_pool)); > > > > > > The bug is occurring because build_rep_list is being called with > > first_rep->expanded_size set to zero. Well, the reason it's zero is > > because first_rep is the second to last parameter to build_rep_list, > > and the above code initialized expanded_size to zero: > > > > representation_t next_rep = { 0 }; > > > > Does the code just need this? I don't know this call >.< > > > > next_rep.expanded_size = rb->rep.expanded_size; > > > > Melissa > > > > On Wed, Mar 7, 2018 at 9:02 AM, Nathan Hartman <hartman.nat...@gmail.com> > > wrote: > >> On Mar 5, 2018, at 10:54 PM, Myria <myriac...@gmail.com> wrote: > >>> > >>> Final email for the night >.< > >>> > >>> What's clobbering the expanded_size is this in build_rep_list: > >>> > >>> /* The value as stored in the data struct. > >>> 0 is either for unknown length or actually zero length. */ > >>> *expanded_size = first_rep->expanded_size; > >>> > >>> first_rep->expanded_size here is zero for the last call to this > >>> function before the error. In every other case before the error, the > >>> two values are equal. > >>> > >>> Then this code executes: > >>> > >>> if (*expanded_size == 0) > >>> if (rep_header->type == svn_fs_fs__rep_plain || first_rep->size != 4) > >>> *expanded_size = first_rep->size; > >>> > >>> first_rep->size is 16384, and this is why rb->len becomes 16384, > >>> leading to the error. > >>> > >>> I don't know what all this code is doing, but that's the proximate > >>> cause of the failure. > >>> > >>> Melissa > >> > >> Has it been possible to determine what is setting expanded_size to 0 > >> before that last call? I wonder if there is specific logic that > >> decides (perhaps incorrectly?) to do that? > >> > >> Alternatively is it being clobbered by some out-of-bounds access, > >> use-after-free, or another such issue? > >> > >> Is it possible in your debugger setup to determine the address of > >> that variable and set a breakpoint that triggers when that memory is > >> written? (It may be called a watchpoint?) > >> > >> Which leads me to another thought: if you can set such a breakpoint > >> / watchpoint and it does not trigger, then this expanded_size might > >> not be the same instance in that final call. Perhaps a shallow copy > >> of an enclosing structure is made which leaves out the known size > >> and sets it to 0 for some reason, and that final call is given that > >> (incomplete) copy. > >> > >> Caveat: I am not familiar with the codebase but these are my > >> thoughts based on adventures in other code bases. > >> > > > > -- > Philip