On Mon, Dec 31, 2012 at 7:07 PM, <rhuij...@apache.org> wrote: > Author: rhuijben > Date: Mon Dec 31 18:07:29 2012 > New Revision: 1427210 > > URL: http://svn.apache.org/viewvc?rev=1427210&view=rev > Log: > Following up on r1427197, apply a similar fix to the suffix scanning. Avoid > updating variables until they are known to be correct to allow breaking out > of the loop. And reset state variables after detecting a common suffix. > > Just like r1427197, the only behavior change should be in resetting the > state variables.
Great! Nice fix (seems issue #4270 is fixed with this revision, together with r1427197). Maybe both of these revisions should be nominated for backport? The bug has the potential to generate a wrong diff, so if it happens with diff3 this could also cause an incorrect result from update or merge. Just a small nit below... > * subversion/libsvn_diff/diff_file.c > (find_identical_suffix): Update loops and loop conditions to allow resetting > state variables. > > Modified: > subversion/trunk/subversion/libsvn_diff/diff_file.c > > Modified: subversion/trunk/subversion/libsvn_diff/diff_file.c > URL: > http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_diff/diff_file.c?rev=1427210&r1=1427209&r2=1427210&view=diff > ============================================================================== > --- subversion/trunk/subversion/libsvn_diff/diff_file.c (original) > +++ subversion/trunk/subversion/libsvn_diff/diff_file.c Mon Dec 31 18:07:29 > 2012 > @@ -628,32 +628,41 @@ find_identical_suffix(apr_off_t *suffix_ > can_read_word = can_read_word > && ( file_for_suffix[i].curp - > sizeof(apr_uintptr_t) > >= min_curp[i]); > - if (can_read_word) > + while (can_read_word) > { > - do > + apr_uintptr_t chunk; > + > + /* For each file curp is positioned at the next byte to read, but > we > + want to examine the bytes before the current location. */ > + > + chunk = *(const apr_uintptr_t *)(file_for_suffix[0].curp + 1 > + - sizeof(apr_uintptr_t)); > + if (contains_eol(chunk)) > + break; > + > + for (i = 1, is_match = TRUE; i < file_len; i++) > + is_match = is_match > + && ( chunk > + == *(const apr_uintptr_t *) > + (file_for_suffix[i].curp + 1 > + - sizeof(apr_uintptr_t))); > + > + if (! is_match) > + break; > + > + for (i = 0; i < file_len; i++) > { > - apr_uintptr_t chunk; > - for (i = 0; i < file_len; i++) > - file_for_suffix[i].curp -= sizeof(apr_uintptr_t); > - > - chunk = *(const apr_uintptr_t *)(file_for_suffix[0].curp + 1); > - if (contains_eol(chunk)) > - break; > - > - for (i = 0, can_read_word = TRUE; i < file_len; i++) > - can_read_word = can_read_word > - && ( file_for_suffix[i].curp - > sizeof(apr_uintptr_t) > - >= min_curp[i]); > - for (i = 1, is_match = TRUE; i < file_len; i++) > - is_match = is_match > - && ( chunk > - == *(const apr_uintptr_t > *)(file_for_suffix[i].curp + 1)); > - } while (can_read_word && is_match); > + file_for_suffix[i].curp -= sizeof(apr_uintptr_t); > + can_read_word = can_read_word > + && ( (file_for_suffix[i].curp > + - sizeof(apr_uintptr_t)) > + >= min_curp[i]); > + } > > - for (i = 0; i < file_len; i++) > - file_for_suffix[i].curp += sizeof(apr_uintptr_t); > + /* We skipped 4 bytes, so there are no closing EOLs */ > + had_nl = FALSE; > + had_cr = FALSE; had_cr isn't actually used in this function yet, so it doesn't have to be (re)set. It's unconditionally set to FALSE a couple of lines later, when it's used in the do-while loop that has the comment "Slide forward until we find an eol sequence ...". (I know ... I should have separated this large function in a couple of smaller ones so the variables could be better scoped, to avoid this confusion ... but neglected to do this) > } > - > #endif > > reached_prefix = file_for_suffix[0].chunk == suffix_min_chunk0 > > -- Johan