When review board presents the differences between the baseline and the 
changed code, it considers that blank lines in the old code match blank 
lines in the new code. Therefore, it obscures what really changed because 
it will show the deletion of a large block of code as dozens of small 
chunks (blank lines in the old code sync with blank lines in the new code). 
Within the small chunks, if any individual lines happen to match up, they 
show as code that was moved. Worse, each small chunk is presented as a 
modified line followed by deleted lines. It does this for all changes, 
morphing large changes that contain blank lines into multiple small changes 
separated by blank lines.

Recently this became glaringly obvious (and can easily be recreated) with 
the following example: 
1) There were three functions in a single file that implemented 
approximately the same operation. When the operation needed enhancement, it 
needed to be enhanced in 3 places.
2) I wrote a single function and parameterized it.For lexical reasons the 
function needed to appear in front of its usage, and so was placed in the 
beginning of the file.
3) Each of the three existing functions had their ~50 lines of code deleted 
and replaced with a single line that called the common function, passing it 
parameters to customize the operation.

When the svn diff file was loaded into review board, the resulting diff 
shows dozens of changes: line modifications, deletions, code movements. 
When I saw the diff I first thought the change had gotten messed up because 
I could not recognize what had changed in the code I had just written! This 
occurred because every time RB sees a blank line it thinks the comparison 
of the two files is back in sync, so it tries to present the previous 3-10 
line chunk of contiguous code as [line modifications, deletions, and 
movements]. 

If it were possible to tell code review to NOT CONSIDER BLANK LINES 
RELEVANT WHEN COMPARING OLD CODE TO NEW CODE, then it would show the diff 
for this change as a large insertion (the new, parameterized function), 
followed by three sections of code, each with a single line that was 
modified and ~50 lines deleted.

Other difference engines use this approach. For example, comparing these 
same two (old,new) files, WinMerge shows 4 code sections changed: one 
insertion followed by 3 deletions.

Since the primary purpose of doing code reviews is to have someone else 
look at your code changes, in the hopes that they will discover a latent 
bug that you have missed, presentation of the code changes is of paramount 
importance. The algorithm ReviewBoard uses obfuscates the changes because 
of it considers blank lines relevant when identifying code chunk 
boundaries. If the blank lines were ignored, then changes would be 
identified at the largest grouping.

The most recent ReviewBoard fixed this problem for identifying chunks of 
code that have moved. Please create an example like I described and see 
what a mess the difference looks like. I think it would greatly improve 
Review Board if the algorithm were changed to ignore blank lines (or at 
least allow it as an option).

Thanks, Don

-- 
Supercharge your Review Board with Power Pack: 
https://www.reviewboard.org/powerpack/
Want us to host Review Board for you? Check out RBCommons: 
https://rbcommons.com/
Happy user? Let us know! https://www.reviewboard.org/users/
--- 
You received this message because you are subscribed to the Google Groups 
"reviewboard" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to