Comment #3 on issue 2980 by Diff viewer has problems with control characters

The cause is that splitlines() behaves differently for unicode vs
regular strings.  For unicode, it considers form-feed a line breaking
character and splits on that.  But for regular strings it does not.
For example:

41 zara:~/> python -c 'print "A\n\f\nB\n".splitlines()'
['A', '\x0c', 'B']     # good
42 zara:~/> python -c 'print u"A\n\f\nB\n".splitlines()'
[u'A', u'', u'', u'B'] # ate \f and added extra empty elem


This means the lines will be off by the number of ^L chars above it in
the diff and won't match the diff.

The diff ends up getting stored in the DB w/o the ^L chars.  Sometimes
it won't patch, and sometimes it does but renders incorrectly.

There are a few places where splitlines() is used on the diff.  As an
experiment I modified DiffParser, DiffChunkGenerator,
TextBasedReviewUI, and filter_interdiff_opcodes to use a hacked
splitlines() routine that first swaps ^L with NUL, splits, then swaps
back and that makes things work (attached).  Not sure that is the best
strategy, but it ends up being a simple change.  A similar approach
would be to cons together a random number or UUID along with
"reviewboard" and use that as the swap string, or use some other

It would be nice to see this fixed, but I'm aware that the group of
Lisp-influenced Emacs-lovers who employ ^L in this manner might not be
a growing population.  However, if the RB developers think this is
worth persuing I can spend more time on a fix.

        0001-diffviewer-handle-L-characters-in-diffs.patch  4.6 KB

You received this message because this project is configured to send all issue notifications to this address.
You may adjust your notification preferences at:

You received this message because you are subscribed to the Google Groups 
"reviewboard-issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
To post to this group, send email to
Visit this group at
For more options, visit

Reply via email to