Comment #3 on issue 2980 by bart...@gmail.com: Diff viewer has problems with control characters
http://code.google.com/p/reviewboard/issues/detail?id=2980

The cause is that splitlines() behaves differently for unicode vs
regular strings.  For unicode, it considers form-feed a line breaking
character and splits on that.  But for regular strings it does not.
For example:

41 zara:~/> python -c 'print "A\n\f\nB\n".splitlines()'
['A', '\x0c', 'B']     # good
42 zara:~/> python -c 'print u"A\n\f\nB\n".splitlines()'
[u'A', u'', u'', u'B'] # ate \f and added extra empty elem

References:
http://bugs.python.org/issue12855
http://stackoverflow.com/questions/24453713/python-string-splitlines-removes-certain-unicode-control-characters

This means the lines will be off by the number of ^L chars above it in
the diff and won't match the diff.

The diff ends up getting stored in the DB w/o the ^L chars.  Sometimes
it won't patch, and sometimes it does but renders incorrectly.

There are a few places where splitlines() is used on the diff.  As an
experiment I modified DiffParser, DiffChunkGenerator,
TextBasedReviewUI, and filter_interdiff_opcodes to use a hacked
splitlines() routine that first swaps ^L with NUL, splits, then swaps
back and that makes things work (attached).  Not sure that is the best
strategy, but it ends up being a simple change.  A similar approach
would be to cons together a random number or UUID along with
"reviewboard" and use that as the swap string, or use some other
non-printable.

It would be nice to see this fixed, but I'm aware that the group of
Lisp-influenced Emacs-lovers who employ ^L in this manner might not be
a growing population.  However, if the RB developers think this is
worth persuing I can spend more time on a fix.


Attachments:
        0001-diffviewer-handle-L-characters-in-diffs.patch  4.6 KB

--
You received this message because this project is configured to send all issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings

--
You received this message because you are subscribed to the Google Groups 
"reviewboard-issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to reviewboard-issues+unsubscr...@googlegroups.com.
To post to this group, send email to reviewboard-issues@googlegroups.com.
Visit this group at http://groups.google.com/group/reviewboard-issues.
For more options, visit https://groups.google.com/d/optout.

Reply via email to