Re: Issue 2980 in reviewboard: Diff viewer has problems with control characters

reviewboard Fri, 15 Aug 2014 00:06:01 -0700

Comment #3 on issue 2980 by [email protected]: Diff viewer has problemswith control characters

http://code.google.com/p/reviewboard/issues/detail?id=2980

The cause is that splitlines() behaves differently for unicode vs
regular strings. For unicode, it considers form-feed a line breaking
character and splits on that. But for regular strings it does not.
For example:

41 zara:~/> python -c 'print "A\n\f\nB\n".splitlines()'
['A', '\x0c', 'B'] # good
42 zara:~/> python -c 'print u"A\n\f\nB\n".splitlines()'
[u'A', u'', u'', u'B'] # ate \f and added extra empty elem

References:
http://bugs.python.org/issue12855
http://stackoverflow.com/questions/24453713/python-string-splitlines-removes-certain-unicode-control-characters

This means the lines will be off by the number of ^L chars above it in
the diff and won't match the diff.

The diff ends up getting stored in the DB w/o the ^L chars. Sometimes
it won't patch, and sometimes it does but renders incorrectly.

There are a few places where splitlines() is used on the diff. As an
experiment I modified DiffParser, DiffChunkGenerator,
TextBasedReviewUI, and filter_interdiff_opcodes to use a hacked
splitlines() routine that first swaps ^L with NUL, splits, then swaps
back and that makes things work (attached). Not sure that is the best
strategy, but it ends up being a simple change. A similar approach
would be to cons together a random number or UUID along with
"reviewboard" and use that as the swap string, or use some other
non-printable.

It would be nice to see this fixed, but I'm aware that the group of
Lisp-influenced Emacs-lovers who employ ^L in this manner might not be
a growing population. However, if the RB developers think this is
worth persuing I can spend more time on a fix.

Attachments:
0001-diffviewer-handle-L-characters-in-diffs.patch 4.6 KB

You received this message because this project is configured to send allissue notifications to this address.

You may adjust your notification preferences at:
https://code.google.com/hosting/settings

--
You received this message because you are subscribed to the Google Groups 
"reviewboard-issues" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/reviewboard-issues.
For more options, visit https://groups.google.com/d/optout.

Re: Issue 2980 in reviewboard: Diff viewer has problems with control characters

Reply via email to