#2905: UnicodeDecodeError
---------------------------------------------+------------------------------
Reporter: anonymous | Owner: cmlenz
Type: defect | Status: reopened
Priority: high | Milestone: 0.10
Component: general | Version: devel
Severity: normal | Resolution:
Keywords: UnicodeDecodeError unicode utf8 |
---------------------------------------------+------------------------------
Changes (by cboos):
* priority: normal => high
Comment:
Please try out the following patch; it works for me even without r3084:
{{{
Index: changeset.py
===================================================================
--- changeset.py (revision 3090)
+++ changeset.py (working copy)
@@ -420,16 +420,21 @@
The list is empty when no differences between comparable files
are detected, but the return value is None for non-comparable files.
"""
- data = old_node.get_content().read()
- if is_binary(data):
+ old_content = old_node.get_content().read()
+ if is_binary(old_content):
return None
- old_content = mimeview.to_utf8(data, old_node.content_type)
- data = new_node.get_content().read()
- if is_binary(data):
+ new_content = new_node.get_content().read()
+ if is_binary(new_content):
return None
- new_content = mimeview.to_utf8(data, new_node.content_type)
+ old_cset = mimeview.get_charset(old_content, old_node.content_type)
+ new_cset = mimeview.get_charset(new_content, new_node.content_type)
+ # character sets should be 'iso-8859-15' at the very least
+ assert old_cset and new_cset
+ old_content = unicode(old_content, old_cset)
+ new_content = unicode(new_content, new_cset)
+
if old_content != new_content:
context = 3
options = diff_options[1]
}}}
I'll follow up with a more complete patch, which will also take care
of the textual diff view.
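
For anyone who wants to see the idea outside of Trac: the patch decodes both
revisions to unicode with their own detected charset and only then compares
them. A rough standalone sketch of that approach follows. It is not Trac code:
`decode_content` and `contents_differ` are hypothetical helper names, it is
written for Python 3 (`bytes.decode` rather than `unicode()`), and the
`iso-8859-15` fallback is an assumption that simply mirrors the comment in
the patch.

```python
def decode_content(data, charset, fallback="iso-8859-15"):
    """Decode raw bytes to text, trying `charset` first, then `fallback`.

    Hypothetical helper, not part of Trac's mimeview API.
    """
    for candidate in (charset, fallback):
        if not candidate:
            continue
        try:
            return data.decode(candidate)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    # Last resort: do not raise, mark undecodable bytes instead.
    return data.decode("utf-8", errors="replace")


def contents_differ(old_data, old_charset, new_data, new_charset):
    """Compare two revisions as text, each decoded with its own charset."""
    old_text = decode_content(old_data, old_charset)
    new_text = decode_content(new_data, new_charset)
    return old_text != new_text
```

With this, a file re-encoded from ISO-8859-15 to UTF-8 without any textual
change compares as equal, instead of failing or producing a spurious diff
when the raw bytes are compared.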
--
Ticket URL: <http://projects.edgewall.com/trac/ticket/2905>
The Trac Project <http://trac.edgewall.com/>