[issue2052] Allow changing difflib._file_template character encoding.

Terry J. Reedy Tue, 19 Mar 2013 19:57:43 -0700

Terry J. Reedy added the comment:

In 3.2, it is line 1629:
          content="text/html; charset=ISO-8859-1" />


That charset was only ever standard for Western European documents whose 
text fits within its character repertoire. Now even such documents often use 
'utf-8' (python.org does). Putting an incorrect charset designation in an HTML 
file means the browser will not display the file correctly.

For instance, I tried an input sequence containing the line 'c\u3333', which 
displays in IDLE as 'c㌳'. The string from HtmlDiff.make_file() must be written 
to a file opened with encoding='utf-8', not ISO-8859-1 or an equivalent, which 
cannot encode that character. Firefox then reads the three bytes of the utf-8 
encoding as three separate characters and displays 'cãŒ³'. To check:
>>> 'c㌳'.encode().decode(encoding='Latin-1')
'cã\x8c³'
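
The check above shows '\x8c' where Firefox shows 'Œ'. A minimal sketch of why, 
assuming the browser treats the ISO-8859-1 label as windows-1252 (as the 
WHATWG Encoding Standard specifies); the variable names are illustrative only:

```python
# Sketch: decode the UTF-8 bytes of 'c\u3333' the way a browser would if it
# trusted a wrong charset declaration.
text = 'c\u3333'
raw = text.encode('utf-8')          # three bytes follow the ASCII 'c'

# Strict Latin-1 maps byte 0x8C to the C1 control character U+008C.
as_latin1 = raw.decode('latin-1')

# windows-1252 (assumed browser behavior for the ISO-8859-1 label) maps
# byte 0x8C to U+0152, LATIN CAPITAL LIGATURE OE.
as_cp1252 = raw.decode('cp1252')

print(ascii(raw), ascii(as_latin1), ascii(as_cp1252))
```

Either way, the single character U+3333 is rendered as three unrelated ones.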

To me the clear implication of "returns a string which is a complete HTML file 
containing a table showing line by line differences with inter-line and 
intra-line changes highlighted." is that the resulting file will display 
correctly. The current template charset prevents that; changing it to 'utf-8' 
results in a file that displays correctly (tested). So the current behavior and 
the code that causes it are, to me, clearly a bug. I would like to fix it before 
2.7.4 comes out.
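
Until the template is changed, a possible workaround is to patch the 
declaration in the returned string and encode it as utf-8. This is only a 
sketch: make_utf8_diff_html is a hypothetical helper, not part of difflib, 
and it assumes the template contains the literal text 'charset=ISO-8859-1' 
as in the line quoted above (the replace is a no-op where it does not).

```python
import difflib

def make_utf8_diff_html(fromlines, tolines):
    """Hypothetical helper: return HtmlDiff output declaring charset=utf-8."""
    html = difflib.HtmlDiff().make_file(fromlines, tolines)
    # Assumption: the template hard-codes this exact declaration. If the
    # text is absent, str.replace leaves the string unchanged.
    return html.replace('charset=ISO-8859-1', 'charset=utf-8')

html = make_utf8_diff_html(['a', 'c\u3333'], ['a', 'c\u3333', 'd'])

# Encode with the same charset the file now declares before writing it out,
# e.g. with open(path, 'wb') or open(path, 'w', encoding='utf-8').
data = html.encode('utf-8')
```

The point is that the declared charset and the encoding used to write the 
file must agree.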

----------
nosy: +ezio.melotti, orsenthil, terry.reedy -tim_one

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue2052>
_______________________________________