Terry J. Reedy added the comment:
In 3.2, it is line 1629:
content="text/html; charset=ISO-8859-1" />
That charset was only standard for Western European documents limited to that
charset. Now, even such limited-char docs often use 'utf-8' (python.org does).
The result of putting an incorrect charset designation in an html file is that
the browser will not display the file correctly.
For instance, I tried an input sequence containing the line 'c\u3333', which
displays in IDLE as 'c㌳'. The string from HtmlDiff.make_file() must be written
to a file opened with encoding='utf-8', not ISO-8859-1 or an equivalent. Firefox
then reads the three bytes of the utf-8 encoding as three separate characters
and displays garbage in place of '㌳'. To check:
>>> 'c㌳'.encode().decode(encoding='Latin-1')
'cã\x8c³'
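A minimal sketch of that write step, assuming made-up input lines and a temporary file path (neither is from my actual test). It writes the make_file() string with encoding='utf-8' and checks that the three UTF-8 bytes of U+3333 end up in the file:

```python
import difflib
import os
import tempfile

# Made-up sample input; the second line contains U+3333.
old_lines = ['a', 'c\u3333']
new_lines = ['a', 'c\u3333', 'd']

html = difflib.HtmlDiff().make_file(old_lines, new_lines)

# Write the string as UTF-8; Latin-1 cannot encode U+3333 at all.
path = os.path.join(tempfile.mkdtemp(), 'diff.html')
with open(path, 'w', encoding='utf-8') as f:
    f.write(html)

# Read the raw bytes back to see what a browser would receive.
with open(path, 'rb') as f:
    raw = f.read()

# U+3333 is encoded as the three bytes E3 8C B3 in UTF-8.
has_utf8_bytes = b'\xe3\x8c\xb3' in raw
```

If the page declares a single-byte charset, a browser decodes those three bytes one at a time, which is the mismatch shown in the interactive check above.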
To me the clear implication of "returns a string which is a complete HTML file
containing a table showing line by line differences with inter-line and
intra-line changes highlighted." is that the resulting file will display
correctly. The current template charset prevents that; changing it to 'utf-8'
results in a file that displays correctly (tested). So the current behavior, and
the code that causes it, is to me clearly a bug. I would like to fix it before
2.7.4 comes out.
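Until the template itself is changed, one possible workaround is to patch the declaration in the returned string before writing it. This is only a sketch: force_utf8_charset is a hypothetical helper name, not part of difflib, and the replace is a no-op if the output does not contain that exact ISO-8859-1 declaration.

```python
import difflib


def force_utf8_charset(html):
    """Hypothetical helper: swap the ISO-8859-1 declaration for utf-8."""
    return html.replace('charset=ISO-8859-1', 'charset=utf-8')


# Made-up sample input containing a character outside Latin-1.
html = difflib.HtmlDiff().make_file(['c\u3333'], ['c\u3333', 'd'])
fixed = force_utf8_charset(html)

# Encode explicitly as UTF-8 so the bytes match the declared charset.
data = fixed.encode('utf-8')
```

Writing data to a file opened in binary mode then gives a page whose declared charset and actual bytes agree.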
----------
nosy: +ezio.melotti, orsenthil, terry.reedy -tim_one
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue2052>
_______________________________________