Terry J. Reedy added the comment:

In 3.2, it is line 1629:
    content="text/html; charset=ISO-8859-1" />

That charset was standard only for Western European documents whose characters all fit within it. Now, even such limited-character documents often use 'utf-8' (python.org does). The result of putting an incorrect charset designation in an HTML file is that the browser will not display the file correctly.

For instance, I tried an input sequence containing the line 'c\u3333', which displays in IDLE as 'c㌳'. The string from HtmlDiff.make_file() must be written to a file opened with encoding='utf-8', not ISO-8859-1 or an equivalent, since those cannot encode that character. Firefox, following the declared charset, then reads the three bytes of the utf-8 encoding of '㌳' as three separate characters and displays them as such rather than as '㌳'. To check:

>>> 'c㌳'.encode().decode(encoding='Latin-1')
'cã\x8c³'

To me the clear implication of "returns a string which is a complete HTML file containing a table showing line by line differences with inter-line and intra-line changes highlighted." is that the resulting file will display correctly. The current template charset prevents that; changing it to 'utf-8' results in a file that displays correctly (tested). So the current behavior, and the code that causes it, is to me clearly a bug. I would like to fix it before 2.7.4 comes out.

----------
nosy: +ezio.melotti, orsenthil, terry.reedy -tim_one

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2052>
_______________________________________
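
The mismatch described in this comment can be reproduced without a browser. The following is only a minimal sketch, assuming Python 3: the helper name decode_as_declared, the sample diff lines, and the temporary file name are illustrative and are not part of difflib. It decodes UTF-8 bytes with the declared ISO-8859-1 charset, then writes the HtmlDiff.make_file() result to a file opened with encoding='utf-8' and reads it back.

```python
import difflib
import os
import tempfile


def decode_as_declared(text, declared="latin-1", actual="utf-8"):
    """Return what a reader sees when bytes written in `actual`
    are decoded with the `declared` charset (hypothetical helper)."""
    return text.encode(actual).decode(declared)


sample = "c\u3333"

# U+3333 is three bytes in UTF-8 (E3 8C B3); Latin-1 maps each byte
# to its own character, so one character turns into three.
garbled = decode_as_declared(sample)
print(ascii(garbled))  # prints 'c\xe3\x8c\xb3'

# Build the HTML diff and write it as UTF-8, as the comment recommends.
html = difflib.HtmlDiff().make_file(["a\n"], [sample + "\n"])
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "diff.html")  # illustrative file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

# The character survives when the file is written and read as UTF-8.
print("\u3333" in round_trip)
```

Opening the output file with encoding='latin-1' instead would fail for this sample, since U+3333 has no ISO-8859-1 encoding.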