A problem appeared after I downloaded the new version of mwlib yesterday.
Take, for example, this wikitext:
<h1>☭</h1>
Using HTMLWriter, getvalue() raises an exception: 'ascii' codec
can't decode byte 0xe2 in position 0: ordinal not in range(128).
The exception is raised because strings of different types (str and
unicode) are written to the same StringIO.
Wikitext:
''☭''
or Wikitext:
<i>☭</i>
works correctly, because the tag is written by this code, Python:
self.out.write("<%s>" % tag)
which writes a byte string (str), not unicode.
But <h1>, etc. are processed by HTMLWriter::writeTagNode, Python:
self.out.write(t.starttext)
where t.starttext is unicode.
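
To make the difference between the two cases concrete, here is a minimal
sketch. It is written for Python 3 and only emulates how Python 2 joins the
pieces buffered in a StringIO (byte strings mixed with unicode are implicitly
decoded as ASCII). py2_join is a hypothetical helper of mine, not part of
mwlib or the standard library, and the text is assumed to reach the buffer
as UTF-8 bytes.

```python
# Python 3 emulation of how Python 2 joins the pieces buffered in a StringIO.
# py2_join is a hypothetical helper, not mwlib or standard-library code.

def py2_join(pieces):
    # All byte strings: Python 2 concatenates them without any decoding.
    if all(isinstance(p, bytes) for p in pieces):
        return b"".join(pieces)
    # Mixed: byte strings are implicitly decoded as ASCII before joining.
    return "".join(p.decode("ascii") if isinstance(p, bytes) else p
                   for p in pieces)

text = "\u262d".encode("utf-8")  # the hammer-and-sickle sign as UTF-8 bytes

# <i> case: tag and text are both byte strings, so nothing is decoded.
italic = py2_join([b"<i>", text, b"</i>"])

# <h1> case: unicode tags around UTF-8 bytes trigger the ASCII decode.
try:
    py2_join(["<h1>", text, "</h1>"])
    error = None
except UnicodeDecodeError as exc:
    error = str(exc)

print(italic)
print(error)
```

The second print shows the same message as the exception quoted above, since
0xe2 is the first byte of the UTF-8 encoding of the character.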
The exception itself, though, is caused by the text written in
HTMLWriter::_write, Python:
self.out.write(cgi.escape(s.encode("utf8")))
This writes a byte string, not unicode, which is incorrect in general. I think
Python:
self.out.write(unicode(cgi.escape(s.encode("utf8")), "utf-8"))
should be used here.
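
Another option might be to skip the encode step entirely and keep every write
unicode. Below is a rough sketch of that idea, not a patch against mwlib:
SketchWriter and write_tag are hypothetical stand-ins for HTMLWriter and its
tag-writing path, escape is a small helper that mirrors the default behaviour
of cgi.escape (it replaces &, < and >), and io.StringIO is used as a
text-only buffer. The sketch is written for Python 3.

```python
import io

def escape(s):
    """Escape &, < and > in a text string, like cgi.escape without quote."""
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

class SketchWriter(object):
    """Hypothetical stand-in for HTMLWriter; only the write paths are modeled."""

    def __init__(self):
        self.out = io.StringIO()  # accepts text only, never byte strings

    def write_tag(self, starttext):
        # Tags arrive as text (unicode) and are written unchanged.
        self.out.write(starttext)

    def _write(self, s):
        # Escape the text but do not encode it, so the buffer stays unicode.
        self.out.write(escape(s))

w = SketchWriter()
w.write_tag("<h1>")
w._write("\u262d & co")
w.write_tag("</h1>")
result = w.out.getvalue()  # one text string, no implicit decoding involved
```

With this approach the output would be encoded to UTF-8 once, at the point
where it leaves the writer, instead of at every write.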
Or maybe I did something wrong?