A problem appeared after I downloaded the new version of mwlib yesterday.
Given, for example, this wikitext:
  <h1>☭</h1>

Using HTMLWriter, getvalue() raises an exception: 'ascii' codec
can't decode byte 0xe2 in position 0: ordinal not in range(128).
The exception is raised because text of different types (unicode and
byte strings) was written to the same StringIO.
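
To illustrate the type mixing, here is a minimal sketch (my understanding of the mechanism, not mwlib code). In Python 2, StringIO joins the written pieces in getvalue(), and a byte string joined with unicode is implicitly decoded as ASCII. The explicit decode below should fail in the same way; the names raw and message are only for illustration.

```python
# -*- coding: utf-8 -*-
# U+262D is the hammer-and-sickle character from the example wikitext.
raw = u"\u262d".encode("utf8")  # UTF-8 byte string, first byte is 0xe2

try:
    # Python 2 performs this decode implicitly when a byte string
    # is combined with a unicode string.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

So any non-ASCII byte string in the buffer should be enough to break getvalue() once a unicode piece is present.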

Wikitext:
  ''☭''

or Wikitext:
  <i>☭</i>

work correctly, because they are processed by this Python code:
  self.out.write("<%s>" % tag)

which writes a byte string, not unicode.
But <h1>, ... are processed by HTMLWriter::writeTagNode, Python:
  self.out.write(t.starttext)

where t.starttext is unicode.

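But the exception is actually triggered by other text, written in
HTMLWriter::_write, Python:
        self.out.write(cgi.escape(s.encode("utf8")))

This writes a byte string rather than unicode, which is incorrect in
general. I think this Python:
        self.out.write(unicode(cgi.escape(s.encode("utf8")), 'utf-8'))

should be here instead, so that the escaped text is converted back to
unicode before it is written.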
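
A rough sketch of the general idea, writing only text to the buffer, might look like the following. It is written for Python 3, so html.escape stands in for cgi.escape, and to_text and TextOnlyWriter are hypothetical names of mine, not part of mwlib.

```python
import io
from html import escape  # stand-in for cgi.escape on Python 3


def to_text(s, encoding="utf-8"):
    """Return s as text, decoding byte strings (hypothetical helper)."""
    if isinstance(s, bytes):
        return s.decode(encoding)
    return s


class TextOnlyWriter(object):
    """Hypothetical stand-in for a writer; not mwlib's HTMLWriter."""

    def __init__(self):
        self.out = io.StringIO()

    def write_raw(self, s):
        # Normalize to text so the buffer never holds mixed types.
        self.out.write(to_text(s))

    def write_escaped(self, s):
        # quote=False escapes only &, < and >.
        self.out.write(escape(to_text(s), quote=False))

    def getvalue(self):
        return self.out.getvalue()


w = TextOnlyWriter()
w.write_raw(b"<h1>")             # byte string, decoded before writing
w.write_escaped(u"\u262d & co")  # text, escaped before writing
w.write_raw(u"</h1>")
result = w.getvalue()
print(result)
```

The point is only that every write goes through one conversion step, so getvalue() never has to join two different string types.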

Or maybe I did something wrong?
