On Feb 9, 5:34 am, Osipov <[email protected]> wrote:
> > This writes non-Unicode output, which is incorrect in general. I think
> > Python:
> > unicode(self.out.write(cgi.escape(s.encode("utf8"))),'utf-8')
>
> > must be here.
>
> Oh, I'm sorry, Python:
> self.out.write(unicode(cgi.escape(s.encode("utf8")),'utf8'))
>
> is correct. And it works correctly with all my examples.

I am trying to use the following code to convert 'wikitext' (which is
UTF-8) to HTML:

    out = StringIO.StringIO()
    a = uparser.parseString(j.find("title").text, raw=wikitext,
                            wikidb=dummydb.DummyDB())
    w = htmlwriter.HTMLWriter(out, None)
    w.write(a)
    html = out.getvalue()

However, I get a Unicode error similar to yours:

    Traceback (most recent call last):
      File "./extract-descriptions.py", line 83, in <module>
        html = out.getvalue()
      File "/usr/lib64/python2.5/StringIO.py", line 270, in getvalue
        self.buf += ''.join(self.buflist)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 80: ordinal not in range(128)

How do I avoid these errors?
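
My working guess, not a confirmed diagnosis: the traceback fails inside ''.join(self.buflist), and in Python 2 that join implicitly decodes byte strings as ASCII as soon as the list mixes unicode objects with 8-bit strings. So the buffer probably holds both kinds, and one byte string contains UTF-8 data (0xc2 is a UTF-8 lead byte). The sketch below reproduces the failing decode explicitly and shows one way around it, which is to convert everything to Unicode before it reaches the buffer. The helper name to_text and the sample data are made up for illustration, and the sketch uses io.StringIO and a b"" literal, so it needs Python 2.6 or later rather than 2.5.

```python
# -*- coding: utf-8 -*-
import io


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as Unicode text.

    Byte strings are decoded explicitly with the given encoding;
    anything that is already Unicode is returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Sample data (an assumption): UTF-8 bytes for a no-break space, then "x".
raw = b"\xc2\xa0x"

# This is what the implicit conversion effectively attempts, and why it fails.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# io.StringIO only accepts Unicode text, so a stray byte string is caught
# at write() time instead of surfacing later in getvalue().
out = io.StringIO()
out.write(to_text(raw))
out.write(to_text(u"<p>ok</p>"))
html = out.getvalue()
```

If that guess is right, the same idea applies to my script: either make sure wikitext is decoded to unicode before parsing, or make sure everything written to out is unicode, as in the corrected write() call quoted above.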