[mwlib] Re: unicode problem

Joseph Turian Mon, 23 Feb 2009 16:42:36 -0800

On Feb 9, 5:34 am, Osipov <[email protected]> wrote:
> > This writes non unicode, which is incorrect in general. I think
> > Python:
> >         unicode(self.out.write(cgi.escape(s.encode("utf8"))),'utf-8')
>
> > must be here.
>
> Oh, I'm sorry, Python:
>   self.out.write(unicode(cgi.escape(s.encode("utf8")),'utf8'))
>
> is correct. And it works correctly with all my examples.

I am trying to use the following code to convert 'wikitext' (which is
UTF-8 encoded) to HTML.

    out=StringIO.StringIO()
    a=uparser.parseString(j.find("title").text, raw=wikitext,
                          wikidb=dummydb.DummyDB())
    w=htmlwriter.HTMLWriter(out, None)
    w.write(a)
    html = out.getvalue()

However, I get unicode errors similar to yours:

Traceback (most recent call last):
  File "./extract-descriptions.py", line 83, in <module>
    html = out.getvalue()
  File "/usr/lib64/python2.5/StringIO.py", line 270, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 80: ordinal not in range(128)

How do I avoid these errors?
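
My best guess at the cause, as a sketch and not a confirmed diagnosis: the
Python 2 StringIO buffer ends up holding both unicode objects and UTF-8 byte
strings, and the ''.join() in getvalue() then tries to decode the byte strings
as ASCII. The snippet below reproduces that implicit decode on a byte string
containing 0xc2, and shows one way around it: decode to text once at the
boundary and write only text. The helper names (implicit_ascii_decode,
to_text) are made up for illustration and are not part of mwlib, and I am
assuming, without having verified it, that the parser would accept unicode
input in place of the UTF-8 byte string.

```python
# -*- coding: utf-8 -*-
import io

# A UTF-8 byte string containing a non-breaking space (bytes 0xc2 0xa0).
raw = u"a\u00a0b".encode("utf-8")


def implicit_ascii_decode(data):
    """Mimic the ASCII decode that Python 2 applies when a byte string
    is joined with a unicode string. Returns the error text, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


def to_text(data, encoding="utf-8"):
    """Hypothetical helper: decode bytes to text once, at the boundary.
    Text that is already decoded is passed through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# The failure mode: non-ASCII bytes cannot be decoded as ASCII.
error_message = implicit_ascii_decode(raw)

# The workaround: io.StringIO only accepts text, so every write is
# decoded first and getvalue() never has to mix the two types.
out = io.StringIO()
out.write(to_text(raw))
out.write(to_text(u"<p>plain text</p>"))
html = out.getvalue()
```

If that is what is happening, passing wikitext.decode("utf-8") instead of
wikitext might be enough, but I would like to know whether that is the
intended usage.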