The book states (2.4.1) "... By design, web2py uses UTF8 encoded strings internally. ..."
So I guess that the text returned by the server should be encoded to UTF-8 in order to avoid string handling errors when rendering for browser output.

On Jan 19, 14:44, Alan Etkin <[email protected]> wrote:
> The text encoding is read from an email.message.Message created when
> the mail is fetched from the server and before sending the data to the
> base adapter parse function (by the way, I sent new versions of the
> adapter these days to the issue page after it was marked as fixed)
>
> In IMAPAdapter I am passing the complete message RFC822 payload to
> a unicode instance with the charset declared in the message's envelope
> (or using "utf-8" as default), and this way the unicode error doesn't
> reproduce (without need to change the sqlhtml.py module).
>
> I did not edit the layout encoding, I am just using the scaffolding
> app to test the email queries. The message rows passed to sqltables
> might contain html, is it possible that these parts are
> producing unicode errors?
>
> On Jan 19, 12:55, Massimo Di Pierro <[email protected]>
> wrote:
>
> > Is the page html header declaring the utf8 encoding, or are you using
> > a layout that uses a different encoding?
>
> > On Jan 19, 7:21 am, Alan Etkin <[email protected]> wrote:
>
> > > I found that the Unicode errors originate from incompatible
> > > encodings when web2py tries to read the raw message and render the
> > > data for browser output. I solved it by encoding the RFC822 raw text
> > > before parsing the response data as Rows. Still, I am not sure if this
> > > is the correct way of processing the response text so it can be sent
> > > safely (without misread characters) to the user interface. Anyway, it
> > > seems to work well, without intensive testing.
>
> > > On Jan 18, 18:40, Alan Etkin <[email protected]> wrote:
>
> > > > I am trying to generate sqltables with the experimental IMAPAdapter
> > > > select output, but with some rows, an exception is raised at the
> > > > sqlhtml module.
> > > > It has to do with unicode and charsets when sqlhtml
> > > > processes the rows object returned by the adapter's select method, but
> > > > I cannot find a proper way of solving it. Here is the error trace:
>
> > > > Traceback (most recent call last):
> > > >   File "/home/alan/web2py-hg/gluon/restricted.py", line 204, in
> > > > restricted
> > > >     exec ccode in environment
> > > >   File "/home/alan/web2py-hg/applications/queries/views/default/
> > > > index.html", line 126, in <module>
> > > >   File "/home/alan/web2py-hg/gluon/globals.py", line 181, in write
> > > >     self.body.write(xmlescape(data))
> > > >   File "/home/alan/web2py-hg/gluon/html.py", line 114, in xmlescape
> > > >     return data.xml()
> > > >   File "/home/alan/web2py-hg/gluon/dal.py", line 7442, in xml
> > > >     return sqlhtml.SQLTABLE(self).xml()
> > > >   File "/home/alan/web2py-hg/gluon/sqlhtml.py", line 2197, in __init__
> > > >     ur = unicode(r, 'utf8')
> > > >   File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
> > > >     return codecs.utf_8_decode(input, errors, True)
> > > > UnicodeDecodeError: 'utf8' codec can't decode bytes in position
> > > > 1227-1229: invalid data
>
> > > > I found a workaround to avoid the exception but I doubt it's the
> > > > correct fix, because it just prevents web2py from creating the unicode
> > > > object and uses the raw input instead.
>
> > > > This is the workaround: (gluon/sqlhtml.py Line 2196)
>
> > > > try:
> > > >     ur = unicode(r, 'utf-8')
> > > > except UnicodeDecodeError, e:
> > > >     ur = r
>
> > > > Replacing this line:
> > > > ur = unicode(r, 'utf8')
>
> > > > When creating the Row objects at the adapter, I have to handle
> > > > different encodings depending on the message. What would be an
> > > > appropriate way of encoding data before creating the Row objects, so
> > > > unicode errors can be avoided?
>
> > > > This is the adapter method I am using to store the parse input for
> > > > each text field
>
> > > > def encode_text(self, text, charset, errors="replace"):
> > > >     """ convert text for mail to unicode"""
> > > >     if text is None:
> > > >         text = ""
> > > >     else:
> > > >         if charset is not None:
> > > >             text = unicode(text, charset, errors)
> > > >         else:
> > > >             text = unicode(text, "utf-8", errors)
> > > >     return text
>
> > > > Thanks
>
> > > > I am using the last source hg version (1.99.4) with Python 2.6.5 on a
> > > > Mandriva GNU/Linux machine.
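
For anyone reading this thread later, here is a minimal sketch of the charset handling discussed above: decode each message part with the charset it declares, fall back to "utf-8", and only then encode to UTF-8 for output. It is written for Python 3 and the standard library `email` package, not for the Python 2.6 setup used in the thread, and it is not the IMAPAdapter code. The function name `decode_part`, the `default_charset` parameter, and the sample message are assumptions made up for illustration.

```python
import email
from email.message import Message


def decode_part(part: Message, default_charset: str = "utf-8") -> str:
    """Return the body of a non-multipart message part as text.

    Hypothetical helper: decodes with the charset declared in the part's
    Content-Type header and falls back to default_charset otherwise.
    """
    # decode=True undoes base64 / quoted-printable and returns bytes
    raw = part.get_payload(decode=True)
    if raw is None:  # multipart containers carry no payload of their own
        return ""
    charset = part.get_content_charset() or default_charset
    try:
        # errors="replace" swaps undecodable bytes for U+FFFD instead of raising
        return raw.decode(charset, errors="replace")
    except LookupError:
        # The sender declared a charset name Python does not know
        return raw.decode(default_charset, errors="replace")


# Sample RFC822 message (made up): Latin-1 body, which is not valid UTF-8
raw_message = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"Espa\xf1a\r\n"
)

msg = email.message_from_bytes(raw_message)
text = decode_part(msg)            # text, decoded with the declared charset
utf8_bytes = text.encode("utf-8")  # safe to hand to code that expects UTF-8
```

Decoding with the declared charset first is what keeps a later `unicode(r, 'utf8')` style call from failing, since by then the bytes really are UTF-8. With `errors="replace"`, a wrong or missing charset declaration produces replacement characters instead of an exception, which may or may not be acceptable for your data.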

