I'm having the same problem on GAE... I get these errors when a *string* data field has characters like é, á, ç, õ, even when using this TO_UTF class...
It's working for *text* fields... Any clues?

Regards,
Tito

On Wed, Jul 2, 2008 at 10:08 AM, Massimo Di Pierro <[email protected]> wrote:
>
> Your English is not bad at all. We are working to address these
> issues. It would be great if you could help us with that.
>
> On Jul 1, 2008, at 9:08 PM, Abner wrote:
>
> > Massimo,
> >
> > On Jul 1, 01:21, Massimo Di Pierro <[email protected]> wrote:
> >> I do not like this solution. I would like something that goes
> >> into gluon/main.py so it is transparent.
> >
> > Me too. My solution was only a hack to temporarily solve my need to
> > test the application on GAE. :)
> >
> >> I also do not want to use chardet since web2py only relies on basic
> >> modules.
> >
> > Good point, I agree.
> >
> >> Could you help me understand the problem?
> >> When you have foreign characters in the input, which lines in the
> >> function f get executed? Which encoding is detected on GAE?
> >
> > As I said before, I got the class TO_UTF from this post:
> > http://groups.google.com/group/web2py/browse_thread/thread/711fe9716ce39b77/6b556d5b4c656134?lnk=gst&q=chardet#6b556d5b4c656134
> >
> > Ga's post makes a reference to this issue on Google App Engine:
> > http://code.google.com/p/googleappengine/issues/detail?id=155
> >
> > Other issues of interest:
> >
> > http://code.google.com/p/googleappengine/issues/detail?id=157
> > http://code.google.com/p/googleappengine/issues/detail?id=538
> > http://code.google.com/p/googleappengine/issues/detail?id=376
> > http://code.google.com/p/googleappengine/issues/list?can=2&q=unicode&colspec=ID+Type+Status+Priority+Stars+Owner+Summary&cells=tiles
> >
> > I'm a poor Python programmer; my background is mostly PHP (happily
> > changing to Python and web2py) and little bits of C.
> > I was testing my app but can't get debug prints to appear in the
> > gae/web2py console. What is the best method to debug an app using
> > web2py, or how can I print the steps of web2py's execution in the
> > console?
> >
> >> What
> >> would happen if the function f were to re-encode in UTF8 and return
> >> a UTF8-encoded string?
> >
> > Works fine.
> > I created a new record using my hack, changed it using web2py, changed
> > it using http://localhost:8080/_ah/admin, changed it again using web2py.
> > In every step the content was saved and displayed correctly. Function
> > 'f' re-encoding content that was saved before using it worked fine.
> >
> > End note: My English is bad, sorry for any mistakes.
> >
> >>
> >> Massimo
> >>
> >> On Jun 30, 2008, at 6:10 PM, Abner wrote:
> >>
> >>> Massimo,
> >>
> >>> I tested here a hack to solve the problem in a more generic form:
> >>
> >>> in model:
> >>
> >>> # as default we are not running on GAE
> >>> RUNNINGINGAE = False
> >>
> >>> try:
> >>>     from gluon.contrib.gql import *
> >>>     db=GQLDB()
> >>>     # Ok, we are on GAE
> >>>     RUNNINGINGAE = True
> >>> except:
> >>>     db=SQLDB("sqlite://jomeme1.db")
> >>> session.connect(request,response,db=db)
> >>
> >>> # Thanks to Ga for this class
> >>> class TO_UTF:
> >>>     def __init__(self,f): self.f=f
> >>>     def __call__(self,value): return (self.f(value),None)
> >>
> >>> def f(v):
> >>>     if isinstance(v, str):
> >>>         try:
> >>>             v = v.decode('utf-8')
> >>>             return v
> >>>         except UnicodeDecodeError:
> >>>             import chardet
> >>>             info = chardet.detect(v)
> >>>             try:
> >>>                 v = v.decode(info['encoding'])
> >>>                 return v
> >>>             except UnicodeDecodeError, e:
> >>>                 raise UnicodeDecodeError("%s (tried UTF-8, %s)" % (e,info['encoding']))
> >>
> >>> db.define_table('sites',
> >>>     SQLField('domain',length=256,required=True,default='localhost',unique=True),
> >>>     SQLField('logo',length=256,required=True,default='none.gif'),
> >>>     SQLField('telefone',length=36,required=True,default='0xx 99 9999-9999'),
> >>>     SQLField('endereco',length=64,required=True,default='Rua XXXXXXXXXXXXXXXXXX'),
> >>>     SQLField('end_num','integer',required=True,default=9999),
> >>>     SQLField('end_compl',length=16,default=''),
> >>>     SQLField('bairro',length=32,required=True,default='Centro'),
> >>>     SQLField('cidade',length=32,required=True,default='Vitória'),
> >>>     SQLField('estado',length=2,required=True,default='ES'),
> >>>     SQLField('cep',length=9,required=True,default='29000-000'),
> >>>     SQLField('email_contato',length=128,required=True,default='[email protected]')
> >>> )
> >>
> >>> # Some generic requires used in GAE or not; note the second, it doesn't use []
> >>> db.sites.domain.requires=[IS_NOT_EMPTY(), IS_NOT_IN_DB(db, 'sites.domain')]
> >>> db.sites.end_compl.requires=IS_NOT_EMPTY()
> >>
> >>> # Now, if in GAE, we add TO_UTF to every text or string field
> >>> if RUNNINGINGAE:
> >>>     from types import ListType
> >>>     for fieldname in db.sites.fields:
> >>>         if fieldname!='id' and (db.sites[fieldname].type == 'string' or db.sites[fieldname].type == 'text'):
> >>>             # For requires defined above without [] or empty requires
> >>>             if not isinstance(db.sites[fieldname].requires, ListType):
> >>>                 # Empty require - Is this the best method to do it ?
> >>>                 if not db.sites[fieldname].requires:
> >>>                     db.sites[fieldname].requires = TO_UTF(f)
> >>>                 # Requires defined without []
> >>>                 else:
> >>>                     tmp = db.sites[fieldname].requires
> >>>                     db.sites[fieldname].requires = []
> >>>                     db.sites[fieldname].requires.append(tmp)
> >>>                     db.sites[fieldname].requires.append(TO_UTF(f))
> >>>             # For requires defined above using []
> >>>             else:
> >>>                 db.sites[fieldname].requires.append(TO_UTF(f))
> >>
> >>> What do you think about this solution?
> >>
> >>> regards,
> >>
> >>> abner
> >>
> >>> On Jun 30, 18:05, Abner <[email protected]> wrote:
> >>>> Massimo,
> >>
> >>>> Your code doesn't work, but I found this code from Ga in the group
> >>>> posts, and it worked fine:
> >>
> >>>> class TO_UTF:
> >>>>     def __init__(self,f): self.f=f
> >>>>     def __call__(self,value): return (self.f(value),None)
> >>
> >>>> def f(v):
> >>>>     if isinstance(v, str):
> >>>>         try:
> >>>>             v = v.decode('utf-8')
> >>>>             return v
> >>>>         except UnicodeDecodeError:
> >>>>             import chardet
> >>>>             info = chardet.detect(v)
> >>>>             try:
> >>>>                 v = v.decode(info['encoding'])
> >>>>                 return v
> >>>>             except UnicodeDecodeError, e:
> >>>>                 raise UnicodeDecodeError("%s (tried UTF-8, %s)" % (e,info['encoding']))
> >>
> >>>> .....
> >>
> >>>> db.sites.cidade.requires=TO_UTF(f)
> >>
> >>>> The problem is that I need to add a requires to every text or string
> >>>> field in my model. This UnicodeDecodeError can be very annoying for
> >>>> every user of a language other than English. The ideal solution, I
> >>>> think, is to add some code in gql.py to process every string or text
> >>>> field that is going to be stored in the datastore. How can I do this?
> >>
> >>>> Thanks
> >>
> >>>> Abner
> >>
> >>>> On Jun 30, 17:22, Massimo Di Pierro <[email protected]> wrote:
> >>
> >>>>> I do not know why the input data is not unicode. It is supposed to be
> >>>>> UTF-8.
> >>
> >>>>> Try this validator for your text fields
> >>
> >>>>> rus_unicode = [
> >>>>>     u'\u0410', u'\u0411', u'\u0412', u'\u0413', u'\u0414', u'\u0415', u'\u0416', u'\u0417',
> >>>>>     u'\u0418', u'\u0419', u'\u041a', u'\u041b', u'\u041c', u'\u041d', u'\u041e', u'\u041f',
> >>>>>     u'\u0420', u'\u0421', u'\u0422', u'\u0423', u'\u0424', u'\u0425', u'\u0426', u'\u0427',
> >>>>>     u'\u0428', u'\u0429', u'\u042a', u'\u042b', u'\u042c', u'\u042d', u'\u042e', u'\u042f',
> >>>>>     u'\u0430', u'\u0431', u'\u0432', u'\u0433', u'\u0434', u'\u0435', u'\u0436', u'\u0437',
> >>>>>     u'\u0438', u'\u0439', u'\u043a', u'\u043b', u'\u043c', u'\u043d', u'\u043e', u'\u043f',
> >>>>>     u'\u0440', u'\u0441', u'\u0442', u'\u0443', u'\u0444', u'\u0445', u'\u0446', u'\u0447',
> >>>>>     u'\u0448', u'\u0449', u'\u044a', u'\u044b', u'\u044c', u'\u044d', u'\u044e', u'\u044f']
> >>
> >>>>> class GAE_FIX:
> >>>>>     def __call__(self,value):
> >>>>>         result = ""
> >>>>>         for i in range(0, len(s)):
> >>>>>             if ord(s[i])<128:
> >>>>>                 result = result + unicode(s[i])
> >>>>>             elif ord(s[i])==184:
> >>>>>                 result = result + unichr(0x0451)
> >>>>>             elif ord(s[i])==168:
> >>>>>                 result = result + unichr(0x0401)
> >>>>>             elif ord(s[i])>=192:
> >>>>>                 result = result + rus_unicode[ord(s[i])-192]
> >>>>>             else:
> >>>>>                 result = result + unicode(" ")
> >>>>>         return (result.encode('utf8'),None)
> >>
> >>>>> Use it with requires=[GAE_FIX(), other validators, ....]
> >>>>> Perhaps other users have better suggestions.
> >>
> >>>>> On Jun 30, 2008, at 2:48 PM, Abner wrote:
> >>
> >>>>>> Hi,
> >>
> >>>>>> I'm from Brazil, developing a small application with web2py to run
> >>>>>> on GAE.
> >>
> >>>>>> When I try to add new data to the GAE datastore, using the local
> >>>>>> version or the one hosted on Google, I get these errors when some
> >>>>>> data field has characters like é, á, ç, õ:
> >>
> >>>>>> ERROR 2008-06-30 19:40:32,922 __init__.py] Traceback (most recent call last):
> >>>>>>   File "/home/abner/devel/python/gae/google_appengine/web2py/gluon/restricted.py", line 62, in restricted
> >>>>>>     exec ccode in environment
> >>>>>>   File "/home/abner/devel/python/gae/google_appengine/web2py/applications/jomeme1/controllers/panel.py", line 49, in <module>
> >>>>>>   File "/home/abner/devel/python/gae/google_appengine/web2py/applications/jomeme1/controllers/panel.py", line 41, in blocks
> >>>>>>     if form.accepts(request.vars,session):
> >>>>>>   File "/home/abner/devel/python/gae/google_appengine/web2py/gluon/sqlhtml.py", line 223, in accepts
> >>>>>>     self.vars.id=self.table.insert(**fields)
> >>>>>>   File "/home/abner/devel/python/gae/google_appengine/web2py/gluon/contrib/gql.py", line 169, in insert
> >>>>>>     tmp=self._tableobj(**fields)
> >>>>>>   File "google/appengine/ext/db/__init__.py", line 555, in __init__
> >>>>>>   File "google/appengine/ext/db/__init__.py", line 372, in __set__
> >>>>>>   File "google/appengine/ext/db/__init__.py", line 1583, in validate
> >>>>>>   File "/home/abner/devel/python/gae/google_appengine/google/appengine/api/datastore_types.py", line 816, in __new__
> >>>>>>     return super(Text, cls).__new__(cls, arg, encoding)
> >>>>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 70: ordinal not in range(128)
> >>
> >>>>>> My controller is this:
> >>
> >>>>>> def sites():
> >>>>>>     message = None
> >>>>>>     response.view = 'panel/sites.html'
> >>>>>>     form=SQLFORM(db.sites)
> >>>>>>     if form.accepts(request.vars,session):
> >>>>>>         response.flash="form accepted"
> >>>>>>     elif form.errors:
> >>>>>>         response.flash="form is invalid"
> >>>>>>     else:
> >>>>>>         message = "Por favor, preencha o formulário"
> >>>>>>     return dict(message=message, form=form,vars=form.vars)
> >>
> >>>>>> My model:
> >>
> >>>>>> db.define_table('sites',
> >>>>>>     SQLField('domain',length=256,required=True,default='localhost',unique=True),
> >>>>>>     SQLField('logo',length=256,required=True,default='none.gif'),
> >>>>>>     SQLField('telefone',length=36,required=True,default='0xx 99 9999-9999'),
> >>>>>>     SQLField('endereco',length=64,required=True,default='Rua XXXXXXXXXXXXXXXXXX'),
> >>>>>>     SQLField('end_num','integer',required=True,default=9999),
> >>>>>>     SQLField('end_compl',length=16,default=''),
> >>>>>>     SQLField('bairro',length=32,required=True,default='Centro'),
> >>>>>>     SQLField('cidade',length=32,required=True,default='Vitória'),
> >>>>>>     SQLField('estado',length=2,required=True,default='ES'),
> >>>>>>     SQLField('cep',length=9,required=True,default='29000-000'),
> >>>>>>     SQLField('email_contato',length=128,required=True,default='[email protected]'),
> >>
> >>>>>> The problem also occurs in another table using field of type
> >>
> >> ...
> >
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google Groups
> "web2py Web Framework" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/web2py?hl=en
> -~----------~----~----~----~------~----~------~--~---
>

--
Linux User #387870
.........____
.... _/_õ|__|
..º[ .-.___.-._| . . . .
.__( o)__( o).:_______
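
Following up on Massimo's point about avoiding chardet, here is a minimal sketch of a decode-with-fallback validator that uses only the standard library, plus a helper that covers the three `requires` cases handled by the loop in the model (empty, single validator, list). It is written for Python 3, where `bytes` plays the role that `str` plays in the Python 2 code quoted above. The names `to_unicode`, `ToUnicode` and `add_validator` are hypothetical and not part of web2py, and the fallback order (UTF-8, then cp1252, then latin-1) is an assumption that suits Western European text such as Portuguese; it is a guess, not a detection.

```python
# Hypothetical sketch, not web2py code: all names here are illustrative.

def to_unicode(value, fallbacks=("utf-8", "cp1252", "latin-1")):
    """Return text for a byte string, trying each encoding in order."""
    if not isinstance(value, bytes):
        return value  # already text, or not a string at all
    for encoding in fallbacks:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte, so this is only reached when the caller
    # passes a fallback list that does not include it.
    raise ValueError("could not decode %r using %r" % (value, fallbacks))


class ToUnicode(object):
    """Validator in the (value, error) style used by TO_UTF in the thread."""

    def __call__(self, value):
        try:
            return (to_unicode(value), None)
        except ValueError as e:
            return (value, str(e))


def add_validator(requires, validator):
    """Return a new list of validators with `validator` appended.

    Accepts the three shapes seen in the model: nothing set,
    a single validator, or a list of validators.
    """
    if not requires:
        return [validator]
    if isinstance(requires, (list, tuple)):
        return list(requires) + [validator]
    return [requires, validator]
```

With a helper like this, the body of the loop in the model could shrink to a single assignment such as `db.sites[fieldname].requires = add_validator(db.sites[fieldname].requires, ToUnicode())`. Unlike the original loop it always produces a list, and it has not been tried on GAE.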

