Tim,

Thanks for the tip. Adding request.get.charset = 'utf8' seems to do the trick.
Brian McC

On Mar 31, 6:19 pm, Tim Hoffman <[email protected]> wrote:
> Hi
>
> Have a look at webob
> http://pythonpaste.org/webob/reference.html#unicode-variables
> and note that if you're running as an API server, your
> API consumers should probably be specifying the encoding in their
> headers.
>
> Also, it's difficult to blindly encode something to UTF-8, as what they
> send may not in fact be possible to encode
> without stripping or translating some values to something else
> entirely (for instance if someone sends you UCS4),
> which is something you will have to deal with on a case-by-case basis
> for different encodings.
>
> You could then document which encoding schemes you directly support in
> your API, and then get the consumers to
> set the content type charsets correctly.
>
> Just my 2c worth
>
> Rgds
>
> T
>
> On Apr 1, 5:36 am, Brian <[email protected]> wrote:
>
> > The problem with that is that our system is an API server, so we can't
> > assume that submitters are actually sending UTF. They usually are, but
> > sometimes not.
>
> > On Mar 29, 4:07 pm, Joshua Smith <[email protected]> wrote:
>
> > > If you specify UTF-8 on the form page with a meta tag, you should only
> > > get UTF-8 in the input you receive. At least that's been my experience.
>
> > > On Mar 29, 2010, at 5:40 PM, Brian wrote:
>
> > > > Hello,
>
> > > > I am looking for a library or function that does the following (my one
> > > > complaint about Python/GAE is that it does not provide an easy way to
> > > > sanitize and transcode input to UTF). I have a function that does this
> > > > pretty reliably, except when it breaks, and was wondering who else has
> > > > dealt with this issue.
>
> > > > HINT TO FRIENDLY GOOGLE PEOPLE: it would be really nice if you offered
> > > > an option to sanitize incoming form data so your app does not need to
> > > > worry about encodings. You'd just assume you're being given properly
> > > > decoded utf-8, with placeholder characters where decoding failed.
> > > > Failing that, it'd be nice to have a sanitizer function you can call
> > > > that knows how to test for and transcode from the most common
> > > > encodings into utf-8. I know Python supports a lot of different
> > > > encodings, but it can be very time consuming to track this type of bug
> > > > because it tends to happen sporadically when an unusual string shows up
> > > > in a request.
>
> > > > Thanks,
>
> > > > Brian McConnell
>
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "Google App Engine" group.
> > > > To post to this group, send email to [email protected].
> > > > To unsubscribe from this group, send email to
> > > > [email protected].
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/google-appengine?hl=en.
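
For anyone who finds this thread later: the kind of sanitizer asked for in the original post can be roughly sketched with the standard library alone. This is only an illustration, not what App Engine or WebOb do internally. It is written for Python 3, and the names declared_charset, to_text and FALLBACK_ENCODINGS are made up for the example. It assumes you have the raw request body as bytes plus the Content-Type header value (if any), tries the charset the client declared as Tim suggests, then a short guess list, and finally decodes as UTF-8 with placeholder characters where decoding failed.

```python
import codecs
from email.message import Message

# Hypothetical fallback order; adjust it to the encodings your clients really send.
FALLBACK_ENCODINGS = ("utf-8", "cp1252")


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def to_text(raw, content_type=None):
    """Decode raw request bytes to text without ever raising on bad input."""
    candidates = []
    charset = declared_charset(content_type)
    if charset:
        candidates.append(charset)
    candidates.extend(FALLBACK_ENCODINGS)

    for name in candidates:
        try:
            codecs.lookup(name)  # skip charset names Python does not know
        except LookupError:
            continue
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue

    # Last resort: keep what decodes as UTF-8, mark the rest with U+FFFD.
    return raw.decode("utf-8", errors="replace")
```

Call to_text(body, content_type) and then .encode("utf-8") on the result if you need UTF-8 bytes for storage. Note that the guess list is a heuristic: a wrong encoding that happens to decode without error (Latin-1 style codecs accept almost any byte) gives garbled text rather than an exception, which is why documenting the supported charsets and asking consumers to declare them is still the more reliable route.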
