Tim,

Thanks for the tip. Adding request.get.charset = 'utf8' seems to do
the trick.
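
For anyone who finds this thread later, here is a rough sketch of the kind of sanitizer asked about in the original post below. It is only an illustration under stated assumptions: the names to_utf8 and FALLBACK_ENCODINGS are made up for this example and are not part of WebOb or App Engine, and the fallback order (UTF-8, then cp1252) is a guess at "the most common encodings" rather than anything the thread settled on. It tries the charset the client declared first, then the fallbacks, and finally substitutes placeholder characters where decoding fails.

```python
# Hypothetical helper, not a WebOb or App Engine API.
# Assumed fallback order; adjust to the encodings your API documents.
FALLBACK_ENCODINGS = ("utf-8", "cp1252")


def to_utf8(raw, declared=None):
    """Transcode raw bytes to UTF-8.

    Returns (utf8_bytes, encoding_used). encoding_used is None when no
    candidate decoded cleanly and placeholder characters were inserted.
    """
    candidates = []
    if declared:
        # Trust the charset from the Content-Type header first, if any.
        candidates.append(declared)
    candidates.extend(FALLBACK_ENCODINGS)

    for encoding in candidates:
        try:
            return raw.decode(encoding).encode("utf-8"), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or a charset name Python does not recognise.
            continue

    # Last resort: keep what decodes as UTF-8 and mark the rest with
    # U+FFFD replacement characters.
    return raw.decode("utf-8", errors="replace").encode("utf-8"), None
```

Calling to_utf8(body, declared=charset) with the charset taken from the request's Content-Type header also reports which encoding actually matched, which may help when tracking down the sporadic failures. Guessing between single-byte encodings is inherently unreliable, so a declared charset is still the better option whenever consumers can provide one.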

Brian McC

On Mar 31, 6:19 pm, Tim Hoffman <[email protected]> wrote:
> Hi
>
> Have a look at WebOb:
> http://pythonpaste.org/webob/reference.html#unicode-variables
> and note that if you're running as an API server, your
> API consumers should probably be specifying the encoding in their
> headers.
>
> Also, it's difficult to blindly encode something to UTF-8, as what they
> send may not in fact be possible to encode
> without stripping or translating some values to something else
> entirely (for instance, if someone sends you UCS4),
> which is something you will have to deal with on a case-by-case basis
> for different encodings.
>
> You could then document which encoding schemes you directly support in
> your API, and then get the consumers to
> set the Content-Type charsets correctly.
>
> Just my 2c worth
>
> Rgds
>
> T
>
> On Apr 1, 5:36 am, Brian <[email protected]> wrote:
>
> > The problem with that is that our system is an API server, so we can't
> > assume that submitters are actually sending UTF. They usually are, but
> > sometimes not.
>
> > On Mar 29, 4:07 pm, Joshua Smith <[email protected]> wrote:
>
> > > If you specify UTF-8 on the form page with a meta tag, you should only
> > > get UTF-8 in the input you receive. At least that's been my experience.
>
> > > On Mar 29, 2010, at 5:40 PM, Brian wrote:
>
> > > > Hello,
>
> > > > I am looking for a library or function that does the following (my one
> > > > complaint about Python/GAE is that it does not provide an easy way to
> > > > sanitize and transcode input to UTF). I have a function that does this
> > > > pretty reliably, except when it breaks, and was wondering who else has
> > > > dealt with this issue.
>
> > > > HINT TO FRIENDLY GOOGLE PEOPLE: it would be really nice if you offered
> > > > an option to sanitize incoming form data so your app does not need to
> > > > worry about encodings. You'd just assume you're being given properly
> > > > decoded UTF-8, with placeholder characters where decoding failed.
> > > > Failing that, it'd be nice to have a sanitizer function you can call
> > > > that knows how to test for and transcode from the most common
> > > > encodings into UTF-8. I know Python supports a lot of different
> > > > encodings, but it can be very time-consuming to track this type of bug
> > > > because it tends to happen sporadically when an unusual string shows up
> > > > in a request.
>
> > > > Thanks,
>
> > > > Brian McConnell
>
> > > > --
> > > > You received this message because you are subscribed to the Google 
> > > > Groups "Google App Engine" group.
> > > > To post to this group, send email to [email protected].
> > > > To unsubscribe from this group, send email to 
> > > > [email protected].
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/google-appengine?hl=en.
