On 10/24/06, Ian Bicking <[EMAIL PROTECTED]> wrote:
> Jack Tihon wrote:
> > Hi,
> >
> > I've had some issues which initially seemed to be related to FormBuild
> > but now seem to fall squarely into paste/request.py.
> >
> > parse_formvars() from file paste/request.py is trying to add to the
> > formvars MultiDict in the following loop:
> >     ...
> >     if isinstance(fs.value, list):
> >         for name in fs.keys():
> >             values = fs[name]
> >             if not isinstance(values, list):
> >                 values = [values]
> >             for value in values:
> >                 if not value.filename:
> >                     value = value.value
> >                 formvars.add(name, value.decode('utf-8'))  # the submitted CGI was UTF-8
> >                 print "formvars.add invoked on (name, value)", name,  value
> >      ...
> >
> > My proposed modification above decodes the CGI form value as UTF-8. From
> > my reading of the code, file uploads are handled differently, so that
> > won't be an issue. What _is_ an issue, however, is that I'm assuming
> > UTF-8 input. Is there a cleaner way to do this? From my searching I've
> > learned that the 'accept-charset' attribute of the FORM tag can be used
> > to specify the allowed input charset. Is there a way to programmatically
> > get that value and decode appropriately?
>
> Sorry, I missed this before.  The encoding of forms can be a little
> tricky.  The form is generally encoded in the same character set as the
> page it is on.  Since the form can come from other sites served up with
> different character sets, it becomes tricky to figure out.
>
> Is there something in the request that shows the encoding?
> (CONTENT_TYPE == 'application/x-www-form-urlencoded; charset=utf-8'?)
>
> I haven't done much testing around this, so I don't know.  This could
> potentially be done with a wrapper around MultiDict too, that lazily
> decodes the values.
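
To make the two ideas above concrete, here is a rough sketch in Python 3
terms (the code quoted in this thread is Python 2).  Everything in it is an
assumption for illustration: charset_from_content_type and LazyDecodedValues
are hypothetical names, not part of paste.request, and a plain dict stands
in for MultiDict.  Browsers often omit the charset parameter on form posts,
so the default matters in practice.

```python
# Sketch only: both names below are hypothetical, not Paste APIs.


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or `default`."""
    # Parameters follow the media type, separated by semicolons.
    for param in content_type.split(";")[1:]:
        key, _, value = param.partition("=")
        value = value.strip().strip('"')
        if key.strip().lower() == "charset" and value:
            return value.lower()
    return default


class LazyDecodedValues:
    """Wrap a mapping of raw byte strings and decode each value on first access.

    Simplification: one value per key, unlike a real MultiDict.
    """

    def __init__(self, raw, charset):
        self._raw = raw
        self._charset = charset
        self._cache = {}

    def __getitem__(self, name):
        if name not in self._cache:
            value = self._raw[name]
            # Only byte strings are decoded; file uploads or other
            # objects are passed through untouched.
            if isinstance(value, bytes):
                value = value.decode(self._charset, "replace")
            self._cache[name] = value
        return self._cache[name]


environ = {"CONTENT_TYPE": "application/x-www-form-urlencoded; charset=UTF-8"}
charset = charset_from_content_type(environ.get("CONTENT_TYPE", ""))
form = LazyDecodedValues({"name": "日本".encode("utf-8")}, charset)
print(form["name"])
```

Decoding with "replace" keeps a bad guess about the charset from raising an
error in the middle of a request, at the cost of silently mangling the text.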

This can be a frustrating subject.  Did you know that if you set
accept-charset="US-ASCII" on the form, but the user tries to submit
Japanese characters, Firefox will send them as numeric character
references like &#1234; whereas IE will ignore you and send UTF-8
(assuming your page was originally in UTF-8)?  Bleh!
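
One way to cope with both behaviours on the server side, sketched in
Python 3 and assuming UTF-8 as the page charset: decode the bytes first,
then expand any &#1234;-style references.  decode_form_value is a
hypothetical helper, not something Paste provides.

```python
import html


def decode_form_value(raw, charset="utf-8"):
    """Decode submitted bytes, then expand character references.

    Caveat: this also rewrites references the user typed on purpose
    (for example a literal "&amp;"), so it is lossy.
    """
    text = raw.decode(charset, "replace")
    return html.unescape(text)


# The same two characters, sent as references and as raw UTF-8.
as_references = decode_form_value(b"&#26085;&#26412;")
as_utf8 = decode_form_value("日本".encode("utf-8"))
print(as_references == as_utf8)
```

Since the server cannot tell a browser-generated reference from one the
user typed, this is a judgment call and not a real fix.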

-jj

-- 
The one who gets the last laugh isn't the one who did the laughing,
but rather the one who did the writing.

_______________________________________________
Paste-users mailing list
[email protected]
http://webwareforpython.org/cgi-bin/mailman/listinfo/paste-users
