Re: [cgiapp] Re: utf8 form processing
CGI::charset() is also useful when I use escapeHTML() to escape form values. In CGI.pm:

    sub escapeHTML {
        ... ...
        my $latin = uc $self->{'.charset'} eq 'ISO-8859-1' ||
                    uc $self->{'.charset'} eq 'WINDOWS-1252';
        if ($latin) {  # bug in some browsers
            $toencode =~ s{'}{&#39;}gso;
            $toencode =~ s{\x8b}{&#8249;}gso;
            $toencode =~ s{\x9b}{&#8250;}gso;
            if (defined $newlinestoo && $newlinestoo) {
                $toencode =~ s{\012}{&#10;}gso;
                $toencode =~ s{\015}{&#13;}gso;
            }
        }

2008/10/22 Rhesa Rozendaal [EMAIL PROTECTED]:
> Silent wrote:
>> When I hit a bad-character problem, I use this in cgiapp_init():
>>
>>     $cap->query()->charset('utf-8')
>
> Yeah, that's also pretty much a prerequisite for outputting proper
> utf8-encoded pages. However, it only affects output AFAIK, so you need
> both pieces to this puzzle.
>
> rhesa

#####  CGI::Application community mailing list  #####
##  To unsubscribe, or change your message delivery options,
##  visit:  http://www.erlbaum.net/mailman/listinfo/cgiapp
##
##  Web archive:  http://www.erlbaum.net/pipermail/cgiapp/
##  Wiki:         http://cgiapp.erlbaum.net/
#####################################################
Re: [cgiapp] Re: utf8 form processing
Hi Rhesa,

Yes, I tried the -utf8 switch for the CGI module, and while it didn't break the code in any way, it simply didn't seem to do anything.

I did wonder if it could be the use of require instead of use, but I don't really understand the difference or how this affects C::A.

Your code to avoid decoding file uploads seems like it would be a good addition to CGI.pm though. I got the impression that -utf8 decodes all params and destroys file uploads if used in this way.

mike
Re: [cgiapp] Re: utf8 form processing
Mike Tonks wrote:
> Yes, I tried the -utf8 switch for the CGI module, and while it didn't
> break the code in any way, it simply didn't seem to do anything.

How were you doing this? CGI::Application loads CGI.pm by itself, so if your loading comes after that, it won't override what was already done. Since you were using require, it's quite possible that your -utf8 flag was ignored because CGI.pm had already been loaded by C::A.

> I did wonder if it could be the use of require instead of use, but I
> don't really understand the difference or how this affects C::A.

Not to be too mean, but this is a pretty fundamental thing to understand.

    use     == compile time
    require == run time

This means that when you say

    use CGI

it happens as soon as perl *parses* that statement. When you say

    require CGI

it happens as soon as perl *executes* that statement.

If you're using CGI on every request then there's no reason to do it via require. In fact, unless you're conditionally loading a module, there's no reason to use require (unless you're doing something sufficiently magical).

--
Michael Peters
Plus Three, LP
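A small sketch of the use/require difference described above, as it applies to import flags such as -utf8. My::Query is a hypothetical stand-in written for this example and is not CGI.pm's code. The only Perl behaviour it relies on is that require never calls import(), while use is a compile-time require followed by a call to import().

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that accepts a pragma-style flag
# in its import list. This is NOT CGI.pm's code.
package My::Query;

our $UTF8 = 0;

sub import {
    my ( $class, @args ) = @_;
    $UTF8 = 1 if grep { $_ eq '-utf8' } @args;
    return;
}

package main;

# The package lives in this file, so tell perl it is already loaded.
BEGIN { $INC{'My/Query.pm'} = __FILE__ }

# require only loads the module, and only when this statement executes.
# import() is not called, so no flag can be passed this way.
require My::Query;
my $flag_after_require = $My::Query::UTF8;

# "use My::Query qw(-utf8);" would do the require at compile time and
# then make this call, which is what actually sets the flag.
My::Query->import('-utf8');
my $flag_after_import = $My::Query::UTF8;

print "after require: $flag_after_require\n";
print "after import:  $flag_after_import\n";
```

If this mirrors what happened in Mike's code, the flag never reached the module because nothing ever called import() with it.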
Re: [cgiapp] Re: utf8 form processing
When I hit a bad-character problem, I use this in cgiapp_init():

    $cap->query()->charset('utf-8')
Re: [cgiapp] Re: utf8 form processing
Silent wrote:
> When I hit a bad-character problem, I use this in cgiapp_init():
>
>     $cap->query()->charset('utf-8')

Yeah, that's also pretty much a prerequisite for outputting proper utf8-encoded pages. However, it only affects output AFAIK, so you need both pieces to this puzzle.

rhesa
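To illustrate the "both pieces" point, here is a minimal sketch using only the core Encode module: bytes are decoded on the way in and encoded on the way out. The sample string and variable names are made up for the example; no CGI.pm or C::A calls are involved.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from a form: "naive" with a diaeresis on
# the i, encoded as UTF-8 (6 bytes for 5 characters).
my $input_bytes = "na\xc3\xafve";

# Input side: turn the bytes into a character string.
my $text = decode( 'UTF-8', $input_bytes );

# Character-level operations now work on characters, not bytes.
my $upper = uc $text;

# Output side: turn characters back into UTF-8 bytes before printing.
my $output_bytes = encode( 'UTF-8', $upper );

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    length($input_bytes), length($text), length($output_bytes);
```

Skipping the decode step leaves uc working on raw bytes, so the accented character would not be uppercased. Skipping the encode step risks "Wide character" warnings or mangled output.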
Re: [cgiapp] Re: utf8 form processing
Mark Stosberg wrote:
> On Wed, 15 Oct 2008 17:11:34 +0200
> Rhesa Rozendaal [EMAIL PROTECTED] wrote:
>
>> Mike Tonks wrote:
>>> Hi All,
>>>
>>> I recently encountered the dreaded utf8 funny characters, again. This
>>> time on the input data coming from form entry fields.
>>
>> Here's what I use:
>> [...]
>>     my $might_decode = sub {
>>         my $p = shift;
>>         return ( !$p || ( ref $p && fileno($p) ) )
>>             ? $p
>>             : eval { decode_utf8($p) } || $p;
>>     };
>
> That looks useful, Rhesa. Is there a variation of it that makes sense to
> submit as a patch for CGI.pm?

I hadn't considered that. The more recent -utf8 looks like it does the same thing:

    # in CGI->param
    my @result = @{$self->{param}{$name}};

    if ($PARAM_UTF8) {
      eval "require Encode; 1;" unless Encode->can('decode'); # bring in these functions
      @result = map {ref $_ ? $_ : Encode::decode(utf8=>$_) } @result;
    }

The only differences I can see are that

* I don't try to decode false values
* I do try to decode values that are references, but not filenos
* I wrap the decode in an eval

I have a hard time imagining the first two would break Mike's code, but he said it didn't work for him. Could it have been the lack of eval?

rhesa
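The three differences listed above can be seen in isolation with a standalone restatement of the $might_decode closure. This is a sketch: the ternary is split into separate returns only so each case can carry a comment, and STDERR stands in for an upload filehandle, which is an assumption made for the example.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

my $might_decode = sub {
    my $p = shift;
    return $p if !$p;                       # false values: nothing to decode
    return $p if ref $p && fileno($p);      # open filehandle: leave untouched
    return eval { decode_utf8($p) } || $p;  # fall back to the raw value
};

# UTF-8 bytes for "cafe" with an acute accent: 5 bytes, 4 characters.
my $bytes = "caf\xc3\xa9";
my $text  = $might_decode->($bytes);

# STDERR stands in for an upload handle here (assumption for the demo).
my $handle = $might_decode->( \*STDERR );

printf "bytes: %d, characters: %d, handle kept as: %s\n",
    length($bytes), length($text), ref $handle;
```

One edge case worth knowing: fileno() returns 0 for a handle on file descriptor 0, which is false, so such a handle would fall through to the decode branch.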
[cgiapp] Re: utf8 form processing
On Wed, 15 Oct 2008 17:11:34 +0200
Rhesa Rozendaal [EMAIL PROTECTED] wrote:

> Mike Tonks wrote:
>> Hi All,
>>
>> I recently encountered the dreaded utf8 funny characters, again. This
>> time on the input data coming from form entry fields.
>>
>> It's CGI.pm that actually does the processing, and needs to read the
>> stream as utf8. There is a flag for this, but I couldn't get that to
>> work, so as a temporary measure I read all the parameters and pass them
>> through decode_utf8.
>>
>> Does anyone have a better method?
>
> Here's what I use:
>
>     package CGI::as_utf;
>
>     BEGIN
>     {
>         use strict;
>         use warnings;
>         use CGI;
>         use Encode;
>
>         {
>             no warnings 'redefine';
>             my $param_org = \&CGI::param;
>
>             my $might_decode = sub {
>                 my $p = shift;
>                 return ( !$p || ( ref $p && fileno($p) ) )
>                     ? $p
>                     : eval { decode_utf8($p) } || $p;
>             };
>
>             *CGI::param = sub {
>                 my $q = $_[0];    # assume object calls always
>                 my $p = $_[1];
>
>                 goto &$param_org if scalar @_ != 2;
>
>                 return wantarray
>                     ? map { $might_decode->($_) } $q->$param_org($p)
>                     : $might_decode->( $q->$param_org($p) );
>             }
>         }
>     }
>
>     1;
>
> This does the right thing for file uploads, as well as handling scalar
> and list context.

That looks useful, Rhesa. Is there a variation of it that makes sense to submit as a patch for CGI.pm?

Mark
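The wrapping technique in the quoted module (save the original sub, replace the glob entry, and call through in the caller's context) can be tried without CGI.pm. In this sketch My::Params is a hypothetical class invented for the example, and uc stands in for the decoding step.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a class with a context-sensitive accessor.
package My::Params;

my %store = ( colour => [ 'red', 'green' ] );

sub new { return bless {}, shift }

sub param {
    my ( $self, $name ) = @_;
    my @values = @{ $store{$name} || [] };
    return wantarray ? @values : $values[0];
}

package main;

{
    no warnings 'redefine';
    my $orig = \&My::Params::param;

    *My::Params::param = sub {
        my ( $self, $name ) = @_;

        # Any other calling convention goes straight to the original.
        goto &$orig if @_ != 2;

        # Call the original in the same context the caller used.
        return wantarray
            ? map { uc $_ } $self->$orig($name)
            : uc( scalar $self->$orig($name) );
    };
}

my $q     = My::Params->new;
my @all   = $q->param('colour');    # list context
my $first = $q->param('colour');    # scalar context

print "list: @all\n";
print "scalar: $first\n";
```

Checking wantarray in the wrapper matters because the original accessor returns different things in list and scalar context. Calling it in list context only and taking the last element, for instance, would change what scalar callers receive.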
Re: [cgiapp] Re: utf8 form processing
That would be nice indeed, and perhaps a little switch in C::A to enable it?

2008/10/16 Mark Stosberg [EMAIL PROTECTED]:
> [...]
>
> That looks useful, Rhesa. Is there a variation of it that makes sense to
> submit as a patch for CGI.pm?
>
> Mark