Re: [cgiapp] Re: utf8 form processing

2008-10-22 Thread Silent
CGI::charset() is also useful when I use escapeHTML() to escape form values.

In CGI.pm:

sub escapeHTML {
...
...
    my $latin = uc $self->{'.charset'} eq 'ISO-8859-1' ||
                uc $self->{'.charset'} eq 'WINDOWS-1252';
    if ($latin) {  # bug in some browsers
        $toencode =~ s{'}{&#39;}gso;
        $toencode =~ s{\x8b}{&#8249;}gso;
        $toencode =~ s{\x9b}{&#8250;}gso;
        if (defined $newlinestoo && $newlinestoo) {
            $toencode =~ s{\012}{&#10;}gso;
            $toencode =~ s{\015}{&#13;}gso;
        }
    }
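
To see why the charset setting matters for this branch, here is a minimal standalone sketch. escape_html() is a hypothetical helper written for illustration, not CGI.pm's own function; it only mirrors the Latin-only substitutions in the excerpt above. The byte pair \xc5\x9b is the UTF-8 encoding of U+015B, so if UTF-8 output is treated as Latin-1, the second byte would presumably be rewritten and the character broken:

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; it mimics the Latin-only
# branch from the excerpt and is not part of CGI.pm.
sub escape_html {
    my ( $text, $charset ) = @_;

    # Basic escaping that applies regardless of charset.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;

    # Extra substitutions applied only for the Latin charsets.
    if ( uc($charset) eq 'ISO-8859-1' || uc($charset) eq 'WINDOWS-1252' ) {
        $text =~ s/'/&#39;/g;
        $text =~ s/\x8b/&#8249;/g;
        $text =~ s/\x9b/&#8250;/g;
    }
    return $text;
}

my $utf8_bytes = "\xc5\x9b";    # UTF-8 bytes for U+015B

my $as_utf8  = escape_html( $utf8_bytes, 'utf-8' );
my $as_latin = escape_html( $utf8_bytes, 'iso-8859-1' );

print $as_utf8 eq $utf8_bytes ? "utf-8: bytes untouched\n" : "utf-8: bytes changed\n";
print $as_latin eq $utf8_bytes ? "latin: bytes untouched\n" : "latin: bytes changed\n";
```

With the charset set to utf-8 the two bytes pass through unchanged; with a Latin charset the \x9b byte is replaced by an entity, which splits the multi-byte character.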





2008/10/22 Rhesa Rozendaal [EMAIL PROTECTED]:
 Silent wrote:

 When I run into a bad character problem, I use this:
 $cap->query()->charset('utf-8')  in cgiapp_init()

 Yeah, that's also pretty much a prerequisite for outputting proper
 utf8-encoded pages. However, it only affects output AFAIK, so you need both
 pieces to this puzzle.

 rhesa

 #  CGI::Application community mailing list  
 ####
 ##  To unsubscribe, or change your message delivery options,  ##
 ##  visit:  http://www.erlbaum.net/mailman/listinfo/cgiapp##
 ####
 ##  Web archive:   http://www.erlbaum.net/pipermail/cgiapp/   ##
 ##  Wiki:  http://cgiapp.erlbaum.net/ ##
 ####
 







Re: [cgiapp] Re: utf8 form processing

2008-10-21 Thread Mike Tonks
Hi Rhesa,

Yes, I tried the -utf8 switch for the CGI module, and while it didn't
break the code in any way, it simply didn't seem to do anything.

I did wonder if it could be the use of require instead of use, but I
don't really understand the difference and / or how this affects C::A.

Seems like your code to avoid decoding file uploads would be a good
addition to CGI.pm though. I got the impression the -utf8 switch decodes
all params and destroys file uploads if used in this way.

mike





Re: [cgiapp] Re: utf8 form processing

2008-10-21 Thread Michael Peters

Mike Tonks wrote:


Yes, I tried the -utf8 switch for the CGI module, and while it didn't
break the code in any way, it simply didn't seem to do anything.


How were you doing this? CGI::Application loads CGI.pm by itself, so if your loading comes after
that, it won't override what was already done. Since you were using require, it's quite possible
that your -utf8 flag was ignored because CGI.pm had already been loaded by C::A.



I did wonder if it could be the use of require instead of use, but I
don't really understand the difference and / or how this affects C::A.


Not to be too mean, but this is a pretty fundamental thing to understand.
use == compile time
require == run time

This means that when you say use CGI, it happens as soon as perl *parses* that statement. When you
say require CGI, it happens as soon as perl *executes* that statement. If you're using CGI on every
request then there's no reason to do it via require. In fact, unless you're conditionally loading a
module, there's no reason (unless you're doing something sufficiently magical) to use require.
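
The timing difference can be seen in a small sketch. It uses only core modules (List::Util and Scalar::Util) as stand-ins, chosen purely for illustration. The point it tries to show is that use runs at compile time and calls import(), while a bare require runs later and never calls import(), so flags or export lists only reach a module through use (or an explicit import call):

```perl
use strict;
use warnings;

our @order;

# An ordinary statement: runs only when execution reaches it.
push @order, 'run time';

# require: loads the module when this statement executes. import() is
# never called, so nothing is exported into this package.
require List::Util;
my $first_imported = defined &main::first ? 1 : 0;

# use: roughly BEGIN { require Scalar::Util; Scalar::Util->import('blessed') },
# so the import has already happened before any run-time statement above.
use Scalar::Util qw(blessed);
my $blessed_imported = defined &main::blessed ? 1 : 0;

# BEGIN blocks also run at compile time, before 'run time' is pushed.
BEGIN { push @order, 'compile time' }

print join( ', ', @order ), "\n";
print "first imported: $first_imported\n";
print "blessed imported: $blessed_imported\n";
```

Although the BEGIN block comes last in the file, its entry ends up first in @order, and only the module pulled in with use has its function imported.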


--
Michael Peters
Plus Three, LP






Re: [cgiapp] Re: utf8 form processing

2008-10-21 Thread Silent
When I run into a bad character problem, I use this:
$cap->query()->charset('utf-8')  in cgiapp_init()





Re: [cgiapp] Re: utf8 form processing

2008-10-21 Thread Rhesa Rozendaal

Silent wrote:

When I run into a bad character problem, I use this:
$cap->query()->charset('utf-8')  in cgiapp_init()


Yeah, that's also pretty much a prerequisite for outputting proper 
utf8-encoded pages. However, it only affects output AFAIK, so you need both 
pieces to this puzzle.
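
A minimal sketch of the two pieces, using only the core Encode module rather than CGI.pm itself. The variable names and the hard-coded byte string are assumptions for illustration; in a real application the bytes would come from the submitted form:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Input side: raw bytes as they might arrive from a form field.
my $raw  = "na\xc3\xafve";        # UTF-8 bytes for "naive" with U+00EF
my $name = decode_utf8($raw);     # now a character string

printf "%d bytes in, %d characters after decoding\n",
    length $raw, length $name;

# Work with characters inside the application.
my $greeting = "Hello, $name";

# Output side: declare the charset and encode back to bytes.
my $out = encode_utf8($greeting);
print "Content-Type: text/html; charset=utf-8\n\n";
print $out, "\n";
```

Setting the charset covers the header on the way out; decoding the parameters covers the way in.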


rhesa





Re: [cgiapp] Re: utf8 form processing

2008-10-20 Thread Rhesa Rozendaal

Mark Stosberg wrote:

On Wed, 15 Oct 2008 17:11:34 +0200
Rhesa Rozendaal [EMAIL PROTECTED] wrote:


Mike Tonks wrote:

Hi All,

I recently encountered the dreaded utf8 funny characters, again.  This
time on the input data coming from form entry fields.


Here's what I use:


[...]
  my $might_decode = sub {
      my $p = shift;
      return ( !$p || ( ref $p && fileno($p) ) )
          ? $p
          : eval { decode_utf8($p) } || $p;
  };


That looks useful, Rhesa.

Is there a variation of it that makes sense to submit as patch for CGI.pm?


I hadn't considered that. The more recent -utf8 looks like it does the same 
thing:


# in CGI->param
  my @result = @{$self->{param}{$name}};

  if ($PARAM_UTF8) {
    eval "require Encode; 1;" unless Encode->can('decode'); # bring in these functions
    @result = map {ref $_ ? $_ : Encode::decode(utf8=>$_) } @result;
  }

The only differences I can see are that
* I don't try to decode false values
* I do try to decode values that are references, unless they are filehandles (have a fileno)
* I wrap the decode in an eval

I have a hard time imagining the first two would break Mike's code, but he 
said it didn't work for him. Would it have been the lack of eval?
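
For reference, here is how the $might_decode closure from earlier in the thread behaves on a few inputs. This is only a sketch: the sample values are made up, and a temporary file from File::Temp stands in for an upload filehandle.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use File::Temp qw(tempfile);

# Same logic as the closure posted earlier in the thread.
my $might_decode = sub {
    my $p = shift;
    return ( !$p || ( ref $p && fileno($p) ) )
        ? $p
        : eval { decode_utf8($p) } || $p;
};

# Ordinary form value: UTF-8 bytes become a character string.
my $bytes = "caf\xc3\xa9";
my $text  = $might_decode->($bytes);
printf "%d bytes -> %d characters\n", length $bytes, length $text;

# False values are handed back untouched.
my $zero = $might_decode->('0');

# A real filehandle (stand-in for a file upload) is passed through as is.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
my $same_handle = $might_decode->($fh) == $fh ? 1 : 0;
print "filehandle passed through: $same_handle\n";
```

The filehandle check is what keeps uploads from being run through the decoder.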


rhesa





[cgiapp] Re: utf8 form processing

2008-10-16 Thread Mark Stosberg
On Wed, 15 Oct 2008 17:11:34 +0200
Rhesa Rozendaal [EMAIL PROTECTED] wrote:

 Mike Tonks wrote:
  Hi All,
  
  I recently encountered the dreaded utf8 funny characters, again.  This
  time on the input data coming from form entry fields.
  
  It's CGI.pm that actually does the processing, and needs to read the
  stream as utf8.  There is a flag for this, but I couldn't get that to
  work, so as a temporary measure I read all the parameters and pass
  them through decode_utf8.  Does anyone have a better method?
 
 Here's what I use:
 
 package CGI::as_utf;

 BEGIN
 {
     use strict;
     use warnings;
     use CGI;
     use Encode;

     {
         no warnings 'redefine';
         my $param_org = \&CGI::param;

         my $might_decode = sub {
             my $p = shift;
             return ( !$p || ( ref $p && fileno($p) ) )
                 ? $p
                 : eval { decode_utf8($p) } || $p;
         };

         *CGI::param = sub {
             my $q = $_[0];    # assume object calls always
             my $p = $_[1];

             goto $param_org if scalar @_ != 2;

             return wantarray
                 ? map { $might_decode->($_) } $q->$param_org($p)
                 : $might_decode->( $q->$param_org($p) );
         }
     }
 }

 1;
 
 This does the right thing for file uploads, as well as handling scalar and 
 list context.

That looks useful, Rhesa.

Is there a variation of it that makes sense to submit as patch for CGI.pm?

Mark
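
The wrapping technique in the module quoted above can be tried without CGI.pm. In the sketch below, My::Query is a hypothetical stand-in class with a param() method, invented for illustration; it is not the CGI.pm API. The wrapper keeps a reference to the original method, replaces it through the symbol table, and uses wantarray to handle list and scalar context:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

{
    # Hypothetical stand-in for a query object; not CGI.pm.
    package My::Query;

    sub new {
        my ( $class, %data ) = @_;
        my $self = {%data};
        return bless $self, $class;
    }

    sub param {
        my ( $self, $name ) = @_;
        my @values = @{ $self->{$name} || [] };
        return wantarray ? @values : $values[0];
    }
}

{
    no warnings 'redefine';
    my $orig = \&My::Query::param;    # keep the original method

    # Replace the method with a decoding wrapper.
    *My::Query::param = sub {
        my ( $self, $name ) = @_;
        return wantarray
            ? map { decode_utf8($_) } $self->$orig($name)
            : decode_utf8( scalar $self->$orig($name) );
    };
}

my $q = My::Query->new( city => [ "K\xc3\xb6ln", "Z\xc3\xbcrich" ] );

my @all   = $q->param('city');    # list context: every value, decoded
my $first = $q->param('city');    # scalar context: first value, decoded

printf "%d values, first has %d characters\n", scalar @all, length $first;
```

Unlike the original, this sketch decodes every value unconditionally, so it leaves out the filehandle and false-value checks.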







Re: [cgiapp] Re: utf8 form processing

2008-10-16 Thread Mike Tonks
That would be nice indeed, and perhaps a little switch in C::A to enable it?

2008/10/16 Mark Stosberg [EMAIL PROTECTED]:
 [...]
 That looks useful, Rhesa.

 Is there a variation of it that makes sense to submit as patch for CGI.pm?


