Hi John,

I've been using libapreq, which has a charset method:
http://search.cpan.org/~joesuf/libapreq2-2.08/glue/perl/xsbuilder/APR/Request/Param/Param.pod#charset

It is fairly limited; it recognises only:

    0  APREQ_CHARSET_ASCII   (7-bit us-ascii)
    1  APREQ_CHARSET_LATIN1  (8-bit iso-8859-1)
    2  APREQ_CHARSET_CP1252  (8-bit Windows-1252)
    8  APREQ_CHARSET_UTF8    (utf8 encoded Unicode)

but this has been working fine for me on IE 6, IE 7, Firefox and Opera. I think (not sure) that these more modern browsers do try to respect the character set of the web page. I haven't tested it to the point where I'm certain that it works every time, but I've had no problems with it over the last year of use.

Don't forget the other part: if you put UTF-8 into the database, you may need to reset the UTF8 flag when you get the data back again. The new DBD::mysql driver has added this automatically, but I haven't tried it. I've been using my own wrapper on an older driver, which I know works. I'm not sure about other drivers, but (again) I "think" there is reasonable support for UTF-8 in the more popular ones.

Once you're happy that the data coming in and out of your system is UTF-8, it makes life a lot easier. Things like filtering input data with \w just work.

good luck

Clint

> Perl:
>   use Encode;
>   sub handler {
>     my $r=shift;
>     my $q=Apache2::Request->new($r);
>     my $known_to_be_utf8 = $q->param('test'); # form post doesn't give charset, none assumed
>     my $utf8_aware_string = decode_utf8( $known_to_be_utf8 );
>     ......
>     # the above works (we get our data back in one piece)
>     # and of course the HTML entities have been turned into UTF-8 chars
>   }
>
> I tried some form attributes:
>   enctype="multipart/form-data" - this doesn't specify a charset in
>     the content-type headers (tried IE6 and FF)
>   accept-charset="utf-8" - no change for me (as no charset
>     transformation required)
>
> So there's no way for the server to know what charset the parameters are
> in, the application has to know what to expect.
>
> Any thoughts?
>
> cheers
> John
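
To illustrate the \w point above, here is a minimal sketch using only the core Encode module. It assumes the application already knows the incoming bytes are UTF-8; decode_param is a hypothetical helper name, not part of libapreq or of either message in this thread.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode raw request bytes that the application
# already knows to be UTF-8. Returns undef if the bytes are not valid UTF-8.
sub decode_param {
    my ($bytes) = @_;
    return undef unless defined $bytes;
    # decode() with a CHECK value may modify its argument, so work on a copy.
    my $copy = $bytes;
    # FB_CROAK makes decode() die on malformed input; eval turns that into undef.
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
}

# The UTF-8 bytes for "café", as they might arrive in a form post.
my $raw  = "caf\xC3\xA9";
my $text = decode_param($raw);

printf "raw length:     %d\n", length $raw;    # counts bytes
printf "decoded length: %d\n", length $text;   # counts characters

# \w only treats the accented letter as a word character once the
# string has been decoded into Perl characters.
printf "raw matches \\w+:     %s\n", ($raw  =~ /^\w+$/ ? "yes" : "no");
printf "decoded matches \\w+: %s\n", ($text =~ /^\w+$/ ? "yes" : "no");
```

The same decode step is what "resetting the UTF8 flag" amounts to for data read back from a database driver that returns raw bytes.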