On Sat, Aug 1, 2015 at 6:31 AM, Stefan <maill...@s.profanter.me> wrote:
> Hi,
>
> if a URL parameter contains a Unicode character (e.g.
> www.example.com/?param=%D6lso%DF which stands for param=Ölsoße), the
> parameter is not correctly parsed as Unicode.
>
> 4. This outputs for the example url: localhost:3000/?param=%D6lso%DF:
>
> > [debug] $VAR1 = {
> >   'param' => "\x{fffd}lso\x{fffd}e"
> > };
> > [debug] $VAR1 = '\x{d6}lso\x{df}e';
>
> As you can see, the first output only contains one equal character:
> \x{fffd} which is obviously not the same as it should be: \x{d6}lso\x{df}e

\x{fffd} is the Unicode replacement character used by Encode to replace
invalid UTF-8 sequences you are passing in.

Try this instead in your browser:

    ?param=Ölsoße

And then print $c->request->parameters->{param} -- and if you check
Encode::is_utf8( $param ) it should be true, too, indicating the param was
decoded correctly into characters.

Or if you prefer:

    $ perl -le 'use URI::Escape; print uri_escape( "Ölsoße" )'
    %C3%96lso%C3%9Fe

so, ?param=%C3%96lso%C3%9Fe

but most likely the browser will turn it back into ?param=Ölsoße

If you really want to say you are using utf8 constant strings (i.e. "use
utf8;"):

    $ perl -le 'use URI::Escape; use Encode; use utf8; print uri_escape( encode_utf8( "Ölsoße" ) )'
    %C3%96lso%C3%9Fe

or

    $ perl -le 'use URI::Escape; use Encode; use utf8; print uri_escape_utf8( "Ölsoße" )'
    %C3%96lso%C3%9Fe

All the same thing.

-- 
Bill Moseley
mose...@hank.org
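
For illustration, here is a minimal sketch of why the two URLs behave differently, using only the core Encode module. The percent_decode and show helpers are hypothetical names made up for this example (they are not part of Catalyst or URI::Escape), and it assumes the original %D6/%DF escapes were ISO-8859-1 bytes, with the trailing "e" taken from the debug output:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn %XX escapes back into raw bytes.
sub percent_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical helper: render non-ASCII characters as \x{...} escapes,
# similar to the debug output quoted above.
sub show {
    my ($str) = @_;
    return join '',
        map { ord($_) < 0x80 ? $_ : sprintf('\x{%x}', ord($_)) }
        split //, $str;
}

# Assumption: %D6 and %DF are the ISO-8859-1 bytes for "Ö" and "ß".
my $latin1_bytes = percent_decode('%D6lso%DFe');
# The same word percent-encoded from its UTF-8 bytes.
my $utf8_bytes   = percent_decode('%C3%96lso%C3%9Fe');

# Latin-1 bytes are not valid UTF-8, so each one becomes U+FFFD.
my $bad = decode('UTF-8', $latin1_bytes);
# Proper UTF-8 bytes decode to the intended characters.
my $good = decode('UTF-8', $utf8_bytes);
# Decoding the original bytes as what they really are also works.
my $from_latin1 = decode('ISO-8859-1', $latin1_bytes);

print show($bad),         "\n";   # \x{fffd}lso\x{fffd}e
print show($good),        "\n";   # \x{d6}lso\x{df}e
print show($from_latin1), "\n";   # \x{d6}lso\x{df}e
```

The first line printed matches the broken parameter from the debug log, and the second is what you get once the URL carries UTF-8 percent-escapes.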
_______________________________________________
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/