On Tue, 2012-06-26 at 18:35 +0200, A Kobame wrote:
> 4.)
> Everything that comes from Plack (bytes) should be decoded into
> internal Unicode characters when entering Mason. On this point the
> Stack Overflow answer is totally wrong (stated as coming from you, Jon).
> :)
>
> Wrong, because the %params are a Hash::MultiValue and not a simple
> %params as in good old Mason1. Unfortunately I have no idea
> where the best place is to decode every element from the query.
>
> Reading the Mason manual gives me the idea that it should be done
> somewhere in "render", in the content wrapping chain. But maybe
> I'm totally wrong.
>
> The basic idea in the next source fragment (from Stack Overflow) is
> partially OK.
>
>     around 'run' => sub {
>         my $orig = shift;
>         my $self = shift;
>
>         my %params = @_;
>         while (my ($key, $value) = each(%params)) {
>             $value = decode_utf8($value);
>         }
>         $self->$orig(%params);
>     }
>
> But it is probably taken from a Mason1 solution. For Mason2 it
> needs to be rewritten for Plack's Hash::MultiValue. In its current
> form it is (IMO) useless and wrong (decoding only blessed refs, not
> the keys & values)...
> In short, the plugin should decode bytes to characters:
> http://example.com/index?ááá=ééé&other=úúú regardless of how they
> arrive (GET/POST)

I had a closer look into this issue some weeks ago because I could not pass UTF-8 encoded input fields to my app while porting from Mason1 to Mason2.

As far as I remember, Plack::Request uses HTTP::Request, which is based on HTTP::Message. Plack::Request drops the charset information provided in the message and passes the values on as a Hash::MultiValue containing bytes.

From my point of view the damage is done in Plack::Request: HTML::Mason then has to deal with the bytes without knowing about the HTTP headers sent before, which is not possible. There are several candidates for the charset and therefore several ways to interpret the bytes.

As far as I remember, HTML 4.0 recommends UTF-8 encoding for URLs in %-notation. I am not sure whether this applies only to the path or to the GET parameters as well.
In POST requests the Content-Type has to declare the charset if it is not ISO-8859-1, so several different defaults are in play.

The first sections of this page seem to give a good overview of the problem:
http://wiki.apache.org/tomcat/FAQ/CharacterEncoding

To cut and paste some nice Unicode characters you can use
http://www.decodeunicode.org/ ☺

Regards,
Oliver Paukstadt

-- 
Oliver Paukstadt <pst...@sourcentral.org>

_______________________________________________
Mason-users mailing list
Mason-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mason-users
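
For illustration only, here is a minimal sketch of the decoding step discussed in this thread. It is not taken from Mason or Plack: the helper name decode_pairs is hypothetical, and the charset has to be supplied by the caller, which is exactly the information that is missing once only bytes are passed on. UTF-8 is used below purely as an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, not part of Mason or Plack.
# Takes a charset name and a flat list of byte strings (keys and values
# alternating, in the order Hash::MultiValue's flatten() returns them)
# and returns the same list decoded into Perl character strings.
# ASSUMPTION: the caller knows the charset; UTF-8 is only a guess.
sub decode_pairs {
    my ($charset, @bytes) = @_;
    # @bytes is a private copy, so decode() modifying its input in place
    # (which it may do when a CHECK value is given) is harmless here.
    # FB_CROAK makes malformed input die instead of being replaced silently.
    return map { decode($charset, $_, Encode::FB_CROAK) } @bytes;
}

# Example: the bytes of "ááá=ééé" as they would arrive in a query string.
my @decoded = decode_pairs('UTF-8',
    "\xC3\xA1\xC3\xA1\xC3\xA1", "\xC3\xA9\xC3\xA9\xC3\xA9");
```

With a Plack::Request object in $req, the result could presumably be turned back into a multi-value hash with Hash::MultiValue->new(decode_pairs('UTF-8', $req->parameters->flatten)). Because both keys and values pass through the same map, repeated keys keep all of their values, which the each(%params) loop in the quoted fragment cannot guarantee.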