All my upload forms set accept-charset="utf-8", and we expect uploaded filenames to contain wide characters.

The problem I hit is that ->basename does this:

    $ perl -le 'use Catalyst::Request::Upload; my $upload = Catalyst::Request::Upload->new( { filename => q[документ обучения.pdf] } ); print $upload->basename;'
    _.pdf

That's pretty mangled. The problem is that $upload->filename is not decoded, so the substitution is working on octets, not characters.

    sub _build_basename {
        my $self = shift;
        my $basename = $self->filename;
        $basename =~ s|\\|/|g;
        $basename = ( File::Spec::Unix->splitpath($basename) )[2];
        $basename =~ s|[^\w\.-]+|_|g;
        return $basename;
    }

Obviously, we want \w to work on characters, not encoded octets.

The filename should be decoded; it's character data. Does it make sense to do it in Engine's prepare_uploads? For example:

    my $u = Catalyst::Request::Upload->new(
        size     => $upload->{size},
        type     => scalar $headers->content_type,
        headers  => $headers,
        tempname => $upload->{tempname},
        filename => $c->_handle_unicode_decoding($upload->{filename}),  # <-- the change
    );

-- 
Bill Moseley
mose...@hank.org
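
To make the octets-versus-characters difference concrete, here is a minimal standalone sketch. It assumes the filename arrives as UTF-8 encoded octets and decodes it with the core Encode module. The function name sanitize_basename is hypothetical: it just lifts the body of _build_basename out of the class so it can be run without Catalyst. It does not show what _handle_unicode_decoding itself does.

```perl
use strict;
use warnings;
use utf8;                        # the string literal below is written as characters
use Encode qw(decode encode);
use File::Spec::Unix;

# Hypothetical helper: same steps as _build_basename, as a plain function.
sub sanitize_basename {
    my ($basename) = @_;
    $basename =~ s|\\|/|g;                                      # normalise Windows separators
    $basename = ( File::Spec::Unix->splitpath($basename) )[2];  # keep only the file part
    $basename =~ s|[^\w\.-]+|_|g;                               # collapse non-word runs to "_"
    return $basename;
}

# Assumption: this is what reaches the engine, i.e. UTF-8 encoded octets.
my $octets = encode( 'UTF-8', "документ обучения.pdf" );

# Undecoded: \w only sees bytes, so the whole Cyrillic run is replaced.
my $from_octets = sanitize_basename($octets);

# Decoded first: \w matches the Cyrillic letters as word characters.
my $from_chars = sanitize_basename( decode( 'UTF-8', $octets ) );

binmode STDOUT, ':encoding(UTF-8)';
print "octets:     $from_octets\n";
print "characters: $from_chars\n";
```

With the undecoded input this should reproduce the mangled _.pdf from the one-liner above, while the decoded input should keep the letters and replace only the space with an underscore.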

_______________________________________________
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/