Jon,

I'm simply unable to help with the code. My Moose knowledge is minimal,
so I can't figure out what needs override, before, after and so on...
When I see the source, I can patch it. ;)

For utf8 - I know what needs to be done (and it is easy in "classic"
perl), but I have no idea how to do it in Mason2/Moose.

BTW, the StackOverflow answer (
http://stackoverflow.com/a/6807411/734304 ) is half wrong.

So, here is what should be done in Mason::Plugin::UTF8:

1.) The plugin should add into generated obj files:

use utf8;

The "use utf8;" ensures that the generated obj code is treated as utf8
source code. This is the absolute minimum. Without it, any component
containing wide chars will NOT work. It is already _possible_ to add
this by hand today, but regardless of that, the UTF8 plugin should add
it automatically.
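
To show what "use utf8;" changes, here is a minimal sketch in plain perl
(no Mason involved). The helper name literal_length is made up for the
illustration. It writes a tiny source file containing the UTF-8 bytes
of "á" as a string literal and compiles it with and without the pragma:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a one-expression source file whose literal
# holds the two UTF-8 bytes of "á", compile it, and return its length.
sub literal_length {
    my ($pragma) = @_;
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    binmode $fh;                                  # write raw bytes
    print {$fh} "$pragma\nlength(\"\xC3\xA1\");\n";
    close $fh or die "close failed: $!";
    return do $path;                              # value of the last expression
}

my $without = literal_length('');                 # source read as bytes
my $with    = literal_length('use utf8;');        # source read as UTF-8

print "without use utf8: $without\n";             # 2
print "with use utf8:    $with\n";                # 1
```

Without the pragma the literal is two bytes; with it, one character.
The same applies to literals inside a generated obj file.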

2.)
Probably most utf8-app developers will want to add the following too:
  use 5.012; # anything below is broken (use feature 'unicode_strings')
  use Encode qw(encode decode);
  use Unicode::Normalize qw(NFD NFC);
  use open qw(:utf8);

This can already be done with:
   override 'output_class_header' in Mason::Compilation, or
   by entering them into app/lib/App/Mason/Compilation.pm

So the above need not be the default in the plugin, because developers
may want to add other things too. This step is DONE already. :)

3.)
Everything that goes from Mason to Plack should be encoded (wide ->
bytes). So this is probably OK (from the StackOverflow answer):

after 'process_output' => sub {
    my ($self, $outref) = @_;
    $$outref = encode_utf8( $$outref );
};

I say PROBABLY because I have no idea why it helps to "do something
after" a method that does nothing by default... ;( This is from the
Mason source:

   method process_output ($outref) {
       # No-op by default
   }

The method does NOT output anything. So encoding after the nothing
will output something... It is wonderful, but unfortunately I need to
learn much, much more Moose.
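
One possible reading, judging only by the $outref argument in the
signature above: the hook gets a reference to the output buffer, so
code that runs after it can still rewrite the buffer in place. A minimal
sketch in plain perl, with a hand-rolled stand-in for the "after"
modifier (the sub names are made up, this is not Mason code):

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Stand-in for the no-op hook: it receives a reference to the buffer.
sub process_output {
    my ($self, $outref) = @_;
    return;                                # does nothing by default
}

# Hand-rolled equivalent of "after": run the original, then our code.
sub process_output_with_after {
    my ($self, $outref) = @_;
    process_output($self, $outref);
    $$outref = encode_utf8($$outref);      # rewrites the caller's buffer
    return;
}

my $buffer = "\x{e1}\x{e9}";               # two characters: á é
process_output_with_after(undef, \$buffer);
printf "%d bytes\n", length $buffer;       # 4 bytes
```

If this reading is right, the "after" does not depend on what the
original method outputs, only on sharing the same reference.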

For me, understanding the Mason sources is (for now) as hopeless as my
effort to understand the "calling the rain" ritual dance of a central
African desert tribe.


4.)
Everything that comes from Plack (bytes) should be decoded into
internal unicode characters when entering Mason. This part of the
StackOverflow answer is totally wrong (and stated as coming from you,
Jon). :)

Wrong, because the %params are a Hash::MultiValue and not a simple
%params as in good old Mason1. Unfortunately I have no idea where the
best place is to decode every element from the query.

Reading the Mason manual gives me the idea that this should be done
somewhere in "render", in the content wrapping chain. But maybe I'm
totally wrong.

The basic idea in the next source fragment (from StackOverflow) is
partially OK.

around 'run' => sub {
    my $orig = shift;
    my $self = shift;

    my %params = @_;
    while (my ($key, $value) = each(%params)) {
        $value = decode_utf8($value);
    }
    $self->$orig(%params);
}

But it is probably taken from a Mason1 solution. For Mason2 it needs
to be rewritten for Plack's Hash::MultiValue. In its current form it is
(IMO) useless and wrong (it decodes only blessed refs, not the keys &
values).
In short - the plugin should decode bytes to characters, e.g. for
http://example.com/index?ááá=ééé&other=úúú
regardless of how they arrive (GET/POST).
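
A minimal sketch of the decoding idea, in plain perl. The helper name
decode_pairs is made up, and where to hook it into Mason2 is still the
open question. It works on a flat (key, value, ...) list, which is the
shape that Hash::MultiValue's flatten returns and its new accepts. The
sketch assumes the percent-escapes are already unescaped to raw bytes:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper: decode every plain scalar in a flat
# (key, value, key, value, ...) list from UTF-8 bytes to characters.
# References and undef are passed through untouched.
sub decode_pairs {
    my @pairs = @_;
    return map { (defined($_) && !ref($_)) ? decode_utf8($_) : $_ } @pairs;
}

# Raw bytes as they would arrive for ?ááá=ééé&other=úúú
my @raw = (
    "\xC3\xA1\xC3\xA1\xC3\xA1" => "\xC3\xA9\xC3\xA9\xC3\xA9",
    "other"                    => "\xC3\xBA\xC3\xBA\xC3\xBA",
);

my @decoded = decode_pairs(@raw);

printf "key: %d chars, value: %d chars\n",
    length $decoded[0], length $decoded[1];    # key: 3 chars, value: 3 chars
```

Unlike the fragment above, this returns new keys and values instead of
assigning to a loop copy, and it keeps repeated keys because it never
goes through a plain hash.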


5.)
Poet/Mason should use only UTF8-safe CPAN modules, or it will get into
"duality" trouble.

One example from the current Mason::Filters: HTML::FillInForm is NOT
utf8 safe. (There is a UTF8 version too:
HTML::FillInForm::ForceUTF8)

So it would be nice to have a FillInFormForceUTF8 filter that uses
HTML::FillInForm::ForceUTF8, as is done in Catalyst:
https://metacpan.org/module/Catalyst::Plugin::FillInForm::ForceUTF8

So in short:
If someone is able to write a UTF8 __PLUGIN__ for Mason2, it needs to do
5 things:

1.) add "use utf8;" into every obj file (this should be done by the
plugin regardless of the rest)
2.) allow adding additional pragmas into the obj source - this is done already!
3.) encode everything going from Mason ---to---> Plack
4.) decode everything coming from Plack ---to---> Mason
5.) add a UTF8-safe FillInForm to the Filters (and check the other filters)

The above should be enough for a start. Users will comment and patch
it for more, e.g. _maybe_ setting the charset in the Content-Type to
UTF-8 (this can be done manually, no need to add it to the 1st plugin
version), and so on...

Anything that is "hidden" from the user, i.e. that lives in the Mason
or Poet sources themselves and not in the "plain components", should
work with utf8 characters and not bytes when the user installs the UTF8
Mason plugin. Happily all components are de facto perl sources, so the
above 5 things will cover 99% of needs (IMO).

And finally, somewhere in the docs it should be stated that component
names must match [a-zA-Z0-9_], because Moose will complain about an
invalid method name...

So it is impossible to map utf8 URLs directly to components (the only
way is using a dhandler and indirect processing).

E.g. when you want to have something like
http://cs.wikipedia.org/wiki/Česká_Wikipedie, you can't make a
component "Česká_Wikipedie.mc", because of perl limitations.

As I already said - unfortunately, I can help only this way.

Thanx,
ak.


On Tue, Jun 26, 2012 at 7:52 AM, Jonathan Swartz <swa...@pobox.com> wrote:
> This is clearly important to a lot of people, so it ought to be a plugin (at 
> least).
>
> Sadly I work in a utf8-ignorant world so I'll need help. Kobame, would you 
> help me come up with an initial plugin and I'll throw it onto CPAN? I noticed 
> the fragments of one here:
>
>    http://stackoverflow.com/questions/5858596/how-to-make-mason2-utf-8-clean
>
> but not sure if the code there represents your current thinking.
>
> Thanks,
> Jon
