I would like to propose a patch for inclusion in HTML::Template that
adds Unicode support.

The 2.8 release of HTML::Template opens templates as raw files.
That means every byte is interpreted as an individual character.
If a parameter contains wide characters (katakana, or accented Latin
characters for example), then Perl upgrades the template bytes to
Unicode so they can be combined with the wide characters.  It does
this by interpreting each byte as a Latin-1 character.

If the template file already contains UTF-8, this breaks: the bytes
making up a UTF-8 character are fed to the Latin-1 => Unicode
transformation, and you end up with characters that are encoded twice.
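
The effect can be reproduced without HTML::Template at all.  The
following is only a minimal sketch using the core Encode module; the
variable names are made up for illustration, and the "template" is just
a two-byte string standing in for a file that was read raw.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# What a raw (byte-oriented) read of a UTF-8 template yields for
# e-acute: two bytes, 0xC3 0xA9, each seen by Perl as one character.
my $template_bytes = encode('UTF-8', "\x{e9}");

# A parameter holding a wide character (katakana KA, U+30AB).
my $param = "\x{30ab}";

# Concatenation upgrades the byte string as if it were Latin-1.
my $output = $template_bytes . $param;

# Encoding the result for output encodes the e-acute a second time.
my $wire = encode('UTF-8', $output);

# Decoding the template bytes first avoids the problem.
my $fixed = encode('UTF-8', decode('UTF-8', $template_bytes) . $param);

printf "double-encoded: %d bytes, correct: %d bytes\n",
    length($wire), length($fixed);
# prints "double-encoded: 7 bytes, correct: 5 bytes"
```

The two extra bytes are the second round of encoding applied to the
0xC3 and 0xA9 of the template.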

There are several ways to handle this situation:
    **  Demand that parameters supplied to template processing
        don't contain wide characters.  All parameters must have
        been processed by Encode::encode before template expansion.
        This is inconvenient, especially if the parameters are used
        in more than one place.

    **  Supply a filter subroutine to the template that will do UTF-8
        decoding after the template file has been read, as follows:

            use Encode;
            use HTML::Template;

            my $tmpl = HTML::Template->new (filename => 'test.tmpl',
                        filter => sub {
                            my $ref = shift;
                            # decode the raw template bytes in place
                            ${$ref} = Encode::decode_utf8(${$ref});
                        });

        This works, but is a bit ad hoc: it was not immediately obvious
        to me that the filter option is the place to make Unicode work.

    **  Add a feature to HTML::Template to specify the encoding
        of template files.
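
For comparison, the first workaround in the list might look roughly
like this.  It is a sketch only: encode_params is a hypothetical helper,
not part of HTML::Template or Encode, and it assumes every parameter
value is a plain scalar (loop data passed as array references would
need a recursive version).

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return a copy of the parameters with every
# value converted to UTF-8 bytes, so that they can be mixed safely
# with a template that was read as raw bytes.
sub encode_params {
    my (%params) = @_;
    my %encoded;
    for my $name (keys %params) {
        $encoded{$name} = encode('UTF-8', $params{$name});
    }
    return %encoded;
}

my %raw = encode_params(
    name => "\x{30ab}\x{30bf}",   # two katakana characters
    city => "Z\x{fc}rich",        # u-umlaut
);

# The encoded copy would then be handed to the template, e.g.
#   $tmpl->param(%raw);
```

The drawback mentioned above shows up immediately: the encoded copy is
only good for the template, so any other code that wants the same
values as character strings needs the originals.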

I cooked up a patch that adopts the last approach by adding an optional
"encoding" argument to the HTML::Template->new() constructor, like so:

    my $t = HTML::Template->new(
        filename => 'file.tmpl',
        encoding => ':encoding(UTF-8)');

The specified encoding is used not only for the template itself,
but also for any templates included from within the template.
The value is a PerlIO layer specification as described in
PerlIO(3perl).  This also works fine with templates stored in
encodings other than UTF-8 or Latin-1.
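
To illustrate what the layer changes, here is a small standalone
sketch of the same three-argument open() the patch uses.  The helper
slurp_with_layer and the temporary file are invented for this example
and are not part of the patch.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small template containing e-acute as two UTF-8 bytes.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':raw');
print {$fh} "caf\xC3\xA9 <TMPL_VAR NAME=x>\n";
close($fh) or die "close failed: $!";

# Hypothetical helper mirroring the open() call in the patch:
# the layer string is appended to the "<" read mode.
sub slurp_with_layer {
    my ($file, $layer) = @_;
    open(my $in, "<$layer", $file) or die "Cannot open $file: $!";
    local $/;                       # read the whole file at once
    my $text = <$in>;
    close($in);
    return $text;
}

my $as_bytes = slurp_with_layer($path, ':bytes');
my $as_chars = slurp_with_layer($path, ':encoding(UTF-8)');

# With the encoding layer the e-acute arrives as one character
# instead of two bytes, so the decoded text is one unit shorter.
printf "bytes: %d, characters: %d\n",
    length($as_bytes), length($as_chars);
```

With the default of ":bytes" the behaviour is the same as before the
patch, which is why existing callers should not notice a difference.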

The attached patch was made against 2.8, but applies to 2.9 with
a small offset.  For now, a larger version is available at:
http://www.xs4all.nl/~ekonijn/html-template-unicode.patch
(The larger version contains tests with a number of non-ASCII characters,
so it is tricky to send reliably over a mailing list.)

Regards,
Erik



diff -urN org/libhtml-template-perl-2.8/Template.pm new/libhtml-template-perl-2.8/Template.pm
--- org/libhtml-template-perl-2.8/Template.pm   2007-07-05 21:40:40.000000000 +0200
+++ new/libhtml-template-perl-2.8/Template.pm   2007-07-09 17:27:11.000000000 +0200
@@ -885,6 +885,15 @@
 HTML::Template will apply the specified escaping to all variables
 unless they declare a different escape in the template.
 
+=item * 
+
+encoding - Set this to the name of a perlio layer to be used when
+doing open() on the template or an included template; default is ":bytes".
+As an example, to read a template containing unicode:
+
+   my $template = HTML::Template->new(filename => 'zap.tmpl',
+                                      encoding => ':utf8');
+
 =back
 
 =back 4
@@ -949,6 +958,7 @@
                vanguard_compatibility_mode => 0,
                associate => [],
                path => [],
+              encoding => ':bytes',
                strict => 1,
                loop_context_vars => 0,
                max_includes => 10,
@@ -1635,8 +1645,9 @@
       $options->{filepath} = $filepath;   
     }
 
+    my $encoding = $options->{encoding};
     confess("HTML::Template->new() : Cannot open included file $options->{filename} : $!")
-        unless defined(open(TEMPLATE, $filepath));
+        unless defined(open(TEMPLATE, "<$encoding", $filepath));
     $self->{mtime} = $self->_mtime($filepath);
 
     # read into scalar, note the mtime for the record
@@ -2240,8 +2251,9 @@
        }
        die "HTML::Template->new() : Cannot open included file $filename : file not found."
          unless defined($filepath);
+       my $encoding = $options->{encoding};
        die "HTML::Template->new() : Cannot open included file $filename : $!"
-         unless defined(open(TEMPLATE, $filepath));              
+         unless defined(open(TEMPLATE, "<$encoding", $filepath));              
        
        # read into the array
        my $included_template = "";

