On Tue, Aug 10, 2010 at 12:44 AM, Johan Vromans <[email protected]> wrote:
> Bill Moseley <[email protected]> writes:
>
> > I'm curious about that. What are the portability issues? Are you
> > rendering to browsers that do not support, say, utf8?
>
> No, it's a server problem, depending on version and configuration
> options. See e.g. the Apache AddDefaultCharset config option. As a
> result, HTML served may be provided with a 'charset=iso-8859-1' or
> 'charset=utf-8'. Or none. Sometimes the charset <meta> tag is obeyed,
> sometimes it is not.
>

It's a server configuration error. If you are serving static files, make
sure they are encoded correctly and then set AddDefaultCharset to match.
That can be done in .htaccess if you don't have access to the server
config. For dynamic content, set a Content-Type header with a charset.

If you have characters in Perl and then you send them over the wire as
octets, then they have to be encoded into something, right? And if you
encode, you must say what the charset is, or else the octets are just a
string of bits to the client. In other words, your documents are encoded
in something, so you need to get the web server to tell the client what
that encoding is.

Anyway, I'm wondering if the template is the correct place to do what you
are asking. It makes sense to "escape" < and > in the templates as they
have special meaning, but it seems like you really want to *encode* the
entire HTML response content into a given charset (which you should
always do anyway).

So, after calling process(), you Encode the output into the encoding you
want to send, which must agree with what the web server is saying.

Encode will even do your entities, if you really want to encode to ASCII:

$ echo "hello is привет" | perl -MEncode -lne 'print Encode::encode( "ascii", Encode::decode_utf8($_), Encode::HTMLCREF )'
hello is &#1087;&#1088;&#1080;&#1074;&#1077;&#1090;

But, again, the client needs to know what encoding your content is
encoded in, so you might as well encode to utf8 and just tell the client
it's utf8 instead of ascii.

$ echo "hello is привет" | perl -MEncode -lne 'print Encode::encode( "utf8", Encode::decode_utf8($_), Encode::HTMLCREF )'
hello is привет

> RedHat started adding 'AddDefaultCharset UTF-8' a couple of years ago to
> the distributed server configs. Not funny.
>

That seems like a reasonable default. If files are ASCII on disk then
they are fine. And utf8 would be a good bet otherwise, as the locale was
probably utf8, too.

>
> Therefore I adopted the habit to always use &entities; for anything
> non-ASCII.
>

Seems so last century. ;)

-- 
Bill Moseley
[email protected]
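
P.S. A minimal sketch of the "encode after process()" step described
above, using only the core Encode module. The encode_response() helper
and the sample string are made up for illustration (they are not part of
Template Toolkit), and the commented-out process() call assumes you
render into a scalar rather than straight to STDOUT:

```perl
use strict;
use warnings;
use utf8;                 # this source file contains literal UTF-8 text
use Encode qw(encode);

# Hypothetical helper (not part of Template Toolkit): turn the character
# string produced by process() into octets plus a matching header.
sub encode_response {
    my ($html, $charset) = @_;
    $charset ||= 'UTF-8';

    # Characters the target charset cannot represent become numeric
    # character references (&#NNNN;) instead of being dropped or mangled.
    my $octets = encode($charset, $html, Encode::FB_HTMLCREF());

    # The header names the same charset that was used for encoding.
    my $header = "Content-Type: text/html; charset=$charset\r\n"
               . "Content-Length: " . length($octets) . "\r\n\r\n";
    return ($header, $octets);
}

# In a real handler $html would come from something like:
#   $tt->process($template, $vars, \my $html) or die $tt->error;
my $html = "<p>hello is привет</p>\n";

my ($header, $body) = encode_response($html, 'UTF-8');
binmode STDOUT;           # we are printing octets, not characters
print $header, $body;
```

Passing 'ascii' instead of 'UTF-8' makes the same call fall back to
&#NNNN; references for the Cyrillic characters, but the header then has
to say ascii as well, which is the whole point: the declared charset and
the actual encoding must match.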
_______________________________________________
templates mailing list
[email protected]
http://mail.template-toolkit.org/mailman/listinfo/templates
