Re: [Mojolicious] PSGI and utf8 encoding

DryDuck Sun, 30 Aug 2015 11:09:40 -0700

You are unfortunately right. The rather silly behavior of 
Encode::Unicode is to blame.
When Encode::Unicode encounters a BOM in a UTF-8 encoded file it leaves it 
alone, while all other BOM types are removed. Talk about inconsistency.
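
To make the inconsistency concrete, here is a minimal sketch, assuming a 
stock Encode installation. The sample text and variable names are made up 
for illustration. It decodes a UTF-8 and a UTF-16 byte string that both 
start with a BOM and reports whether U+FEFF is still there afterwards.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $text = "hello";

# Build two byte strings that both start with a BOM.
my $utf8_bytes  = "\xEF\xBB\xBF" . encode('UTF-8', $text);  # BOM added by hand
my $utf16_bytes = encode('UTF-16', $text);                  # Encode writes the BOM itself

my $from_utf8  = decode('UTF-8',  $utf8_bytes);
my $from_utf16 = decode('UTF-16', $utf16_bytes);

# Report whether U+FEFF survived decoding in each case.
for my $case (['UTF-8', $from_utf8], ['UTF-16', $from_utf16]) {
    my ($name, $chars) = @$case;
    my $has_bom = $chars =~ /\A\x{FEFF}/ ? 'yes' : 'no';
    printf "%-6s leading U+FEFF after decode: %s (length %d)\n",
        $name, $has_bom, length $chars;
}
```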


Unfortunately I think that it will be very hard to get Encode::Unicode 
changed so that it also removes the BOM when working on UTF-8 encoded files.
That leaves it up to Mojolicious to do the right thing, namely to strip 
the UTF-8 BOM from the templates when loading them.
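
As a rough sketch of what that stripping could look like (this is not how 
Mojolicious loads templates; decode_template is a hypothetical helper name 
and the sample markup is invented), one could decode the bytes and then 
drop a single leading U+FEFF:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, not part of Mojolicious or Encode:
# decode UTF-8 bytes and remove one leading BOM if present.
sub decode_template {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);
    $chars =~ s/\A\x{FEFF}//;    # only a BOM at the very start is dropped
    return $chars;
}

my $with_bom    = decode_template("\xEF\xBB\xBF<p>hi</p>");
my $without_bom = decode_template("<p>hi</p>");

print $with_bom eq $without_bom ? "same\n" : "different\n";
```

Anchoring the substitution with \A means a U+FEFF anywhere else in the 
template is left untouched, since there it may be a legitimate zero width 
no-break space.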


/DryDuck

On Sunday, 30 August 2015 14:35:16 UTC+2, sri wrote:
>
>> The Unicode standard states that BOMs should be stripped before any 
>> processing of a unicode string is performed.
>>
>> Quote from 
>> http://www.unicode.org/versions/Unicode8.0.0/UnicodeStandard-8.0.pdf 
>> page 834
>>
>>> Systems that use the byte order mark must recognize when an initial 
>>> U+FEFF signals the
>>> byte order. In those cases, it is not part of the textual content and 
>>> should be removed before
>>> processing, because otherwise it may be mistaken for a legitimate zero 
>>> width no-break space.
>>
>>
>> That is why I think that there is a bug in Mojolicious. It does not strip 
>> the BOM from a template before it is processed.
>>
>
> This cannot be a bug in Mojolicious. Like almost everything else in the 
> Perl world, we do not deal with Unicode directly, and rely completely on 
> the Encode module.
>
> --
> sebastian 
>
