Sam Tregar <[EMAIL PROTECTED]> writes:

> As far as I know the character-set conversions are not necessary to
> achieve this goal, so they weren't included.

You are correct. No general-purpose HTML quoting function handles
internationalization, for two reasons:

* It's not necessary to achieve the primary purpose of quoting, which
  is to prevent HTML metacharacters from being interpreted as markup.

* It's extremely hard to implement without making simplistic
  assumptions. Handling of I18N text is highly context-dependent.

For example, it may seem "correct" to change the character 220 to
"&Uuml;". But what if the target template is in a different charset,
where 220 has a wholly different meaning?

Take code 169. In Latin 1 it is the copyright sign, with entities
"&copy;" and "&#169;". But in a Latin 2 HTML document, exactly the same
code represents the "S with caron" character, with entities "&Scaron;"
and "&#352;". In UTF-8, a lone byte with that value is an illegal
sequence. How is a quoting function to know whether to convert code
169 to "&copy;" or to "&Scaron;"?

A quoting function that tried to fully handle I18N would have to know
everything about charsets and HTML and the surrounding context. Doing
that kind of work for no gain is pointless. Doing the simple thing and
assuming Latin 1 is actually *harmful* for non-Latin 1 users.

_______________________________________________
Html-template-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/html-template-users
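
The code 169 ambiguity described in the message can be reproduced in a few lines. The following is only an illustrative sketch in Python, not the list's or HTML::Template's code: the name `quote_html` is hypothetical, and the set of characters it escapes is an assumption about what "HTML metacharacters" covers. It decodes the single byte 169 under three charsets, then shows a quoting function that touches only the metacharacters and leaves every other character alone.

```python
# One byte, three charsets: the same code 169 means different things.
raw = bytes([169])

latin1 = raw.decode("iso-8859-1")   # copyright sign, U+00A9
latin2 = raw.decode("iso-8859-2")   # S with caron, U+0160 (decimal 352)

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # A lone 0xA9 is a continuation byte with no lead byte before it.
    utf8_ok = False


def quote_html(text):
    """Hypothetical minimal quoting: escape only the HTML metacharacters.

    Characters outside this small set pass through unchanged, so the
    function needs no knowledge of the document's charset.
    """
    # '&' must be replaced first so the entities added below are not
    # escaped a second time.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))
```

Because `quote_html` never rewrites non-metacharacters, it gives the same protection against markup injection whether the surrounding template is Latin 1, Latin 2, or UTF-8, which is the point the message makes about leaving charset conversion out.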
