On 11/07/12 06:32, Philippe Verdy wrote:
2012/7/10 Naena Guru <[email protected]>

    I wanted to see how hard it is to edit a page in Notepad. So I
    made a copy of my LIYANNA page and replaced the character entities
    I used for Unicode Sinhala, accented Pali and Sanskrit with their
    raw letters. Notepad forced me to save the file in UTF-8 format. I
    ran it through W3C Validator. It passed HTML5 test with the
    following warning:

        Warning: Byte-Order Mark found in UTF-8 File.

        The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is
        known to cause problems for some text editors and older
        browsers. You may want to consider avoiding its use until it
        is better supported.

    The BOM is the first character of the file. There are myriad hoops
    that non-Latin users go through to do things that we routinely do.
    This problem I saw right at the inception. I already know why
    romanizing is so good. Don't you?


You can probably ignore this non-critical warning now; it matters only for extremely strict compatibility with deprecated software that should have been updated long ago for obvious security and performance reasons.

There are still a few cases where a BOM can cause trouble.

For example, PHP has a function, header(), which lets you redirect to another page.

If you create a PHP document containing

<?php
header("Location: http://unicode.org");
?>

you will be redirected to the Unicode website.

No output may be sent before header() is called; otherwise you get an error message.
If your document contains

text
<?php
header("Location: http://unicode.org");
?>

"text" will be sent and you'll get an error message such as:

text ? Warning: Cannot modify header information - headers already sent by (output started at /customers/0/1/f/colson.eu/httpd.www/test.php:2) in /customers/0/1/f/colson.eu/httpd.www/test.php on line 3

If your document only contains

<?php
header("Location: http://unicode.org");
?>

but you save it with a BOM, the BOM is sent as output before header() runs, and you get an error message like:

Warning: Cannot modify header information - headers already sent by (output started at /customers/0/1/f/colson.eu/httpd.www/test.php:1) in /customers/0/1/f/colson.eu/httpd.www/test.php on line 2

(tested with Firefox 13.0 and Google Chrome 20.0.1132.47 on Ubuntu 12.04.)
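
To see why the BOM counts as output, it helps to look at the raw bytes of the file. Here is a minimal sketch in Python (not PHP, just a convenient way to inspect bytes); the helper name leading_output is my own invention. It returns whatever precedes the opening <?php tag, which PHP passes through verbatim before any code runs.

```python
import codecs

def leading_output(source: bytes) -> bytes:
    """Return the bytes that precede the first '<?php' tag.

    Anything outside PHP tags is emitted as-is, so these bytes are
    already "output" by the time header() is reached.
    """
    index = source.find(b"<?php")
    # With no opening tag at all, the whole file is literal output.
    return source if index == -1 else source[:index]

# The same script saved without and with a UTF-8 BOM (bytes EF BB BF).
clean = b'<?php\nheader("Location: http://unicode.org");\n?>\n'
with_bom = codecs.BOM_UTF8 + clean

print(leading_output(clean))     # b''
print(leading_output(with_bom))  # b'\xef\xbb\xbf'
```

The three BOM bytes are invisible in most editors, which is why the warning points at line 1 of a file that looks perfectly clean.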

On Windows, Notepad++ <http://notepad-plus-plus.org/> lets you choose among several encodings, including UTF-8 and UTF-8 without BOM. Always choose the latter and you'll avoid such problems.
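
If a file has already been saved with a BOM, it can also be cleaned up afterwards. A rough sketch in Python, assuming the file is small enough to read into memory; strip_bom is a hypothetical helper name, not part of any tool mentioned above.

```python
import codecs
from pathlib import Path

def strip_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM from the file in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, file is left untouched
    # Rewrite the file without the first three bytes (EF BB BF).
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

For example, strip_bom(Path("test.php")) would rewrite test.php only if it actually starts with a BOM.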
