On 11/07/12 06:32, Philippe Verdy wrote:
2012/7/10 Naena Guru <[email protected]>
I wanted to see how hard it is to edit a page in Notepad. So I
made a copy of my LIYANNA page and replaced the character entities
I used for Unicode Sinhala, accented Pali and Sanskrit with their
raw letters. Notepad forced me to save the file in UTF-8 format. I
ran it through W3C Validator. It passed HTML5 test with the
following warning:
#
Warning Byte-Order Mark found in UTF-8 File.
#
The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is
known to cause problems for some text editors and older
browsers. You may want to consider avoiding its use until it
is better supported.
The BOM is the first character of the file. There are myriad hoops
that non-Latin users go through to do things that we routinely do.
This problem I saw right at the inception. I already know why
romanizing is so good. Don't you?
You should probably ignore this non-critical warning now; it only
matters for strict compatibility with obsolete software that should
have been updated long ago for obvious security and performance
reasons.
There are a few cases where a BOM may cause trouble.
For example, PHP has a header() function that lets you redirect
to another page.
If you create a PHP document containing
<?php
header("location:http://unicode.org");
?>
you will be redirected to the Unicode website.
No output may be sent before header() is called. Otherwise you get an
error message.
If your document contains
text
<?php
header("location:http://unicode.org");
?>
"text" will be sent and you'll get an error message such as:
text Warning: Cannot modify header information - headers already sent
by (output started at /customers/0/1/f/colson.eu/httpd.www/test.php:2)
in /customers/0/1/f/colson.eu/httpd.www/test.php on line 3
If your document only contains
<?php
header("location:http://unicode.org");
?>
but you save it with a BOM, the BOM is sent as output before header()
runs and you'll get an error message like
Warning: Cannot modify header information - headers already sent by
(output started at /customers/0/1/f/colson.eu/httpd.www/test.php:1) in
/customers/0/1/f/colson.eu/httpd.www/test.php on line 2
(tested with Firefox 13.0 and Google Chrome 20.0.1132.47 on Ubuntu 12.04.)
On Windows, Notepad++ <http://notepad-plus-plus.org/> lets you choose
among different encodings, including UTF-8 and UTF-8 without BOM. Always
choose the latter and you'll avoid such problems.
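
If a file has already been saved with a BOM, it can also be checked and cleaned outside the editor. The following is only a minimal sketch in Python, not something I tested on the server above. The helper names has_utf8_bom and strip_utf8_bom are my own, and it assumes the file is small enough to read into memory. It relies on the fact that the UTF-8 BOM is the three bytes EF BB BF at the very start of the file.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM (bytes EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file without its leading UTF-8 BOM.

    Returns True if a BOM was removed, False if the file had none.
    Hypothetical helper: reads the whole file into memory.
    """
    p = Path(path)
    data = p.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        p.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False
```

Calling strip_utf8_bom("test.php") on the third example should leave a file that begins directly with "<?php", so nothing is output before header() is called.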