Tony Mechelynck wrote:
> As for using UTF-8 with BOM, 
> I have no statistics on it about what other people do, but I found it to 
> be (as the FAQ quoted above said) an excellent signature to mean that a 
> file is in UTF-8. This ought not to conflict with shell scripts, which 
> cannot have any BOM but are (normally) in 7-bit ASCII.

UTF-8 came after UCS-2, so there must have been a good reason to 
introduce a new encoding. By design, UTF-8 overcomes some problems:

1. Zero-terminated string compatibility: there is no zero byte '\0' in 
the middle of a string. (UCS-2 text normally has '\0' bytes inside the 
string, which breaks many existing functions.)
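
To illustrate point 1, here is a minimal Python sketch. It assumes 
"utf-16-le" as a stand-in for UCS-2 (the two agree for characters in the 
Basic Multilingual Plane), and c_strlen is a hypothetical helper that only 
imitates how a C function like strlen() stops at the first zero byte.

```python
def c_strlen(buf: bytes) -> int:
    """Imitate C strlen(): count bytes before the first zero byte."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx


text = "Vim"  # sample string, chosen only for illustration

# Assumption: UTF-16LE stands in for UCS-2 (identical for BMP characters).
ucs2 = text.encode("utf-16-le")
utf8 = text.encode("utf-8")

# Every ASCII character carries a zero high byte in UCS-2, so a
# zero-terminated string function sees the text end after one byte.
ucs2_len = c_strlen(ucs2)

# UTF-8 never produces a zero byte unless the text contains U+0000.
utf8_len = c_strlen(utf8)

print(ucs2, ucs2_len)
print(utf8, utf8_len)
```

With the UCS-2 bytes the helper stops after the first byte, while with 
UTF-8 it sees the whole string.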

2. Unix pipe compatibility: a file can be split into several files, and 
several files can be concatenated to form a new file. (If a file contains 
a BOM, then when it is split in two, the second file has no BOM. If two 
files each contain a BOM, then when they are concatenated, the new file 
has a BOM in the middle of its content.) The BOM makes it impossible to 
handle a text stream properly, hence you should *not* use a BOM on 
Unix-like systems.
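
Both cases from point 2 can be sketched in a few lines of Python. The 
file contents here are made up for illustration, and plain byte 
concatenation and slicing stand in for what cat and split would do to 
real files.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF

# Two hypothetical files, each saved as UTF-8 with a BOM.
file_a = BOM + "first\n".encode("utf-8")
file_b = BOM + "second\n".encode("utf-8")

# Concatenation, as `cat a b` would do it byte for byte.
joined = file_a + file_b

# The "utf-8-sig" codec strips a BOM only at the very start,
# so the second BOM survives as U+FEFF inside the text.
decoded = joined.decode("utf-8-sig")
stray_bom_inside = "\ufeff" in decoded

# Splitting: cut one BOM-prefixed file after its first line.
whole = BOM + b"line one\nline two\n"
cut = whole.index(b"\n") + 1
head, tail = whole[:cut], whole[cut:]

print(repr(decoded), stray_bom_inside)
print(head.startswith(BOM), tail.startswith(BOM))
```

After concatenation a stray U+FEFF sits in front of "second", and after 
splitting only the first piece still carries the signature.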


If you use Linux and insist on a BOM in UTF-8, you'll eventually hit a wall.



--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
