Tony Mechelynck wrote:
> As for using UTF-8 with BOM,
> I have no statistics on it about what other people do, but I found it to
> be (as the FAQ quoted above said) an excellent signature to mean that a
> file is in UTF-8. This ought not to conflict with shell scripts, which
> cannot have any BOM but are (normally) in 7-bit ASCII.

UTF-8 came after UCS-2, so there had to be a good reason for introducing a new encoding. By design, UTF-8 overcomes several problems:

1. Compatibility with zero-terminated strings: no zero byte '\0' appears in the middle of a string. (UCS-2 text normally contains '\0' bytes inside the string, which breaks many existing functions.)
2. Compatibility with Unix pipes: a file can be split into several files, and several files can be concatenated to form a new file. (If a file contains a BOM and is split in two, the second file has no BOM. If two files each contain a BOM and are concatenated, the new file has a BOM in the middle of its content.)

The BOM makes it impossible to handle a text stream properly, hence you should *not* use a BOM on Unix-like systems. If you use Linux and insist on a BOM in UTF-8, you'll eventually hit the wall.

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
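
P.S. To make the two points about zero bytes and concatenation concrete, here is a rough Python sketch. It rests on a few assumptions of mine: Python has no "ucs-2" codec, so "utf-16-le" stands in for UCS-2 (the two agree for plain ASCII text), the sample strings are made up, and byte-string concatenation stands in for what `cat a b` would do.

```python
import codecs

text = "hello"

# Point 1: UTF-16 (stand-in for UCS-2, an assumption of this sketch)
# puts zero bytes inside ASCII text; UTF-8 does not.
utf16_bytes = text.encode("utf-16-le")
utf8_bytes = text.encode("utf-8")
has_nul_utf16 = b"\x00" in utf16_bytes
has_nul_utf8 = b"\x00" in utf8_bytes

# Point 2a: concatenate two BOM-prefixed UTF-8 "files" (made-up contents).
part_a = codecs.BOM_UTF8 + "first\n".encode("utf-8")
part_b = codecs.BOM_UTF8 + "second\n".encode("utf-8")
joined = part_a + part_b

# The "utf-8-sig" codec strips only a leading BOM, so the BOM of the
# second part survives as U+FEFF in the middle of the decoded text.
decoded = joined.decode("utf-8-sig")
stray_bom_count = decoded.count("\ufeff")

# Point 2b: split a BOM-prefixed "file" in two; the tail carries no BOM.
head, tail = part_a[:5], part_a[5:]
tail_has_bom = tail.startswith(codecs.BOM_UTF8)

print(has_nul_utf16, has_nul_utf8, stray_bom_count, tail_has_bom)
```

The stray U+FEFF in `decoded` is the kind of thing that trips up tools further down a pipe, since they see it as ordinary content rather than as a signature.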
