Sorry, I'm a newbie in Python. I can't help you further; indeed, I don't know either. :)
2005/12/23, David Xiao [EMAIL PROTECTED]:
Hi Kuan:
Thanks a lot! One more question here: how do I write this if I want to specify a locale other than the current locale?
For example, running on a Korean-locale system, and trying to read a
FYI: I have just received something from a friend; he gave me the following
nice example!
I have one more question on this: how do I write this if I want to specify
a locale other than the current locale? For example, the program runs on a
Korean-locale system and tries to read a UTF-8 file that stores Chinese
characters.
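A minimal sketch of one way to approach this, assuming the real concern is decoding the file and encoding the output rather than the locale itself: the decoding is controlled by the encoding name passed when opening the file, so it should not depend on the system locale. The scratch file path and the sample string below are hypothetical test data, not taken from the thread.

```python
import io
import os
import tempfile

# Two Chinese characters, used only as hypothetical test data.
sample = u"\u4e2d\u6587"

# Write the sample to a scratch file as UTF-8 (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(sample)

# Read it back with an explicit encoding; the system locale is not consulted.
with io.open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Choose the output encoding explicitly instead of relying on the console.
as_gbk = text.encode("gbk")
# "replace" substitutes '?' for anything the target encoding cannot represent.
as_euc_kr = text.encode("euc-kr", "replace")

print(repr(as_gbk))
print(repr(as_euc_kr))
```

Whether the console can then display those bytes is a separate matter that depends on the terminal, which this sketch does not address.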
2005/12/23, David Xiao [EMAIL PROTECTED]:
Hi Kuan:
Thanks a lot! One more question here: how do I write this if I want to
specify a locale other than the current locale?
For example, running on a Korean-locale system, and trying to read a
UTF-8
UTF-8 shouldn't need a BOM, as it is designed for character streams, and
there is only one logical ordering of the bytes. Only UTF-16 and greater
should output a BOM, AFAIK.
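To make the byte-order point concrete, here is a small sketch using the BOM constants from the standard `codecs` module. It only illustrates that U+FEFF has a single UTF-8 byte sequence but two possible UTF-16 byte orders; the variable names are made up for the example.

```python
import codecs

bom_char = u"\ufeff"  # the code point used as the byte order mark

# UTF-8 has one fixed byte sequence for U+FEFF, so it carries no ordering info.
utf8_bytes = bom_char.encode("utf-8")

# UTF-16 has two possible byte orders, which is what the BOM distinguishes.
utf16_le_bytes = bom_char.encode("utf-16-le")
utf16_be_bytes = bom_char.encode("utf-16-be")

print(repr(utf8_bytes), utf8_bytes == codecs.BOM_UTF8)
print(repr(utf16_le_bytes), utf16_le_bytes == codecs.BOM_UTF16_LE)
print(repr(utf16_be_bytes), utf16_be_bytes == codecs.BOM_UTF16_BE)
```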
--
http://mail.python.org/mailman/listinfo/python-list
John Bauman wrote:
UTF-8 shouldn't need a BOM, as it is designed for character streams, and
there is only one logical ordering of the bytes. Only UTF-16 and greater
should output a BOM, AFAIK.
However there's a pending patch (http://bugs.python.org/1177307) for a
new encoding named
John Bauman wrote:
UTF-8 shouldn't need a BOM, as it is designed for character streams, and
there is only one logical ordering of the bytes. Only UTF-16 and greater
should output a BOM, AFAIK.
Yes and no. Yes, UTF-8 does not need a BOM to identify endianness. No,
usage of the BOM with UTF-8
Hi Friends:
fileObj = codecs.open(filename, "r", "utf-8")
u = fileObj.read()  # Returns a Unicode string from the UTF-8 bytes in the file
print u
It reports this error:
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0:
illegal multibyte
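A sketch of what seems to be going on here, assuming the file was saved with a leading UTF-8 BOM: the plain "utf-8" codec decodes those three bytes to the character U+FEFF and leaves it in the string, and GBK has no encoding for that character. The helper `strip_bom` is a hypothetical name, not something from the thread; it removes the mark only when it is actually present.

```python
import codecs

def strip_bom(text):
    """Drop one leading U+FEFF if present; otherwise return text unchanged."""
    if text.startswith(u"\ufeff"):
        return text[1:]
    return text

# Simulate the contents of a UTF-8 file that starts with a BOM (test data).
raw = codecs.BOM_UTF8 + u"\u4e2d\u6587".encode("utf-8")

decoded = raw.decode("utf-8")   # the BOM survives as the first character
cleaned = strip_bom(decoded)    # safe to call even when there is no BOM

# Encoding the BOM character to GBK is what triggers the reported error.
try:
    decoded.encode("gbk")
    gbk_failed = False
except UnicodeEncodeError:
    gbk_failed = True

print(repr(decoded[0]), len(decoded), len(cleaned), gbk_failed)
```

Checking with `startswith` avoids losing the first real character when a file happens to have no BOM.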
import codecs

def read_utf8_txt_file(filename):
    fileObj = codecs.open(filename, "r", "utf-8")
    content = fileObj.read()
    content = content[1:]  # exclude BOM (assumes the file starts with one)
    print content
    fileObj.close()

read_utf8_txt_file("e:\\u.txt")
22 Dec 2005 18:12:28 -0800, [EMAIL PROTECTED] [EMAIL PROTECTED]:
Hi Friends:
fileObj =