Re: print UTF-8 file with BOM

2005-12-23 Thread Kevin Yuan
Sorry, I'm a newbie in Python. I can't help you further; indeed, I don't know either. :) 2005/12/23, David Xiao [EMAIL PROTECTED]: Hi Kuan: Thanks a lot! One more question here: How to write if I want to specify locale other than current locale? For example, running on Korea locale system, and try read a

Re: print UTF-8 file with BOM

2005-12-23 Thread davihigh
FYI: I just received something from a friend, who gave me the following nice example! I have one more question on this: how should I write it if I want to specify a locale other than the current one? For example, the program runs on a Korean-locale system and tries to read a UTF-8 file that contains Chinese characters.

Re: print UTF-8 file with BOM

2005-12-23 Thread Carsten Haese
2005/12/23, David Xiao [EMAIL PROTECTED]: Hi Kuan: Thanks a lot! One more question here: How to write if I want to specify locale other than current locale? For example, running on Korea locale system, and try read a UTF-8

Re: print UTF-8 file with BOM

2005-12-23 Thread John Bauman
UTF-8 shouldn't need a BOM, as it is designed for character streams, and there is only one logical ordering of the bytes. Only UTF-16 and greater should output a BOM, AFAIK. -- http://mail.python.org/mailman/listinfo/python-list
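The byte-order point can be checked with the constants in the standard `codecs` module. This is only an illustrative sketch (written for current Python 3; the sample character is arbitrary): the UTF-8 BOM is always the same three bytes, while the UTF-16 BOM differs between the two byte orders and lets a decoder tell them apart.

```python
import codecs

text = u"A"

# UTF-8 has a single byte order, so its optional BOM is a fixed signature:
# U+FEFF encoded as UTF-8 is always these three bytes.
utf8_bom = codecs.BOM_UTF8
bom_char_as_utf8 = u"\ufeff".encode("utf-8")

# UTF-16 has two byte orders; the BOM in front says which one was used.
utf16_le = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
utf16_be = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

# The generic "utf-16" decoder reads the BOM, picks the byte order,
# and drops the BOM from the decoded text.
decoded_le = utf16_le.decode("utf-16")
decoded_be = utf16_be.decode("utf-16")
```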

Re: print UTF-8 file with BOM

2005-12-23 Thread Walter Dörwald
John Bauman wrote: UTF-8 shouldn't need a BOM, as it is designed for character streams, and there is only one logical ordering of the bytes. Only UTF-16 and greater should output a BOM, AFAIK. However there's a pending patch (http://bugs.python.org/1177307) for a new encoding named

Re: print UTF-8 file with BOM

2005-12-23 Thread Martin v. Löwis
John Bauman wrote: UTF-8 shouldn't need a BOM, as it is designed for character streams, and there is only one logical ordering of the bytes. Only UTF-16 and greater should output a BOM, AFAIK. Yes and no. Yes, UTF-8 does not need a BOM to identify endianness. No, usage of the BOM with UTF-8

print UTF-8 file with BOM

2005-12-22 Thread davihigh
Hi Friends:
    fileObj = codecs.open(filename, "r", "utf-8")
    u = fileObj.read()  # Returns a Unicode string from the UTF-8 bytes in the file
    print u
It reports this error: UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte

Re: print UTF-8 file with BOM

2005-12-22 Thread Kevin Yuan
    import codecs
    def read_utf8_txt_file(filename):
        fileObj = codecs.open(filename, "r", "utf-8")
        content = fileObj.read()
        content = content[1:]  # exclude BOM (assumes the file always starts with one)
        print content
        fileObj.close()
    read_utf8_txt_file("e:\\u.txt")
22 Dec 2005 18:12:28 -0800, [EMAIL PROTECTED] [EMAIL PROTECTED]: Hi Friends: fileObj =
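The slice `content[1:]` above drops the first character even when the file has no BOM. A hedged variant that removes the BOM only when it is actually present might look like the sketch below. It is written for current Python 3 (it also runs on Python 2.6+ through `io.open`), and the function name `read_utf8_text` is hypothetical, not something from the thread.

```python
import codecs
import io


def read_utf8_text(filename):
    """Read a UTF-8 file and drop a leading BOM only if one is present."""
    with io.open(filename, "r", encoding="utf-8") as file_obj:
        content = file_obj.read()
    # After decoding, the UTF-8 BOM bytes appear as the single
    # character U+FEFF at the start of the text.
    if content.startswith(u"\ufeff"):
        content = content[1:]
    return content
```

Later Python versions (2.5 and up) also ship a `utf-8-sig` codec that skips an optional leading BOM while decoding, so `io.open(filename, encoding="utf-8-sig")` is an alternative to the manual check.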