2005/12/23, David Xiao <[EMAIL PROTECTED]>:
Hi Kuan:
Thanks a lot! One more question: how should I write this if I want to
specify a locale other than the current one?
For example, when running on a system with a Korean locale and trying to
read a UTF-8 file that contains Chinese text.
Regards, David
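
One way to approach the question above, as a rough sketch in current Python 3 syntax rather than the Python 2 used in this thread. Python codecs are selected by encoding name, not by locale name, so "another locale" is assumed here to mean another target encoding, such as "gb18030" for Chinese or "euc_kr" for Korean. The helper name encode_text and the "replace" error policy are illustrative choices, not something taken from the thread.

```python
import locale

def encode_text(text, encoding=None):
    """Encode text for output; encode_text is a hypothetical helper name."""
    if encoding is None:
        # No encoding given: fall back to the one implied by the current locale.
        encoding = locale.getpreferredencoding(False)
    # "replace" substitutes "?" for characters the target encoding cannot
    # represent, instead of raising UnicodeEncodeError.
    return text.encode(encoding, "replace")

chinese = u"\u4e2d\u6587"                     # the word "Chinese", written in Chinese
as_gb18030 = encode_text(chinese, "gb18030")  # explicit encoding, independent of the locale
as_default = encode_text(chinese)             # whatever the current locale prefers
```

The resulting bytes can be written to a file opened in binary mode. A console running under a Korean locale may still be unable to display Chinese characters that its encoding lacks, so writing to a file in the chosen encoding is probably the safer route.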
2005/12/23, Kevin Yuan <[EMAIL PROTECTED]>:
> import codecs
>
> def read_utf8_txt_file(filename):
>     fileObj = codecs.open(filename, "r", "utf-8")
>     content = fileObj.read()
>     fileObj.close()
>     # Drop the BOM only if the file actually starts with one
>     if content.startswith(u"\ufeff"):
>         content = content[1:]
>     print(content)
>
> read_utf8_txt_file("e:\\u.txt")
>
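
As an alternative to slicing off the first character, Python 2.5 and later ship a "utf-8-sig" codec that skips a leading BOM when one is present and leaves BOM-less input untouched. A minimal sketch in current Python 3 syntax, assuming that codec is available (it postdates the Python release this thread was written against). The function returns the text instead of printing it, which is a design choice of this sketch:

```python
import io

def read_utf8_txt_file(filename):
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM only if the file has one.
    with io.open(filename, "r", encoding="utf-8-sig") as file_obj:
        return file_obj.read()

# The same codec applied to in-memory bytes, with and without a BOM:
with_bom = b"\xef\xbb\xbf" + u"\u4e2d\u6587".encode("utf-8")
without_bom = u"\u4e2d\u6587".encode("utf-8")
decoded_with_bom = with_bom.decode("utf-8-sig")
decoded_without_bom = without_bom.decode("utf-8-sig")
```

Both decoded strings come out identical, so the caller no longer needs to know whether the file was saved with a BOM.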
> 22 Dec 2005 18:12:28 -0800, [EMAIL PROTECTED] < [EMAIL PROTECTED]>:
> > Hi Friends:
> >
> > fileObj = codecs.open( filename, "r", "utf-8" )
> > u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
> > print u
> >
> > It fails with this error:
> > UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
> >
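
The error can be reproduced without a file, which helps show where the problem lies: the read succeeds, and the failure happens when print encodes the string for a GBK console, because GBK has no mapping for the BOM character U+FEFF. A small sketch in current Python 3 syntax; the variable names are illustrative:

```python
# What a plain "utf-8" decode yields for a file that starts with a BOM
text = u"\ufeff\u4e2d\u6587"

try:
    text.encode("gbk")
    bom_encodable = True
except UnicodeEncodeError:
    # Same failure as in the traceback: U+FEFF cannot be encoded as GBK.
    bom_encodable = False

# Once the leading BOM is removed, the remaining characters encode cleanly.
without_bom = text[1:] if text.startswith(u"\ufeff") else text
gbk_bytes = without_bom.encode("gbk")
```
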
> > I want to know how to read from a UTF-8 file, convert the text to a
> > specified locale's encoding (defaulting to the current system locale),
> > and print the resulting string. I would also like the BOM header to be
> > stripped automatically.
> >
> > Rgds, David
> >
> > --
> > http://mail.python.org/mailman/listinfo/python-list