I am getting the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15: invalid start byte
when I try to read some files through TaggedCorpusReader. TaggedCorpusReader is a
corpus reader class in NLTK.
My files are saved in ANSI encoding, which is the MS-Windows default.
I am using Python 2.7 on MS-Windows 7.
So far I have tried the following options:
string.encode('utf-8').strip()
unicode(string)
unicode(str, errors='replace')
unicode(str, errors='ignore')
string.decode('cp1252')
But none of them has helped.
Could anyone kindly suggest a solution?
I am still trying on my side in the meantime.
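
To illustrate the problem, here is a minimal sketch that reproduces it without
NLTK. The sample bytes and the file name sample.pos are made up for the
example, and I am assuming that "ANSI" on my machine means cp1252. Byte 0x96 is
an en dash in cp1252 but is not a valid start byte in UTF-8:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical sample line: tagged text containing byte 0x96,
# which is an en dash in cp1252 (assumed to be the "ANSI" code page).
raw = b"The/DT 1990\x962000/CD period/NN"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


utf8_result = try_decode(raw, "utf-8")     # None: 0x96 cannot start a UTF-8 sequence
cp1252_result = try_decode(raw, "cp1252")  # decodes; 0x96 becomes U+2013

# The same applies when reading from disk: give the encoding when opening the file.
path = os.path.join(tempfile.mkdtemp(), "sample.pos")  # hypothetical file name
with io.open(path, "wb") as f:
    f.write(raw)
with io.open(path, "r", encoding="cp1252") as f:
    text_from_file = f.read()
```

So the decoding has to happen where the file is read, not on a string
afterwards. As far as I can tell from the NLTK documentation,
TaggedCorpusReader accepts an encoding argument that defaults to 'utf8', so
passing encoding='cp1252' there may be the corresponding fix, but I have not
confirmed this on my files.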
--
https://mail.python.org/mailman/listinfo/python-list