New submission from Rob Malouf <rmal...@mail.sdsu.edu>:

Calling TextIOWrapper.tell() while reading the attached gb2312-encoded file 
like this:

with open('udhr-gb2312.txt', encoding='GB2312') as f:
    while True:
        line = f.readline()
        t = f.tell()
        if not line:
            break

gives this result:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    t = f.tell()
UnicodeDecodeError: 'gb2312' codec can't decode byte 0xb5 in position 0: illegal multibyte sequence

The file seems to be well-formed and can be read without any problem.  It's
only the call to tell() that raises an error.
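
One possible way to sidestep the failing call, offered only as a sketch and not as a fix for the underlying behaviour, is to read the file in binary mode, take positions from the binary file object, and decode each line by hand. This relies on GB2312 (as EUC-CN) using bytes 0xA1-0xFE for both halves of a two-byte character, so a 0x0A newline byte never falls inside a character. The helper name read_lines_with_offsets and the short sample text are hypothetical stand-ins for the attached file.

```python
import os
import tempfile


def read_lines_with_offsets(path, encoding="gb2312"):
    """Yield (byte_offset_after_line, decoded_line) pairs.

    Hypothetical helper: positions come from the binary file object,
    so TextIOWrapper.tell() is never called.
    """
    with open(path, "rb") as f:
        while True:
            raw = f.readline()
            if not raw:
                break
            # f.tell() on a binary file is a plain byte offset.
            yield f.tell(), raw.decode(encoding)


# Small stand-in for the attached file (assumption: any GB2312 text works here).
sample = "世界人权宣言\n序言\n"
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(sample.encode("gb2312"))
    sample_path = tmp.name

pairs = list(read_lines_with_offsets(sample_path))
os.remove(sample_path)

for offset, text in pairs:
    print(offset, repr(text))
```

The offsets produced this way are byte positions that can be passed to seek() on a file opened in binary mode. They are not interchangeable with the opaque values that TextIOWrapper.tell() returns.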

----------
components: IO, Unicode
files: udhr-gb2312.txt
messages: 367494
nosy: ezio.melotti, rmalouf, vstinner
priority: normal
severity: normal
status: open
title: Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError
type: crash
versions: Python 3.7
Added file: https://bugs.python.org/file49096/udhr-gb2312.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40416>
_______________________________________