[issue40416] Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError

2020-05-02 Thread Rob Malouf
Rob Malouf added the comment: Same results on MacOS 10.15.4 (both the system python and the intel/anaconda version) and on CentOS 7.8. Here's the output with print(...): 13 71 72 392 393 399 536 537 761 762 879 880 933 934 1146 1147 1254 1255 1359 1360 1760 1761 1772 1895 1897 1906 2105

[issue40416] Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError

2020-04-27 Thread Rob Malouf
New submission from Rob Malouf: Calling TextIOWrapper.tell() while reading the attached gb2312-encoded file like this:

    with open('udhr-gb2312.txt', encoding='GB2312') as f:
        while True:
            line = f.readline()
            t = f.tell()
            if not line:
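
For readers who only need line offsets, one possible way to sidestep text-mode tell() is to read the file in binary mode and decode each line by hand. The sketch below is not from the report: the helper name read_lines_with_offsets and the sample text are hypothetical, and an in-memory io.BytesIO stands in for the attached file. It relies on the fact that in GB2312 a newline byte never occurs inside a multibyte character, so splitting on b'\n' before decoding is safe.

```python
import io


def read_lines_with_offsets(raw, encoding="gb2312"):
    """Return (decoded_line, byte_offset_after_line) pairs.

    `raw` must be a binary stream; tell() on a binary stream is a plain
    byte offset, so no decoder state has to be reconstructed.
    Hypothetical helper, for illustration only.
    """
    results = []
    while True:
        line = raw.readline()  # bytes, up to and including b"\n"
        if not line:
            break
        results.append((line.decode(encoding), raw.tell()))
    return results


# Stand-in for the real file: two short lines encoded as GB2312.
data = "你好\n世界\n".encode("gb2312")
pairs = read_lines_with_offsets(io.BytesIO(data))

for text, offset in pairs:
    print(repr(text), offset)
```

With a real file, the same helper could be called on open(path, 'rb'). The offsets it returns are byte positions, which can be passed to seek() on a binary stream but are not interchangeable with the opaque values returned by text-mode tell().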

[issue28340] [py2] TextIOWrapper.tell extremely slow

2017-05-22 Thread Rob Malouf
Changes by Rob Malouf: pull_requests: +1832. Python tracker <http://bugs.python.org/issue28340>

[issue28340] TextIOWrapper.tell extremely slow

2016-10-02 Thread Rob Malouf
New submission from Rob Malouf: io.TextIOWrapper.tell() is unusably slow in Python 2.7. This same problem was introduced in Python 3 and fixed in Python 3.3 (see Issue # 4). Any chance of getting the fix backported into the Python 2.7 library? It would make it much easier to modernize

[issue25535] collections.Counter methods return Counter objects

2015-11-02 Thread Rob Malouf
New submission from Rob Malouf: Several collections.Counter methods return Counter objects, which leads to wrong or at least confusing behavior when Counter is subclassed. For example, nltk.FreqDist is a subclass of Counter: >>> x = nltk.FreqDist(['a',
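
The behavior described here can be reproduced without nltk. The following is a minimal sketch, assuming a CPython whose Counter.__add__ builds its result with a plain Counter(), which is how the standard library implements it as far as I know. TaggedCounter and its top() method are hypothetical stand-ins for a subclass such as nltk.FreqDist.

```python
from collections import Counter


class TaggedCounter(Counter):
    """Hypothetical Counter subclass, standing in for nltk.FreqDist."""

    def top(self):
        # Extra method that only exists on the subclass.
        return self.most_common(1)[0][0]


a = TaggedCounter("aab")  # counts: a=2, b=1
b = TaggedCounter("abc")  # counts: a=1, b=1, c=1

total = a + b      # binary operator: result is built as a plain Counter
copied = a.copy()  # copy() goes through self.__class__, so the subclass survives

print(type(total).__name__)
print(type(copied).__name__)
print(hasattr(total, "top"))
```

The counts in total are correct, but any method defined only on the subclass is gone after the addition, which is the kind of surprise the report describes.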