Hello,

Using current CVS Python on Linux, I observe this weird
behavior of the readline() method on file-like objects
returned from the codecs module:

[EMAIL PROTECTED] ypage]$ cat testfile1.txt
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
offending line: ladfj askldfj klasdj fskla dfzaskdj fasklfj laskd fjasklfzzzzaa%whereisthis!!!
next line.
[EMAIL PROTECTED] ypage]$ cat testfile2.txt
aaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbb
stillokay:bbbbxx
broken!!!!badbad
againokay.
[EMAIL PROTECTED] ypage]$ cat bug.py
import codecs
for name in ("testfile1.txt", "testfile2.txt"):
    f = codecs.open(name, encoding="iso-8859-1")  # precise encoding doesn't matter
    print "----", name, "----"
    for line in f:
        print "LINE:" + repr(line)
[EMAIL PROTECTED] ypage]$ python25 bug.py
---- testfile1.txt ----
LINE:u'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy\r\n'
LINE:u'offendi'
LINE:u'ng line: ladfj askldfj klasdj fskla dfzaskdj fasklfj laskd fjasklfzzzzaa'
LINE:u'%whereisthis!!!\r\n'
LINE:u'next line.\r\n'
---- testfile2.txt ----
LINE:u'aaaaaaaaaaaaaaaaaaaaaaaa\n'
LINE:u'bbbbbbbbbbbbbbbbbbbbbbbb\n'
LINE:u'stillokay:bbbbxx\n'
LINE:u'broke'
LINE:u'n!!!!badbad\n'
LINE:u'againokay.\n'
[EMAIL PROTECTED] ypage]$



See how it breaks certain lines in half? It only happens when the file is opened with an encoding through codecs.open() (which encoding does not seem to matter); regular file objects behave as expected. Also, readlines() works fine on the same files.

Python 2.3.4 and Python 2.4 do not have this problem.
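
In case it helps to narrow this down, here is a rough sketch of a self-contained check that does not depend on my test files. It writes the contents of testfile2.txt to a temporary file, reads it back once by iterating (which goes through readline()) and once with readlines(), and returns both results for comparison. The names SAMPLE, read_both_ways and check_sample are made up for this illustration, and it assumes that readlines() is a trustworthy reference, which is only what I observed above.

```python
import codecs
import os
import tempfile

# Contents of testfile2.txt from the transcript above (LF line endings).
SAMPLE = (b"aaaaaaaaaaaaaaaaaaaaaaaa\n"
          b"bbbbbbbbbbbbbbbbbbbbbbbb\n"
          b"stillokay:bbbbxx\n"
          b"broken!!!!badbad\n"
          b"againokay.\n")


def read_both_ways(path, encoding="iso-8859-1"):
    """Return (lines from iteration, lines from readlines()) for one file."""
    f = codecs.open(path, encoding=encoding)
    try:
        # Iterating the codecs reader fetches one line at a time via readline().
        iterated = [line for line in f]
    finally:
        f.close()

    f = codecs.open(path, encoding=encoding)
    try:
        # readlines() decodes everything and splits it in one go.
        whole = f.readlines()
    finally:
        f.close()
    return iterated, whole


def check_sample(data=SAMPLE):
    """Write data to a temporary file and read it back both ways."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        out = os.fdopen(fd, "wb")
        try:
            out.write(data)
        finally:
            out.close()
        return read_both_ways(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    iterated, whole = check_sample()
    if iterated == whole:
        print("iteration and readlines() agree")
    else:
        print("MISMATCH: %r != %r" % (iterated, whole))
```

On an interpreter without the problem the two lists should be identical; on the affected build I would expect the iterated list to contain the extra, split entries shown in the output above.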

Am I missing something or is this a bug? Thanks!

--Irmen
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
