On 9/29/07, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 07:33 AM 9/29/2007 -0700, Guido van Rossum wrote:
> >Until just before 3.0a1, they were unequal. We decided to raise
> >TypeError because we noticed many bugs in code that was doing things
> >like
> >
> >    data = f.read(4096)
> >    if data == "": break
>
> Thought experiment: what if read() always returned strings, and to
> read bytes, you had to use something like 'f.readinto(ob, 4096)',
> where 'ob' is a mutable bytes instance or memory view?
>
> In Python 2.x, there's only one read() method because (prior to
> unicode), there was only one type of reading to do.
>
> But as the above example makes clear, in 3.x you simply *can't* write
> code that works correctly with an arbitrary file that might be binary
> or text, at least not without typechecking the return value from
> read(). (In which case, you might as well inspect the file
> object.) So, the above problem could be fixed by having .read()
> raise an error (or simply not exist) on a binary file object.
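The readinto() idea in the quoted thought experiment can be tried with the io module as it exists today. This is only a sketch under a few assumptions: the real readinto() takes just the buffer (the two-argument form in the quoted message is hypothetical), the stream is a blocking binary stream, and the helper name read_chunks and the chunk size are made up for illustration. The end-of-stream test compares an integer count, so it never raises the bytes-versus-str question.

```python
import io

def read_chunks(stream, chunk_size=4096):
    """Yield successive chunks from a binary stream using readinto()."""
    buf = bytearray(chunk_size)       # mutable bytes buffer, reused on each pass
    view = memoryview(buf)
    while True:
        n = stream.readinto(buf)      # number of bytes written into buf
        if n == 0:                    # integer test; no str/bytes comparison
            break
        yield bytes(view[:n])         # copy out only the bytes actually read

# Example with an in-memory binary stream and a deliberately small chunk size.
chunks = list(read_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
```

Text streams such as io.StringIO do not provide readinto() at all, so passing one to read_chunks fails at the call rather than at a later comparison.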

Perhaps write

    if len(data) == 0: break

since that's what you really mean. Any other code that compares the
result of read() to either a bytes or a str really is taking a text or
binary file object specifically and not working on an arbitrary file.

> In this way, the problem is fixed at the point where it really
> occurs: i.e., at the point of not having decided whether the stream
> is bytes or text.
>
> This also seems to fit better (IMO) with the best practice of
> enforcing str/unicode/encoding distinctions at the point where data
> enters the program, rather than delaying the error to later.
>
>
> >I thought about using warning too, but since nobody wants warnings,
> >that would be pretty much the same as raising TypeError except for the
> >most dedicated individuals (and if I were really dedicated I'd just
> >write my own eq() function anyway).
>
> The use case I'm concerned about is code that's not type-specific
> getting a TypeError by comparing arbitrary objects. For example, if
> you write Python code to create a Python code object (e.g. the
> compiler package or my own BytecodeAssembler), you need to create a
> list of constants as you generate the code, and you need to be able
> to search the list for an equal constant. Since strings and bytes
> can both be constants, a simple list.index() test could now raise a
> TypeError, as could "item in list".
>
> So raising an error to make bad code fail sooner, will also take down
> unsuspecting code that isn't really broken, and *force* the writing
> of special comparison code -- which won't be usable with things like
> list.remove and the "in" operator.
>
> In comparison, forcing code to be bytes vs. text aware at the point
> of I/O directs attention to the place where you can best decide what
> to do about it. (After all, the comparison that raises the TypeError
> might occur deep in a library that's expecting to work with text.)
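For the constant-list case described in the quoted text, one way to avoid ever comparing a str with a bytes is to check the type before the value. This is a minimal sketch: find_constant and add_constant are hypothetical helpers, not part of the compiler package or BytecodeAssembler. As the quoted message points out, it is exactly the kind of special comparison code that cannot be plugged into list.index(), list.remove() or the "in" operator.

```python
def find_constant(constants, value):
    """Return the index of value in constants, or -1 if it is absent.

    Types are compared first, so a str is never compared with a bytes.
    """
    for i, const in enumerate(constants):
        if type(const) is type(value) and const == value:
            return i
    return -1

def add_constant(constants, value):
    """Return the index of value, appending it if it is not present yet."""
    i = find_constant(constants, value)
    if i < 0:
        constants.append(value)
        i = len(constants) - 1
    return i

# A str and a bytes with the same characters get separate slots;
# asking for the str again reuses its existing slot.
consts = []
text_index = add_constant(consts, "abc")
bytes_index = add_constant(consts, b"abc")
again_index = add_constant(consts, "abc")
```

The type check also keeps 1, 1.0 and True in separate slots, which is usually what a table of code-object constants wants anyway.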
>
> >And the warning would do nothing
> >about the issue brought up by Jim Jewett, the unpredictable behavior
> >of a dict with both bytes and strings as keys.
>
> I've looked at all of Jim's messages for September, but I don't see
> this. I do see where raising TypeError for comparisons causes a
> problem with dictionaries, but I don't see how an unequal comparison
> creates "unpredictable" behavior (as opposed to predictable failure
> to match).
>
> _______________________________________________
> Python-3000 mailing list
> Python-3000@python.org
> http://mail.python.org/mailman/listinfo/python-3000
> Unsubscribe:
> http://mail.python.org/mailman/options/python-3000/jyasskin%40gmail.com
>

-- 
Namasté,
Jeffrey Yasskin
http://jeffrey.yasskin.info/

"Religion is an improper response to the Divine." -- "Skinny Legs and
All", by Tom Robbins