Volker Grabsch wrote:
> Bruno Desthuilliers <[EMAIL PROTECTED]> wrote:
>
>>Carl Banks wrote:
>>
>>>Bruno Desthuilliers wrote:
>>>
>>>I'm well aware of Python's semantics, and it's irrelevant to my
>>>argument.
>
> [...]
>
>>>If the language
>>>were designed differently, then the rules would be different.
>>
>>Totally true - and totally irrelevant IMHO.
>
>
> I strongly advise not to treat each other's thoughts as irrelevant.
> Assuming the opposite is a basis of every public discussion forum.
"Irrelevant" may not be the best expression of my thought here - it's just that Carl's assertion is kind of a tautology and doesn't add anything to the discussion. If Python had been designed as statically typed (with declarative typing), the rules would be different. Yeah, great. And now ? > I assume here is a flaw in Python. To explain this, I'd like to > make Bruno's point Actually Carl's point, not mine. > clearer. As usually, code tells more then > thousand words (an vice versa :-)). > > Suppose you have two functions which somehow depend on the emptyness > of a sequence. This is a stupid example, but it demonstrates at > least the two proposed programming styles: > > ------------------------------------------------------ > >>>>def test1(x): > > ... if x: > ... print "Non-Empty" > ... else: > ... print "Empty" > ... > >>>>def test2(x): > > ... if len(x) > 0: > ... print "Non-Empty" > ... else: > ... print "Empty" > ------------------------------------------------------ > > Bruno Carl > pointed out a subtle difference in the behaviour of those > functions: > > ------------------------------------------------------ > >>>>a = [] >>>>test1(a) > > Empty > >>>>test1(iter(a)) > > Non-Empty > >>>>test2(a) > > Empty > >>>>test2(iter(a)) > > Traceback (most recent call last): > File "<stdin>", line 1, in ? > File "<stdin>", line 2, in test2 > TypeError: len() of unsized object > ------------------------------------------------------ > > > While test1() returns a wrong/random result when called with an > iterator, the test2() function breaks when beeing called wrongly. Have you tried these functions with a numpy array ? > So if you accidently call test1() with an iterator, the program > will do something unintended, and the source of that bug will be > hard to find. So Bruno is IMHO right in calling that the source > of a suptle bug. Actually it's Carl who makes that point - MHO being that it's a programmer error to call a function with a param of the wrong type. 

> However, if you call test2() with an iterator, the program will
> cleanly break early enough with an exception. That is generally
> wanted in Python. You can see this all over the language, e.g.
> with dictionaries:
>
> ------------------------------------------------------
> >>> d = { 'one': 1 }
> >>> print d['one']
> 1
> >>> print d['two']
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: 'two'
> ------------------------------------------------------
>
> Python could have been designed to return None when d['two'] has been
> called, as some other (bad) programming languages would. This would
> mean that the problem will occur later in the program, making it easy
> to produce a subtle bug. It would be some effort to figure out the
> real cause, i.e. that d had no entry for 'two'.

I don't think the comparison is right. The equivalent situation would
be to have a function trying to access d['two'] on a dict-like type
that would return a default value instead of raising a KeyError.

> Luckily, Python throws an exception (KeyError) just at the original
> place where the initial mistake occurred. If you *want* to get None in
> case of a missing key, you'll have to say this explicitly:
>
> ------------------------------------------------------
> >>> print d.get('two', None)
> None
> ------------------------------------------------------
>
> So maybe "bool()" should also break with an exception if an object
> has neither a __nonzero__ nor a __len__ method, instead of defaulting
> to True.

FWIW, Carl's main example is with numpy arrays, which have *both*
methods - __nonzero__ raising an exception.

> Or a more strict variant of bool() called nonempty() should
> exist.
>
> Iterators don't have a meaningful Boolean representation,
> because
> phrases like "is zero" or "is empty" don't make sense for them.

If so, almost no type actually has a "meaningful" boolean value.
I'd rather say that iterators being unsized, the mere concept of an
"empty" iterator has no meaning.

> So
> instead of answering "false", an iterator should throw an exception
> when being asked whether it's empty.
>
> If a function expects an object to have a certain protocol (e.g.
> sequence), and the given object doesn't support that protocol,
> an exception should be raised.

So you advocate static typing? Note that numpy arrays actually have
both __len__ and __nonzero__ defined, the second being defined to
forbid boolean coercion...

> This usually happens automatically
> when the function calls a non-existing method, and it plays very
> well with duck typing.
>
> test2() behaves that way, but test1() doesn't. The reason is a
> sloppiness of Python. Python should handle that problem as strictly
> as it handles a missing key in a dictionary. Unfortunately, it
> doesn't.

Then proceed to write a PEP proposing that evaluating the truth value
of an iterator would raise a TypeError. Just like numpy arrays do - as
a decision of its authors.

> I don't agree with Bruno

s/bruno/Carl/

> that it's more natural to write
>     if len(a) > 0:
>         ...
> instead of
>     if a:
>         ...
>
> But I think that this is a necessary kludge you need to write
> clean code. Otherwise you risk to create subtle bugs.

s/you risk to create/careless programmers will have to face/

And FWIW, this is clearly not the opinion of numpy authors, who state
that having len > 0 doesn't mean the array is "not empty"...

> This advice,
> however, only applies when your function wants a sequence, because
> only in that case can you expect "len(a)" to work.

Since sequence types are defined as having a False value when empty,
this test is redundant *and* "will create subtle bugs" when applied to
a numpy array.

> I also agree with Carl that "if len(a) > 0" is less universal than
> "if a", because the latter also works with container-like objects
> that have a concept of emptiness,

s/emptiness/boolean value/

> but not of length.
> However, this case is less likely to happen than shooting yourself
> in the foot by accidentally passing an iterator to the function
> without getting an exception. I think this flaw in Python is deep
> enough to justify the "len() > 0" kludge.

It surely justifies some thinking on the boolean value of iterators.
Since the common idiom for testing non-None objects is an explicit
identity test against None - which makes sense since empty sequences
and zero numerics eval to False in a boolean context - the less
inappropriate solution would be to have iterators implement
__nonzero__ the way numpy arrays do, that is, by raising an exception.

>
> IMHO, that flaw of Python should be documented in a PEP as it violates
> Python's principle of being explicit.

Here again, while I agree that there's room for improvement, I don't
agree on this behaviour being a "flaw" - "minor wart" would better
describe the situation IMHO.

-- 
http://mail.python.org/mailman/listinfo/python-list
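
Postscript, as a rough sketch of the __nonzero__ suggestion made in
the reply above: an iterator that refuses truth-value testing, while
the explicit identity test against None keeps working. It assumes
Python 3, where the hook is spelled __bool__. StrictIterator and
is_missing are names invented for this illustration; nothing like them
exists in the standard library, and built-in iterators do not behave
this way.

```python
class StrictIterator:
    """Hypothetical wrapper: iterates normally, refuses bool()."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __bool__(self):  # __nonzero__ in Python 2
        raise TypeError("an iterator has no truth value")


def is_missing(obj):
    """Identity test against None; it never calls __bool__."""
    return obj is None


if __name__ == "__main__":
    it = StrictIterator([1, 2, 3])
    print(is_missing(it))   # the identity test is unaffected
    print(list(it))         # iteration is unaffected
    try:
        if it:              # the 'if x:' style now fails loudly
            pass
    except TypeError as exc:
        print("TypeError:", exc)
```

The design choice mirrors the numpy one discussed in the thread: the
object stays usable for what it is meant for, and only the ambiguous
question is rejected.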