Chris Angelico wrote:

> On Tue, Aug 2, 2011 at 9:23 PM, Terry Reedy <tjre...@udel.edu> wrote:
>> On 8/2/2011 12:55 PM, Chris Angelico wrote:
>>>
>>> As I understand it, Python exclusively late-binds names; when you
>>> define a function, nothing is ever pre-bound.
>>
>> By 'pre-bound' you presumably mean bound at definition time rather than
>> call time. Default arg objects *are* pre-computed and pre-bound to
>> internal slots at definition time.
>
> Of course; that's a different issue altogether. No, I'm talking about
> the way a tight loop will involve repeated lookups for the same name.
It's not really a different issue. The "standard" approach for performing
early binding in Python is by binding a global name to a local name at
function definition time.

In CPython at least, local lookups are faster than globals: locals are
stored in a fixed array, and the function knows the numeric offset of
each variable. Non-locals have to search multiple namespaces (globals
plus built-ins), using a hash table. That's fast, but locals are faster.

(Aside: this is why locals() is not generally writable: the dict is a
copy of that internal array. You can write to the dict, but the changes
don't propagate back to the actual local table of the function.)

So a good way to speed up name lookups is to turn a global into a local,
and one common way to do that is to do it when the function is defined.
For example, you will see this in the random module:

    def randrange(self, start, stop=None, step=1, int=int, default=None,
                  maxwidth=1<<BPF):
        """Choose a random item from range(start, stop[, step]).

        This fixes the problem with randint() which includes the
        endpoint; in Python this is usually not what you want.
        Do not supply the 'int', 'default', and 'maxwidth' arguments.
        """

Note the int=int parameter.

> Unfortunately, there is no way - by definition - to guarantee that a
> binding won't change. Even in the example of getting the lengths of
> lines in a file, it's entirely possible for __len__ to rebind the
> global name "len" - so you can't rely on the repeated callings of
> len() to be calling the same function.
>
> But who WOULD do that? It's somewhat ridiculous to consider, and
> there's a huge amount of code out there that does these repeated calls
> and does not do stupid rebindings in the middle. So Python permits
> crazy behaviour at the cost of the performance of normal behaviour.
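The aside about locals() is easy to demonstrate. This is my own minimal
sketch (the name demo is mine, not from the thread), and the behaviour
described is specific to CPython:

```python
# In CPython, locals() inside a function returns a snapshot dict built
# from the fast-locals array. Writing to that dict mutates only the
# snapshot; the real local variable is untouched.
def demo():
    x = 1
    locals()['x'] = 99  # changes the copy, not the actual local
    return x            # still 1 under CPython

result = demo()
```

At module level the story is different: there, locals() *is* the globals
dict, so writes do take effect.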
This is a common criticism of Python, and, truth be told, it is right:
Python makes the crazy monkey-patching we're told not to do easy, at the
expense of allowing the compiler to optimise really common things.

But are you *sure* that name lookups are *really* the bottleneck? Name
lookups are pretty fast. If you want them faster, turn them into a local
variable. It's not clear to me that syntax, a pragma directive, or some
other form of magic is better than an explicit assignment:

    def func(x):
        _len = len  # bind at function call time
        for i in x:
            _len(i)  # lookups of _len will be faster than len

The argument that Python would be faster if it did early binding is not
a given. For trivial code that doesn't do much, I dare say it might
shave off a significant percentage, but trivial code is already pretty
fast. Who cares if you speed up a trivial operation? In non-trivial code
that does real work, it's not obvious that early binding will result in
meaningful time savings. ("Yay, my code now runs in 17.1 milliseconds
instead of 17.2!") The onus is on the person proposing this change to
prove that the benefit is worth the extra complexity.

Knowing what we know now, I'm not sure whether Guido would have kept the
late binding semantics for Python, or privileged builtins in some
fashion. I suspect that he wouldn't have changed a thing, but if he did,
I suspect he'd be more concerned about accidental shadowing of builtins
than about optimizations. E.g. people say list = [1, 2, 3], and then
later try to call list(something). Or maybe that's just me :)

Guido is generally very, very suspicious of proposed optimisations. They
tend to increase the code's complexity by far more than they increase
its speed. (E.g. an order of magnitude more complex for a 10% speed
increase.) And they often subtly break the semantics of the code. E.g.
if you do floating point maths, optimizing compilers that assume that
x + y - x equals y are BROKEN.
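To make the explicit-assignment idiom concrete, here is a complete
sketch of my own (total_lengths is a hypothetical name, not from the
thread): the builtin is snapshotted into a local once per call, so every
lookup in the loop is a fast local lookup in CPython, and the function's
behaviour is fixed for the duration of the call even if the global name
is rebound mid-run.

```python
def total_lengths(items):
    _len = len  # bind the builtin to a local, once, at call time
    # Inside the loop, _len is looked up in the fast-locals array
    # rather than via the globals/builtins hash tables.
    return [_len(item) for item in items]

lengths = total_lengths(["a", "bb", "ccc"])  # [1, 2, 3]
```

Whether this actually matters for your code is exactly the question
raised above; the timeit module is the usual way to find out before
bothering.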
> With the local-variable-snapshot technique ("len = len"), can anything
> be optimized, since the parser can guarantee that nothing ever
> reassigns to it?

Very likely. But the optimization happens at runtime, not at compile
time. Default values for parameters are stored as an attribute of the
function at definition time. So if you have this:

    def f(x, len=len):
        return len(x)

then f gets a publicly accessible attribute func_defaults:

    >>> f.func_defaults
    (<built-in function len>,)

Even if you re-bind len, that doesn't change the binding here, and the
*local* variable len will be assigned that value when f is called.
Here's a practical example:

    >>> def test(len=len):
    ...     while True:
    ...         yield len
    ...
    >>> it = test()
    >>> next(it)
    <built-in function len>
    >>> len = None  # rebind the global len
    >>> next(it)
    <built-in function len>
    >>> test.func_defaults = (None, )
    >>> next(it)
    <built-in function len>

But note that now that I've replaced the default value, the next time I
call test() I'll get len=None.

[...]

> So... Would this potentially produce wrong results? Would it be of any
> use, or would its benefit be only valued in times when the whole
> function needs to be redone in C?

You really would need to demonstrate that the bottleneck in useful code
(as opposed to toy examples) was the name lookups.

-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
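[Editorial postscript, not part of the original message: the sessions
above are Python 2, where the attribute is spelled func_defaults; in
Python 3 it is spelled __defaults__, but the behaviour is identical. A
sketch of the same experiment as a Python 3 script:]

```python
import builtins

def test(len=len):            # default evaluated and stored at definition time
    while True:
        yield len

it = test()
assert next(it) is builtins.len   # the snapshot of the builtin

len = None                        # rebind the global name...
assert next(it) is builtins.len   # ...the running generator is unaffected

test.__defaults__ = (None,)       # Python 3 spelling of func_defaults
assert next(it) is builtins.len   # still unaffected: the local is already bound

it2 = test()                      # but a fresh call picks up the new default
assert next(it2) is None
```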