Re: [Python-Dev] Idea: Dictionary references
On Thu, Dec 17, 2015 at 09:30:24AM -0800, Andrew Barnert via Python-Dev wrote:
> On Dec 17, 2015, at 07:38, Franklin? Lee wrote:
> >
> > The nested dictionaries are only for nested scopes (and inner
> > functions don't create nested scopes). Nested scopes will already
> > require multiple lookups in parents.
>
> I think I understand what you're getting at here, but it's a really
> confusing use of terminology. In Python, and in programming in
> general, nested scopes refer to exactly inner functions (and classes)
> being lexically nested and doing lookup through outer scopes. The fact
> that this is optimized at compile time to FAST vs. CELL vs.
> GLOBAL/NAME, cells are optimized at function-creation time, and only
> global and name have to be resolved at the last second doesn't mean
> that there's no scoping, or some other form of scoping besides
> lexical. The actual semantics are LEGB, even if L vs. E vs. GB and E
> vs. further-out E can be optimized.

In Python 2, the LOAD_NAME byte-code can return a local, even though it
normally doesn't:

py> x = "global"
py> def spam():
...     exec "x = 'local'"
...     print x
...
py> spam()
local
py> x == 'global'
True

If we look at the byte-code, we see that the lookup is *not* optimized
to inspect locals only (LOAD_FAST), but uses the regular LOAD_NAME that
normally gets used for globals and builtins:

py> import dis
py> dis.dis(spam)
  2           0 LOAD_CONST               1 ("x = 'local'")
              3 LOAD_CONST               0 (None)
              6 DUP_TOP
              7 EXEC_STMT

  3           8 LOAD_NAME                0 (x)
             11 PRINT_ITEM
             12 PRINT_NEWLINE
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE

> What you're talking about here is global lookups falling back to
> builtin lookups. There's no more general notion of nesting or scoping
> involved, so why use those words?

I'm not quite sure about this. In principle, every name lookup looks in
four scopes, LEGB as you describe above:

- locals
- non-locals, a.k.a. enclosing or lexical scope(s)
- globals (i.e. the module)
- builtins

although Python can (usually?) optimise away some of those lookups. The
relationship of locals to enclosing scopes, and of those to globals in
turn, involves actual nesting of indented blocks in Python, but that's
not necessarily the case. One might imagine a hypothetical capability
for manipulating scopes directly, e.g.:

def spam(): ...

def ham(): ...

set_enclosing(ham, spam)
# like:
# def spam():
#     def ham(): ...

The adventurous or fool-hardy can probably do something like that now
with byte-code hacking :-)

Likewise, one might consider that builtins is a scope which in some
sense encloses the global scope. Consider it a virtual code block that
is outdented from the top-level scope :-)

> So, trying to generalize global vs. builtin to a general notion of
> "nested scope" that isn't necessary for builtins and doesn't work for
> anything else seems like overcomplicating things for no benefit.

Well, putting aside the question of whether this is useful or not, and
putting aside efficiency concerns, let's just imagine a hypothetical
implementation where name lookups used ChainMaps instead of using
separate LOAD_* lookups of special dicts. Then a function could set up
a ChainMap:

function.__scopes__ = ChainMap(locals, enclosing, globals, builtins)

and a name lookup for (say) "x" would always be a simple:

function.__scopes__["x"]

Of course this would be harder to optimize, and hence probably slower,
than the current arrangement, but I think it would allow some
interesting experiments with scoping rules:

ChainMap(locals, enclosing, globals, application_globals, builtins)

You could implement dynamic scoping by inserting the caller's
__scopes__ ChainMap at the front of the called function's ChainMap. And
attribute lookups would be something like this simplified scope:

ChainMap(self.__dict__, type(self).__dict__)

to say nothing of combinations of the two. So I think there's something
interesting here: even if we don't want to use it in production code,
it would make for some nice experiments.
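The hypothetical __scopes__ lookup described above can be tried today with collections.ChainMap; the namespace dicts here are illustrative stand-ins for the real L, E, G, and B namespaces:

```python
from collections import ChainMap

builtins_ns = {'len': len}
globals_ns  = {'x': 'global-x'}
enclosing   = {'y': 'enclosing-y'}
locals_ns   = {'z': 'local-z'}

# The hypothetical function.__scopes__: first map wins, like LEGB.
scopes = ChainMap(locals_ns, enclosing, globals_ns, builtins_ns)

assert scopes['z'] == 'local-z'    # found in locals first
assert scopes['x'] == 'global-x'   # falls through to globals
assert scopes['len'] is len        # falls through to builtins

# Dynamic scoping: splice the caller's maps in front of the callee's.
dynamic = ChainMap({'x': 'caller-x'}, *scopes.maps)
assert dynamic['x'] == 'caller-x'
```

Since ChainMap keeps live references to the underlying dicts, a later write to globals_ns is visible on the next lookup, which matches Python's actual global-lookup semantics.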
-- Steve ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Idea: Dictionary references
> On Dec 18, 2015, at 04:56, Steven D'Aprano wrote:
>
>> On Thu, Dec 17, 2015 at 09:30:24AM -0800, Andrew Barnert via Python-Dev wrote:
>>> On Dec 17, 2015, at 07:38, Franklin? Lee wrote:
>>>
>>> The nested dictionaries are only for nested scopes (and inner
>>> functions don't create nested scopes). Nested scopes will already
>>> require multiple lookups in parents.
>>
>> I think I understand what you're getting at here, but it's a really
>> confusing use of terminology. In Python, and in programming in
>> general, nested scopes refer to exactly inner functions (and classes)
>> being lexically nested and doing lookup through outer scopes. The fact
>> that this is optimized at compile time to FAST vs. CELL vs.
>> GLOBAL/NAME, cells are optimized at function-creation time, and only
>> global and name have to be resolved at the last second doesn't mean
>> that there's no scoping, or some other form of scoping besides
>> lexical. The actual semantics are LEGB, even if L vs. E vs. GB and E
>> vs. further-out E can be optimized.
>
> In Python 2, the LOAD_NAME byte-code can return a local, even though it
> normally doesn't:
>
> py> x = "global"
> py> def spam():
> ...     exec "x = 'local'"
> ...     print x
> ...
> py> spam()
> local
> py> x == 'global'
> True
>
> If we look at the byte-code, we see that the lookup is *not* optimized
> to inspect locals only (LOAD_FAST), but uses the regular LOAD_NAME that
> normally gets used for globals and builtins:
>
> py> import dis
> py> dis.dis(spam)
>   2           0 LOAD_CONST               1 ("x = 'local'")
>               3 LOAD_CONST               0 (None)
>               6 DUP_TOP
>               7 EXEC_STMT
>
>   3           8 LOAD_NAME                0 (x)
>              11 PRINT_ITEM
>              12 PRINT_NEWLINE
>              13 LOAD_CONST               0 (None)
>              16 RETURN_VALUE
>
>> What you're talking about here is global lookups falling back to
>> builtin lookups. There's no more general notion of nesting or scoping
>> involved, so why use those words?
>
> I'm not quite sure about this.
> In principle, every name lookup looks in four scopes, LEGB as you
> describe above:
>
> - locals
> - non-locals, a.k.a. enclosing or lexical scope(s)
> - globals (i.e. the module)
> - builtins
>
> although Python can (usually?) optimise away some of those lookups.

I think it kind of _has_ to optimize away, or at least tweak, some of
those things, rather than just acting as if globals and builtins were
just two more enclosing scopes. For example, global to builtins has to
go through globals()['__builtins__'], or act as if it does, or code
that relies on, say, the documented behavior of exec can be broken. And
you have to be able to modify the global scope after compile time and
have that modification be effective, which means you'd have to allow
the same things on locals and closures if they were to act the same.

> The relationship of locals to enclosing scopes, and to globals in
> turn, involve actual nesting of indented blocks in Python, but that's
> not necessarily the case. One might imagine a hypothetical capability
> for manipulating scopes directly, e.g.:
>
> def spam(): ...
>
> def ham(): ...
>
> set_enclosing(ham, spam)
> # like:
> # def spam():
> #     def ham(): ...

But that doesn't work; a closure has to link to a particular invocation
of its outer function, not just to the function. Consider a trivial
example:

def spam():
    x = time()

def ham():
    return x

set_enclosing(ham, spam)
ham()

There's no actual x value in scope. So you need something like this if
you want to actually be able to call it:

def spam(helper):
    x = time()
    helper = bind_closure(helper, sys._getframe())
    return helper()

def ham():
    return x

set_enclosing(ham, spam)
spam(ham)

Of course you could make that getframe implicit; the point is there has
to be a frame from an invocation of spam, not just the function itself,
to make lexical scoping (errr... dynamically-generated fake-lexical
scoping?) useful.
> The adventurous or fool-hardy can probably do something like that now
> with byte-code hacking :-)

Yeah; I actually played with something like this a few years ago. I did
it directly in terms of creating cell and free vars, not circumventing
the existing LEGB system, which means you have to modify not just ham,
but spam, in that set_enclosing. (In fact, you also have to modify all
functions lexically or faux-lexically enclosing spam or enclosed by
ham, which my code didn't do, but there were lots of other ways to fake
it...). You need a bit of ctypes.pythonapi, not just bytecode hacks, to
do the bind_closure() hack (the cell constructor isn't callable from
Python, and you can't even fake it with a wrapper around a cell because
cell_contents is immutable from Python...), but it's all doable.

Anyway, my original goal was to make it possible to get the
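The cell constructor wasn't callable from Python at the time of this thread, as noted above. From Python 3.8 onward, types.CellType is constructible, so a bind_closure()-style helper can be sketched in pure Python without ctypes (bind_closure itself is a hypothetical name from the discussion, not a real API):

```python
import types

def bind_closure(func, **bindings):
    # Rebuild func with fresh cells for its free variables.
    # types.CellType(value) is available from Python 3.8 onward.
    cells = tuple(types.CellType(bindings[name])
                  for name in func.__code__.co_freevars)
    return types.FunctionType(func.__code__, func.__globals__,
                              func.__name__, func.__defaults__, cells)

def outer():
    x = 42
    def inner():
        return x
    return inner

inner = outer()            # a real closure over one invocation of outer
rebound = bind_closure(inner, x=99)
assert inner() == 42       # the original cell is untouched
assert rebound() == 99     # same code object, new cell
```

This only rebinds an existing closure; it doesn't do the harder part the message describes, which is turning a non-closure into a closure (that requires rewriting the code object's name-access opcodes as well).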
Re: [Python-Dev] Idea: Dictionary references
On Fri, Dec 18, 2015 at 2:32 PM, Andrew Barnert via Python-Dev wrote:
> (Also, either way, it seems more like a thread for -ideas than -dev...)

I said this early on in this thread! Should I try to write up my idea
as a single thing, instead of a bunch of responses, and post it in
-ideas? Should I call them "parent scope" and "parent refcell"?

On Fri, Dec 18, 2015 at 7:56 AM, Steven D'Aprano wrote:
> I'm not quite sure about this. In principle, every name lookup looks in
> four scopes, LEGB as you describe above:
>
> - locals
> - non-locals, a.k.a. enclosing or lexical scope(s)
> - globals (i.e. the module)
> - builtins
>
> although Python can (usually?) optimise away some of those lookups. The
> relationship of locals to enclosing scopes, and to globals in turn,
> involve actual nesting of indented blocks in Python, but that's not
> necessarily the case.

As I understand it, L vs E vs GB is known at compile-time. That is,
your exec example doesn't work for me in Python 3, because all names
are scoped at compile-time.

x = 5
def f():
    exec('x = 111')
    print(x)

f()       # prints 5
print(x)  # prints 5

This means that my idea only really works for GB lookups.

> On Thu, Dec 17, 2015 at 09:30:24AM -0800, Andrew Barnert via Python-Dev wrote:
>> So, trying to generalize global vs. builtin to a general notion of
>> "nested scope" that isn't necessary for builtins and doesn't work for
>> anything else seems like overcomplicating things for no benefit.
>
> Well, putting aside the question of whether this is useful or not, and
> putting aside efficiency concerns, let's just imagine a hypothetical
> implementation where name lookups used ChainMaps instead of using
> separate LOAD_* lookups of special dicts.
> Then a function could set up a ChainMap:
>
> function.__scopes__ = ChainMap(locals, enclosing, globals, builtins)
>
> and a name lookup for (say) "x" would always be a simple:
>
> function.__scopes__["x"]
>
> Of course this would be harder to optimize, and hence probably slower,
> than the current arrangement,

This is where the ChainRefCell idea comes in. If a ChainRefCell is
empty, it would ask its parent dicts for a value. If it finds a value
in parent n, it would replace parent n with a refcell into parent n,
and similarly for parents 0, 1, ..., n-1. It won't need to do hash
lookups in those parents again, while still allowing those parents to
acquire names. (This means parent n+1 won't need to create refcells, so
we don't make unnecessary refcells in `object` and `__builtin__`.)

Unfortunately, classes are more complicated than nested scopes.

1. We skip MRO if we define classes as having their direct supers as
   parents. (Solution: Define classes as having all supers as parents,
   and make non-recursive RefCell.resolve() requests.) (Objects have
   their class as a parent, always.)
2. Classes can replace their bases. (I have some ideas for this, but
   see #3.)
3. I get the impression that attribute lookups are already pretty
   optimized.

On Fri, Dec 18, 2015 at 2:32 PM, Andrew Barnert via Python-Dev wrote:
> I think it kind of _has_ to optimize away, or at least tweak, some of
> those things, rather than just acting as if globals and builtins were
> just two more enclosing scopes. For example, global to builtins has to
> go through globals()['__builtins__'], or act as if it does, or code
> that relies on, say, the documented behavior of exec can be broken.

It would or could, in my idea of __builtins__ being a parent scope of
globals() (though I'm not sure whether it'd be the case for any other
kind of nesting).
Each refcell in globals() will hold a reference to __builtins__ (if it
hasn't yet successfully looked the name up) or to a refcell in
__builtins__ (if there was once a successful lookup). Since globals()
knows when globals()['__builtins__'] is modified, it can invalidate all
its refcells' parent cells (by making them hold references to the new
__builtins__). This will be O(len(table) + (# of refcells)), but
swapping out __builtins__ shouldn't be something you keep doing. Even
if it is a concern, I have More Ideas to remove the "len(table) +" term
(though with Raymond Hettinger's compact dicts, that wouldn't be
necessary).

It would be worse for classes, because it would require potentially
many notifications. (But it would also save future lookups. And I have
More Ideas.)

This idea (of the owner dict "knowing" about its changed parent) also
applies to general chained scopes, but flattenings like MRO would mess
it up. Again, though, More Ideas.

And more importantly, from what I understand of Victor's response, the
current implementation would probably be efficient enough, or more
efficient.

> And you have to be able to modify the global scope after compile time
> and have that modification be effective, which means you'd have to
> allow the same things on locals and closures if they were to act the
> same.

Not
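The refcell-to-parent chaining described here can be sketched in pure Python. This is a simplified model: unlike the proposal, this cell copies the parent's value instead of pointing into the parent's entry, so later writes to the parent would not propagate; it only shows how the first successful lookup replaces a dict reference with a cell reference:

```python
NULL = object()   # sentinel for "no value yet"

class RefCell:
    """Watches one name. `parent` is a plain dict until the first
    successful lookup there, after which it becomes another RefCell."""
    def __init__(self, key, parent=None):
        self.key = key
        self.value = NULL
        self.parent = parent

    def resolve(self):
        if self.value is not NULL:
            return self.value
        if isinstance(self.parent, RefCell):
            return self.parent.resolve()
        if isinstance(self.parent, dict) and self.key in self.parent:
            # First hit: replace the dict reference with a chained cell,
            # so later resolves skip the hash lookup entirely.
            cell = RefCell(self.key)
            cell.value = self.parent[self.key]
            self.parent = cell
            return cell.value
        raise KeyError(self.key)

builtins_ns = {'len': len}
cell = RefCell('len', parent=builtins_ns)   # e.g. a cell living in globals()
assert cell.resolve() is len                # one hash lookup in the parent...
assert isinstance(cell.parent, RefCell)     # ...and never again afterwards
```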
Re: [Python-Dev] Idea: Dictionary references
You can very easily implement this with version tags on the globals
dictionaries - that is, the dictionaries carry versions, and the guard
that checks whether everything is OK just checks the version tag on
globals. Generally speaking, such optimizations have been done in the
past (even in places like PyPy, but also in the literature), and as
soon as you have dynamic compilation (and FAT is a form of it), you can
do such tricks.

On Thu, Dec 17, 2015 at 3:48 PM, Steven D'Aprano wrote:
> On Thu, Dec 17, 2015 at 12:53:13PM +0100, Victor Stinner quoted:
>> 2015-12-17 11:54 GMT+01:00 Franklin? Lee:
>>> Each function keeps an indirect, automagically updated
>>> reference to the current value of the names they use,
>
> Isn't that a description of globals()? If you want to look up a name
> "spam", you grab an indirect reference to it:
>
> globals()["spam"]
>
> which returns the current value of the name "spam".
>
>>> and will never need to look things up again.[*]
>
> How will this work?
>
> Naively, it sounds to me like Franklin is suggesting that on every
> global assignment, the interpreter will have to touch every single
> function in the module to update that name. Something like this:
>
> # on a global assignment
> spam = 23
>
> # the interpreter must do something like this:
> for function in module.list_of_functions:
>     if "spam" in function.__code__.__global_names__:
>         function.__code__.__global_names__["spam"] = spam
>
> As I said, that's a very naive way to implement this. Unless you have
> something much cleverer, I think this will be horribly slow.
>
> And besides, you *still* need to deal with the case that the name isn't
> a global at all, but in the built-ins namespace.
> > > -- > Steve > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Idea: Dictionary references
On Thu, Dec 17, 2015 at 12:53:13PM +0100, Victor Stinner quoted:
> 2015-12-17 11:54 GMT+01:00 Franklin? Lee:
>> Each function keeps an indirect, automagically updated
>> reference to the current value of the names they use,

Isn't that a description of globals()? If you want to look up a name
"spam", you grab an indirect reference to it:

globals()["spam"]

which returns the current value of the name "spam".

>> and will never need to look things up again.[*]

How will this work?

Naively, it sounds to me like Franklin is suggesting that on every
global assignment, the interpreter will have to touch every single
function in the module to update that name. Something like this:

# on a global assignment
spam = 23

# the interpreter must do something like this:
for function in module.list_of_functions:
    if "spam" in function.__code__.__global_names__:
        function.__code__.__global_names__["spam"] = spam

As I said, that's a very naive way to implement this. Unless you have
something much cleverer, I think this will be horribly slow.

And besides, you *still* need to deal with the case that the name isn't
a global at all, but in the built-ins namespace.

--
Steve
Re: [Python-Dev] Idea: Dictionary references
(Previous thread was here, by the way:
https://mail.python.org/pipermail/python-dev/2015-December/142437.html)

On Thu, Dec 17, 2015 at 8:48 AM, Steven D'Aprano wrote:
> On Thu, Dec 17, 2015 at 12:53:13PM +0100, Victor Stinner quoted:
>> 2015-12-17 11:54 GMT+01:00 Franklin? Lee:
>>> Each function keeps an indirect, automagically updated
>>> reference to the current value of the names they use,
>
> Isn't that a description of globals()? If you want to look up a name
> "spam", you grab an indirect reference to it:
>
> globals()["spam"]
>
> which returns the current value of the name "spam".

The *current value*. I'm proposing that we instead do this:

spamref = globals().getref('spam')

Every time we want to find the current, updated value of 'spam', we
just do

spam = spamref.resolve()

which will skip the hash lookup and go directly to the value.

>>> and will never need to look things up again.[*]
>
> How will this work?
>
> Naively, it sounds to me like Franklin is suggesting that on every
> global assignment, the interpreter will have to touch every single
> function in the module to update that name.

A refcell holds a pointer to the location in the dict itself where the
value pointer is. When the value is updated in the dict, the refcell
will not need to be updated.

My original proposal wanted to keep cells in the "real" dict, and
update them. Like so (NULL is a sentinel meaning "no value"):

class RefCell:
    __slots__ = ['value']

class ScopeDict(dict):
    def __getitem__(self, key):
        value = dict.__getitem__(self, key).value  # may raise
        if value is NULL:
            raise KeyError(key)
        return value

    def __setitem__(self, key, value):
        if dict.__contains__(self, key):
            dict.__getitem__(self, key).value = value
        else:
            cell = RefCell()
            cell.value = value
            dict.__setitem__(self, key, cell)

    def __delitem__(self, key):
        cell = dict.__getitem__(self, key)  # may raise
        if cell.value is NULL:
            raise KeyError(key)
        cell.value = NULL

I realized later that this isn't necessary.
Most dict operations don't need to know about the indirection, so I
make the inner dict a normal dict (with a few more holes than normal).
But this shows how you can avoid manual updates for references.

> And besides, you *still* need to deal with the case that the name
> isn't a global at all, but in the built-ins namespace.

globals() to __builtin__ is a nesting relationship. At the bottom of
the following email, I have a pseudocode implementation which knows how
to deal with nested scopes.

https://mail.python.org/pipermail/python-dev/2015-December/142489.html
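The getref()/resolve() API described in the message can be sketched on top of the RefCell/ScopeDict pseudocode. The names follow the email; the details (using dict.setdefault so every write goes through the same cell) are illustrative, not the proposed C implementation:

```python
NULL = object()   # sentinel for "no value yet"

class RefCell:
    __slots__ = ('value',)
    def __init__(self):
        self.value = NULL
    def resolve(self):
        if self.value is NULL:
            raise KeyError('name has no value')
        return self.value

class ScopeDict(dict):
    """Stores RefCells as values; getref() hands the cell itself out."""
    def getref(self, key):
        # Creates an empty cell on first request, so a ref can exist
        # even before the name is ever assigned.
        return dict.setdefault(self, key, RefCell())
    def __getitem__(self, key):
        return dict.__getitem__(self, key).resolve()
    def __setitem__(self, key, value):
        self.getref(key).value = value

ns = ScopeDict()
ns['spam'] = 23
spamref = ns.getref('spam')      # the hash lookup happens once, here
assert spamref.resolve() == 23
ns['spam'] = 42                  # the write goes through the same cell
assert spamref.resolve() == 42   # no further hash lookups needed to see it
```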
Re: [Python-Dev] Idea: Dictionary references
2015-12-17 16:38 GMT+01:00 Franklin? Lee:
> Yeah, maybe it's out of the scope of bytecode optimization. But I
> think this would make guards unnecessary, since once a name is found,
> there's a quick way to refer to it.

FAT Python requires and supports various kinds of guards:

* type of a function argument
* dictionary key
* function (func.__code__)

I guess that you are talking about the dictionary key guards. In fact,
there are 4 subtypes of dict guards:

* guard on builtins.__dict__['key']
* guard on globals()['key']
* guard on MyClass.__dict__['key']
* guard on dict['key']

The implementation of the guard check currently relies on a "version"
(a global dict version, incremented at each change) to avoid a lookup
if possible. The guard stores a strong reference to the key and the
value. If the value is different, the guard check returns False.

I don't understand how you plan to avoid guards. The purpose of guards
is to respect the Python semantics by falling back to the "slow"
bytecode if something changes. So I don't understand your idea of
avoiding guards completely. Again, that's why I guess that it's
unrelated to FAT Python... Or is it just that your idea is an
alternative to dict versioning to get efficient guards on dict keys?

> Guards would also have an issue with nested scopes. You have a note
> on your website about it:
> (https://faster-cpython.readthedocs.org/fat_python.html#call-pure-builtins)
>
> "The optimization is disabled when the builtin function is
> modified or if a variable with the same name is added to the global
> namespace of the function."

FAT Python doesn't emit a specialized version if it requires a builtin
function, but a local variable with the same name is found. The check
is done in the current function but also in the parent namespaces, up
to the global namespace. I'm talking about the static analysis of the
code.
If the specialized version is built, a guard is created on the builtin
namespace and another on the global namespace.

Sorry, I don't understand the "problem" you are referring to. Can you
maybe show an example of code where FAT Python doesn't work, or where
your idea can help?

> RefCells "know" whether they're part of a living dict. (The dict marks
> them as such upon its death.) If they are not, they will decref their
> value upon their death. They do not hold a reference to their owner
> dict.

The dict contains the list of all of its "own" RefCell objects, right?

> If it's part of a living dict, we have a choice: the dict can be
> responsible for deletion, or the RefCell can be responsible for
> deletion. It doesn't really matter which design we go with.

I see your RefCell idea as being like dictionary views. Views are
linked to the dict: if the dict is modified, views are "updated" too.
It would be confusing to have a view disconnected from its container.

In short, a RefCell is a view on a single dict entry, to be able to
"watch" a dict entry without the cost of a lookup? And a RefCell can be
created even if the dict entry doesn't exist yet, right?

Hum, creating many RefCells introduces an overhead somewhere. For
example, deleting a key has to update the N RefCell objects linked to
this key, so del dict[x] takes O(number of RefCells), right?

> If a dict key is removed, the inner dict will still have the key
> (which is true in the current implementation), but the value will be
> decref'd and the value pointer will be NULL'd. The RefCell will not
> need to be updated, since (as part of a living dict) it's pointing at
> the pointer to the value, not the object itself.

Ok, so deleting a dict key always destroys the value, but the key may
stay alive longer than expected (until all RefCell objects are
destroyed). Usually, dict keys are constants, so keeping them alive
doesn't matter much: it's rare to have large keys.
> If it is detached from its owner dict (due to owner death), it will
> own a strong reference to the value. This is necessarily the case,
> since things that have a (strong) reference to the RefCell expect to
> find the value (or lack of a value) there.

I don't understand this part. You said that deleting a key destroys
the value. Destroying a dict means clearing all keys, and so destroying
all values. No? What is the use case of having a RefCell that is no
longer connected to its dict?

Victor
Re: [Python-Dev] Idea: Dictionary references
On Thu, Dec 17, 2015 at 6:53 AM, Victor Stinner wrote:
> 2015-12-17 11:54 GMT+01:00 Franklin? Lee:
>> My suggestion should improve *all* function calls which refer to
>> outside names.
>
> Ok, I now think that you should stop hijacking the FAT Python thread.
> I start a new thread. IMHO your dictionary reference idea is
> completely unrelated to FAT Python.
>
> FAT Python is about guards and specialized bytecode.

Yeah, maybe it's out of the scope of bytecode optimization. But I think
this would make guards unnecessary, since once a name is found, there's
a quick way to refer to it.

>> Each function keeps an indirect, automagically updated
>> reference to the current value of the names they use, and will never
>> need to look things up again.[*]
>
> Indirections, nested dictionaries, creation of new "reference"
> objects... IMHO you are going to have major implementation issues :-/
> The design looks *very* complex. I'm quite sure that you are going to
> make namespace lookups *slower*.

The nested dictionaries are only for nested scopes (and inner functions
don't create nested scopes). Nested scopes will already require
multiple lookups in parents. I think this is strictly an improvement,
except perhaps in memory.

Guards would also have an issue with nested scopes. You have a note on
your website about it:
(https://faster-cpython.readthedocs.org/fat_python.html#call-pure-builtins)

"The optimization is disabled when the builtin function is modified or
if a variable with the same name is added to the global namespace of
the function."

With a NestedRefCell, it would check globals() (a simple dereference
and `pointer != NULL`) and then check __builtin__. If it finds the name
in __builtin__, it will save a reference to that. It will only do
repeated lookups in __builtin__ if each of the previous lookups failed.
As far as I know, these repeated lookups are already necessary, and
anything that can be used to avoid them (e.g. guards) can be used for
repeated failed lookups, too.
For non-nested scopes, it will look things up once, costing an extra
RefCell creation if necessary, and the only other costs are on
resizing, deletion of the dict, and working with a larger dict in
general. The important parts of the design are pretty much in the code
that I posted. We keep an extra hash table for refs, and keep it the
same size as the original hash table, so that we pay a single lookup
cost to get the index into both.

> It reminds me of Python before the with statement, and of the PyPy
> garbage collector. Many applications relied on the exact behaviour of
> CPython's garbage collector. For example, they expected that a file is
> written to disk when the last reference to the file object is
> destroyed. In PyPy, it wasn't (it isn't) true; the write can be
> delayed.

It should not affect the semantics. Things should still happen as they
used to, as far as I can figure - or at least as far as the rules of
the interpreter are concerned. (That is, values might live a little
longer in PyPy, but can't be forced to live longer than they were
formerly allowed to.)

> I guess that with all your complex machinery for dict lookups,

The only cost to a normal getitem (again, other than from looking it up
in a bigger dict) is to make sure the return value isn't NULL. The
machinery is involved in function creation and resolution of names: on
function creation, get refs to each name used; when the name is used,
try to resolve the refs.

> you will have similar issues of object lifetime. It's unclear to me
> when and how "reference" objects are destroyed, nor when dict values
> are destroyed.

RefCells are ref-counted PyObjects, so that is not an issue. A RefCell
will live while it is useful (= it has an outside reference), or while
it's not useful but its owner dict hasn't been resized/deleted yet (at
which time RefCells without outside references will be deleted).

RefCells "know" whether they're part of a living dict. (The dict marks
them as such upon its death.)
If they are not, they will decref their value upon their death. They do
not hold a reference to their owner dict.

If it's part of a living dict, we have a choice: the dict can be
responsible for deletion, or the RefCell can be responsible for
deletion. It doesn't really matter which design we go with.

> What happens if a dict key is removed and a reference object is still
> alive? Is the dict value immediately destroyed? Does the reference
> object contain a strong or a weak reference to the value?

If a dict key is removed, the inner dict will still have the key (which
is true in the current implementation), but the value will be decref'd
and the value pointer will be NULL'd. The RefCell will not need to be
updated, since (as part of a living dict) it's pointing at the pointer
to the value, not the object itself.

If it is detached from its owner dict (due to owner
Re: [Python-Dev] Idea: Dictionary references
On Dec 17, 2015, at 07:38, Franklin? Lee wrote:
>
> The nested dictionaries are only for nested scopes (and inner
> functions don't create nested scopes). Nested scopes will already
> require multiple lookups in parents.

I think I understand what you're getting at here, but it's a really
confusing use of terminology. In Python, and in programming in general,
nested scopes refer to exactly inner functions (and classes) being
lexically nested and doing lookup through outer scopes. The fact that
this is optimized at compile time to FAST vs. CELL vs. GLOBAL/NAME,
cells are optimized at function-creation time, and only global and name
have to be resolved at the last second doesn't mean that there's no
scoping, or some other form of scoping besides lexical. The actual
semantics are LEGB, even if L vs. E vs. GB and E vs. further-out E can
be optimized.

What you're talking about here is global lookups falling back to
builtin lookups. There's no more general notion of nesting or scoping
involved, so why use those words?

Also, reading your earlier post, it sounds like you're trying to treat
attribute lookup as a special case of global lookup, only with a chain
of superclasses beyond the class instead of just a single builtins. But
they're totally different. Class lookup doesn't just look in a series
of dicts; it calls __getattribute__, which usually calls __getattr__,
which may or may not look in the __dict__s (which may not even exist)
to find a descriptor and then calls its __get__ method to get the
value.
You'd have to somehow handle the case where the search only went
through object.__getattribute__ and __getattr__ and found a result by
looking in a dict, to make a RefCell to that dict which is marked in
some way that says "I'm not a value, I'm a descriptor you have to call
each time", and then apply some guards that will detect whether that
class or any intervening class dict touched that key, whether the MRO
changed, whether that class or any intervening class added or changed
implementations for __getattribute__ or __getattr__, and probably more
things I haven't thought of. What do those guards look like? (Also, you
need a different set of rules to cache, and guard for, special method
lookup - you could just ignore that, but I think those are the lookups
that would benefit most from optimization.)

So, trying to generalize global vs. builtin to a general notion of
"nested scope" that isn't necessary for builtins and doesn't work for
anything else seems like overcomplicating things for no benefit.

> I think this is strictly an improvement, except perhaps in memory.
> Guards would also have an issue with nested scopes. You have a note on
> your website about it:
> (https://faster-cpython.readthedocs.org/fat_python.html#call-pure-builtins)
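The descriptor point can be seen with a small example: what sits in the class __dict__ is not the value an attribute access produces, so a RefCell into that dict cannot simply cache it (Counting is an illustrative non-data descriptor):

```python
class Counting:
    """A descriptor whose __get__ must run on every access."""
    def __init__(self):
        self.calls = 0
    def __get__(self, obj, objtype=None):
        self.calls += 1
        return self.calls

class C:
    attr = Counting()

c = C()
assert (c.attr, c.attr) == (1, 2)   # recomputed on each access
# The class dict holds the descriptor object itself, not either of those
# values, so caching "the value found in the dict" would change semantics.
assert isinstance(C.__dict__['attr'], Counting)
```

A cache that froze the first looked-up result would wrongly return 1 on both accesses, which is exactly the hazard the guards described above would have to detect.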
Re: [Python-Dev] Idea: Dictionary references
On Thu, Dec 17, 2015 at 11:50 AM, Victor Stinner wrote: > I don't understand how you plan to avoid guards. The purpose of guards > is to respect the Python semantic by falling back to the "slow" > bytecode if something changes. So I don't understand your idea of > completely avoiding guards. Again, that's why I guess that it's > unrelated to FAT Python... Yeah, I guess it is. Maybe we could've moved to python-ideas. As far as I understand, this concept can be put into CPython. (Note to self: Look up how PyPy handles repeated lookups.) >> Guards would also have an issue >> with nested scopes. You have a note on your website about it: >> (https://faster-cpython.readthedocs.org/fat_python.html#call-pure-builtins) >> >> "The optimization is disabled when the builtin function is >> modified or if a variable with the same name is added to the global >> namespace of the function." > > FAT Python doesn't emit a specialized version if it requires a builtin > function, but a local variable with the same name is found. > > The check is done in the current function but also in the parent > namespaces, up to the global namespace. I'm talking about the static > analysis of the code. > > If the specialized version is built, a guard is created on the builtin > namespace and another on the global namespace. > > Sorry, I don't understand the "problem" you are referring to. Can you > maybe show an example of code where FAT Python doesn't work or where > your idea can help? Suppose we have to look in scopes A, B, C in order, where the name is in C but never in B, and there's no nesting relationship between B and C. In that case, I do not create a refcell in B chained to C (because there's no relationship), so I keep doing lookups in B. That's the problem. For that, guards and versioning can prevent unnecessary lookups in B. Though I think a better solution might be: If a name is found in C, create empty refcells in B and A (i.e. in all previous dicts).
(There's a nesting relationship between globals() and __builtin__, so that's fine.) >> RefCells "know" whether they're part of a living dict. (The dict marks >> them as such upon its death.) If they are not, they will decref their >> value upon their death. They do not hold a reference to their owner >> dict. > > The dict contains the list of all of its "own" RefCell objects, right? It contains a table of pointers. The pointers are to RefCell objects or NULL. The refs table is exactly the same size as the internal hash table. This makes indexing it efficient: to find the pointer to a refcell, find the index of the key in the hash table, then use that SAME index on the refs table. You never need to find a refcell without also finding its hash index, so this is cheap. > In short, RefCell is a view on a single dict entry, to be able to > "watch" a dict entry without the cost of a lookup? And a RefCell can > be created, even if the dict entry doesn't exist right? My "implementation", which had nesting and recursion in mind, had a "create_if_none" parameter, which meant that the requester can ask for it to be created even if the key didn't exist in the table. Pre-creation is useful for functions which refer to globals() names before they're defined. No-creation is useful in... I can only think of nesting as a use (globals() -> __builtin__ shouldn't create empty cells in __builtin__). See `getref` in here: https://mail.python.org/pipermail/python-dev/2015-December/142489.html > Hum, creating many RefCell introduces an overhead somewhere. For > example, deleting a key has to update N RefCell objects linked to this > key, right? So del dict[x] takes O(number of RefCell), right? There are no "outside" updates, except when a dict moves to a different internal table or deletes its internal table. In that case, the dict has to move and update each exposed RefCell. For each dict, for each key, there is at most one RefCell.
As long as the dict is alive, that RefCell will hold a pointer to the pointer to the value (enforced by the dict). When the dict dies, it makes the RefCell point to the object, and tells the RefCell it's free (so it's in charge of cleaning up its value). Dict entries look like this:

typedef struct {
    /* Cached hash code of me_key. */
    Py_hash_t me_hash;
    PyObject *me_key;
    PyObject *me_value; /* This field is only meaningful for combined tables */
} PyDictKeyEntry;

The internal table (which the ref table will sync with) is an array of PyDictKeyEntrys. (Raymond Hettinger made a design with a smaller table, where the hash lookup would be into an array of indices. This would make synced metadata tables both easier and smaller. See https://mail.python.org/pipermail/python-dev/2012-December/123028.html and the latest relevant discussion https://mail.python.org/pipermail/python-ideas/2015-December/037468.html ) The refcell will hold a pointer to the me_value field, not to the value itself.
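Franklin's scheme — at most one cell per (dict, key), kept in sync by the owning dict, and pre-created in every scope searched before a hit — can be sketched in pure Python. All names here (ScopeDict, RefCell, getref, chain_lookup) are illustrative, not from an actual patch, and resizing/deletion are glossed over:

```python
class RefCell:
    """Illustrative stand-in for the proposed refcell: one per (dict, key)."""
    __slots__ = ("value", "filled")
    def __init__(self):
        self.value = None
        self.filled = False

class ScopeDict(dict):
    """Dict that hands out RefCells and keeps them in sync on writes."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._refs = {}                      # key -> RefCell (at most one each)

    def getref(self, key, create_if_none=False):
        cell = self._refs.get(key)
        if cell is None and (key in self or create_if_none):
            cell = self._refs[key] = RefCell()
            if key in self:
                cell.value, cell.filled = self[key], True
        return cell

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        cell = self._refs.get(key)
        if cell is not None:                 # no extra lookup to update a cell
            cell.value, cell.filled = value, True

def chain_lookup(scopes, name):
    """Search scopes in order, pre-creating empty cells in every scope
    searched before the hit ("create refcells in all previous dicts")."""
    for scope in scopes:
        cell = scope.getref(name, create_if_none=True)
        if cell.filled:
            return cell.value
    raise NameError(name)
```

A name found in C leaves empty cells behind in A and B, so a later shadowing write to B is caught on the next access without repeating the failed lookups.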
Re: [Python-Dev] Idea: Dictionary references
On Thu, Dec 17, 2015 at 12:30 PM, Andrew Barnert wrote: > On Dec 17, 2015, at 07:38, Franklin? Lee > wrote: >> >> The nested dictionaries are only for nested scopes (and inner >> functions don't create nested scopes). Nested scopes will already >> require multiple lookups in parents. > > I think I understand what you're getting at here, but it's a really confusing > use of terminology. In Python, and in programming in general, nested scopes > refer to exactly inner functions (and classes) being lexically nested and > doing lookup through outer scopes. The fact that this is optimized at compile > time to FAST vs. CELL vs. GLOBAL/NAME, cells are optimized at > function-creation time, and only global and name have to be resolved at the > last second doesn't mean that there's no scoping, or some other form of > scoping besides lexical. The actual semantics are LEGB, even if L vs. E vs. > GB and E vs. further-out E can be optimized. Oh, I've never actually read the Python scoping rules spelled out. I wasn't sure if there were other cases. The other two cases I thought of as "nesting" were: object to its class, and class to its superclasses. > Also, reading your earlier post, it sounds like you're trying to treat > attribute lookup as a special case of global lookup, only with a chain of > superclasses beyond the class instead of just a single builtins. But they're > totally different. Class lookup doesn't just look in a series of dicts, it > calls __getattribute__ which usually calls __getattr__ which may or may not > look in the __dict__s (which may not even exist) to find a descriptor and > then calls its __get__ method to get the value.
You'd have to somehow handle > the case where the search only went through object.__getattribute__ and > __getattr__ and found a result by looking in a dict, to make a RefCell to > that dict which is marked in some way that says "I'm not a value, I'm a > descriptor you have to call each time", and then apply some guards that will > detect whether that class or any intervening class dict touched that key, > whether the MRO changed, whether that class or any intervening class added or > changed implementations for __getattribute__ or __getattr__, and probably more things I haven't thought of. What do those guards look like? (Also, you need a different set of rules to cache, and guard for, special method lookup--you could just ignore that, but I think those are the lookups that would benefit most from optimization.) Doesn't __getattr__ only get called if all the mro __dict__ lookups failed? I forgot about __getattribute__. That might be the point at which refs are optimized. As for descriptors versus RefCells, I'm guessing that can be resolved, as soon as I figure out how descriptors actually work... If descriptors don't modify the __dict__, then RefCells shouldn't get involved. If they do, then there's some unwrapping going on there, and RefCells should fit right in (though whether they'll improve anything is a different question). RefCells are just a shortcut for dict lookups. For guards, I think Victor Stinner's idea could supplement this. Alternatively, in my other email, I said there could be a rule of, "Create intermediate RefCells for anything BEFORE a successful lookup." So if we look in A, B, C, D, and find it in C, then we create and save RefCells in A, B, C, but not D (where D = object). This MIGHT result in a lot of intermediate RefCells, but I'd guess most things aren't looked up just once, and it's saying, "It's possible for B to gain member B.x and catch me on my way to C.x." > So, trying to generalize global vs.
builtin to a general notion of "nested > scope" that isn't necessary for builtins and doesn't work for anything else > seems like overcomplicating things for no benefit. Probably. The globals() and __builtin__ case is simpler than the class case.
Re: [Python-Dev] Idea: Dictionary references
On Thursday, December 17, 2015 11:19 AM, Franklin? Lee wrote: > ... > as soon as I figure out how descriptors actually work... I think you need to learn what LOAD_ATTR and the machinery around it actually does before I can explain why trying to optimize it like globals-vs.-builtins doesn't make sense. Maybe someone who's better at explaining than me can come up with something clearer than the existing documentation, but I can't.
Re: [Python-Dev] Idea: Dictionary references
On Friday, December 18, 2015, Andrew Barnert via Python-Dev <python-dev@python.org> wrote: > > >> Builtins do two dict lookups. > > > > Two? > > Actually, I guess three: first you fail to find the name in globals, then > you find __builtins__ in globals, then you find the name in __builtins__ or > __builtins__.__dict__. > Getting builtins from globals is done when the frame object is created, and the lookup is skipped in the common case, if I recall correctly. LOAD_NAME does a lookup on function globals; if the key doesn't exist (the common case for builtins), a second lookup is done in frame builtins. Open Python/ceval.c and see the code. There is an optimisation for fast lookup in the two dicts. Victor
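Victor's description corresponds roughly to this pure-Python rendering of the fast path — a sketch, not the actual ceval.c logic, with the builtins dict assumed to have been resolved once at frame creation:

```python
def load_global(name, frame_globals, frame_builtins):
    """Rough model of LOAD_GLOBAL/LOAD_NAME's globals-then-builtins fallback."""
    try:
        return frame_globals[name]           # first dict lookup
    except KeyError:
        try:
            return frame_builtins[name]      # second lookup, common for builtins
        except KeyError:
            raise NameError(f"name {name!r} is not defined") from None
```

So a builtin reference costs two dict lookups per execution, which is the step the refcell idea (and FAT Python's constant-copying) tries to eliminate.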
Re: [Python-Dev] Idea: Dictionary references
2015-12-17 23:17 GMT+01:00 Andrew Barnert via Python-Dev: > Builtins do two dict lookups. > > So, the only thing you can optimize there is builtins. But maybe that's worth > it. FYI I implemented an optimization in FAT Python to avoid lookups for builtin functions: builtin functions are copied to code constants at runtime: https://faster-cpython.readthedocs.org/fat_python.html#copy-builtin-functions-to-constants It's nothing new, it's the generalization of common hacks, like 'def func(len=len): return len(3)'. The optimization is restricted to loading builtin symbols which are not expected to be modified ;-) (Right now the optimization is disabled by default, because the optimizer is unable to detect when builtins are modified in the current function, and so it breaks the Python semantic.) > Class attributes (including normal methods, @property, etc.) do two or more > dict lookups--first the instance, then the class, then each class on the > class's MRO. Note: Types have an efficient cache for name lookups ;-) Thanks to this cache, it's no longer an issue to have a deep hierarchy of classes. Victor
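The "common hack" Victor generalizes looks like this in practice: binding the builtin as a default argument snapshots it at definition time, turning two dict lookups into a cheap local load — and the snapshot ignoring later rebinding is exactly the semantic change that FAT Python's guards have to watch for:

```python
def fast_len(data, len=len):        # 'len' captured once, at def time
    return len(data)

len = lambda obj: -1                # shadow the builtin at module level

def slow_len(data):
    return len(data)                # normal lookup: sees the shadow

assert fast_len([1, 2, 3]) == 3    # snapshot ignores the shadowing
assert slow_len([1, 2, 3]) == -1
del len                             # restore normal builtin lookup
```

FAT Python does the same capture automatically (into code constants) and uses guards on the global and builtin namespaces to fall back to the slow bytecode when someone shadows the name.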
Re: [Python-Dev] Idea: Dictionary references
I already know that we can't use recursion, because it bypasses MRO. I'm also not yet sure whether it makes sense to use refs for classes in the first place. As I understood it, an attribute will resolve in this order: - __getattribute__ up the MRO. (raises AttributeError) - __dict__ up the MRO. (raises KeyError) - __getattr__ up the MRO. (raises AttributeError) My new understanding: - __getattribute__. (raises AttributeError) - (default implementation:) __dict__.__getitem__. (raises KeyError) - __getattr__ up the MRO. (raises AttributeError) If this is the case, then (the default) __getattribute__ will be making the repeated lookups, and might be the one requesting the refcells (for the ones it wants). Descriptors seem to be implemented as: Store a Descriptor object as an attribute. When a Descriptor is accessed, if it is being accessed from its owner, then unbox it and use its methods. Otherwise, it's a normal attribute. Then Descriptors are in the dict, so MIGHT benefit from refcells. The memory cost might be higher, though. On Thu, Dec 17, 2015 at 5:17 PM, Andrew Barnert wrote: > On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev > wrote: >> >> On Thursday, December 17, 2015 11:19 AM, Franklin? Lee >> wrote: >> >> >>> ... >>> as soon as I figure out how descriptors actually work... >> >> >> I think you need to learn what LOAD_ATTR and the machinery around it >> actually does before I can explain why trying to optimize it like >> globals-vs.-builtins doesn't make sense. Maybe someone who's better at >> explaining than me can come up with something clearer than the existing >> documentation, but I can't. > > I take that back. First, it was harsher than I intended. Second, I think I > can explain things. I appreciate it! Tracking function definitions in the source can make one want to do something else. > First, for non-attribute lookups: > > (Non-shared) locals just load and save from an array.
> > Free variables and shared locals load and save by going through an extra > dereference on a cell object in an array. In retrospect, of course they do. It sounds like the idea is what's already used there, except the refs are synced to the locals array instead of a hash table. > Globals do a single dict lookup. A single dict lookup per function definition per name used? That's what I'm proposing. For example, (and I only just remembered that comprehensions and gen expressions create scope) [f(x) for x in range(1)] would look up the name `f` at most twice (once in globals(), once in builtins() if needed), and will always have the latest version of `f`. And if it's in a function, the refcell(s) would be saved by the function. > Builtins do two dict lookups. Two? > Class attributes (including normal methods, @property, etc.) do two or more > dict lookups--first the instance, then the class, then each class on the > class's MRO. Then, if the result has a __get__ method, it's called with the > instance and class to get the actual value. This is how bound methods get > created, property lookup functions get called, etc. The result of the > descriptor call can't get cached (that would mean, for example, that every > time you access the same @property on an instance, you'd get the same value). Yeah, I would only try to save in a dict lookup to get the descriptor, and I'm not sure it's worth it. (Victor's response says that class attributes are already efficient, though.) > So, if the globals->builtins optimization is worth doing, don't tie it to > another optimization that's much more complicated and less useful like this, > or we'll never get your simple and useful idea. Sure. I couldn't figure out where to even save the refcells for attributes, so I only really saw an opportunity for name lookups. Since locals and nonlocals don't require dict lookups, this means globals() and __builtin__. 
Re: [Python-Dev] Idea: Dictionary references
On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev wrote: > > On Thursday, December 17, 2015 11:19 AM, Franklin? Lee > wrote: > > >> ... >> as soon as I figure out how descriptors actually work... > > > I think you need to learn what LOAD_ATTR and the machinery around it actually > does before I can explain why trying to optimize it like globals-vs.-builtins > doesn't make sense. Maybe someone who's better at explaining than me can come > up with something clearer than the existing documentation, but I can't. I take that back. First, it was harsher than I intended. Second, I think I can explain things. First, for non-attribute lookups: (Non-shared) locals just load and save from an array. Free variables and shared locals load and save by going through an extra dereference on a cell object in an array. Globals do a single dict lookup. Builtins do two dict lookups. So, the only thing you can optimize there is builtins. But maybe that's worth it. Next, for attribute lookups (not counting special methods): Everything calls __getattribute__. Assuming that's not overridden and uses the object implementation: Instance attributes do one dict lookup. Class attributes (including normal methods, @property, etc.) do two or more dict lookups--first the instance, then the class, then each class on the class's MRO. Then, if the result has a __get__ method, it's called with the instance and class to get the actual value. This is how bound methods get created, property lookup functions get called, etc. The result of the descriptor call can't get cached (that would mean, for example, that every time you access the same @property on an instance, you'd get the same value). Dynamic attributes from a __getattr__ do all that plus whatever __getattr__ does. If __getattribute__ is overloaded, it's entirely up to that implementation to do whatever it wants.
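The "result can't get cached" point is easy to demonstrate: a @property getter runs on every access, so caching its first result would change observable behavior. A minimal example:

```python
class Ticket:
    """Each access to next_id re-runs the descriptor's __get__."""
    def __init__(self):
        self._n = 0

    @property
    def next_id(self):          # property.__get__ calls this every time
        self._n += 1
        return self._n

t = Ticket()
assert (t.next_id, t.next_id) == (1, 2)   # a cached result would give (1, 1)
```

At most you could cache the descriptor object itself (the property), never the value its __get__ returns.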
Things are similar for set and del: they call __setattr__/__delattr__, and the default versions of those look in the instance dict first, then look for a descriptor the same as with get except that they call a different method on the descriptor (and if it's not a descriptor, instead of using it, they ignore it and go back to the instance dict). So, your mechanism can't significantly speed up method lookups, properties, or most other things. It could speed up lookups for class attributes that aren't descriptors, but only at the cost of increasing the size of every instance--and how often do those matter anyway? A different mechanism that cached references to descriptors instead of to the resulting attributes could speed up method lookups, etc., but only by a very small amount, and with the same space cost. A mechanism that didn't try to get involved with the instance dict, and just flattened out the MRO search once that failed (and was out of the way before the descriptor call or __getattr__ even entered the picture) might speed methods up in deeply nested hierarchies, and with only a per-class rather than a per-instance space cost. But how often do you have deeply-nested hierarchies? And the speedup still isn't going to be that big: You're basically turning 5 dict lookups plus 2 method calls into 2 dict lookups plus 2 method calls. And it would still be much harder to guard than the globals dict: if any superclass changes its __bases__ or adds or removes a __getattribute__ or various other things, all of your references have to get re-computed. That's rare enough that the speed may not matter, but the code complexity probably does. In short: if you can't cache the bound methods (and as far as I can tell, in general you can't--even though 99% of the time it would work), I don't think there's any other significant win here.
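The "flatten out the MRO search" idea can be sketched as a per-class cache. This is purely illustrative: it ignores the invalidation guards Andrew lists (changes to __bases__, any class dict along the MRO, etc.), which are the hard part:

```python
def flatten_mro(cls):
    """Collapse every __dict__ along the MRO into one mapping, so a miss
    in the instance dict costs one lookup instead of one per base class.
    Any change to any class along the MRO must invalidate this cache."""
    flat = {}
    for klass in reversed(cls.__mro__):   # more-derived classes win
        flat.update(vars(klass))
    return flat

class A:
    def ham(self):
        return "A.ham"

class B(A):
    def eggs(self):
        return "B.eggs"

flat = flatten_mro(B)
assert flat["ham"] is vars(A)["ham"]      # inherited, one flat lookup away
assert flat["eggs"] is vars(B)["eggs"]
```

Note the flat dict holds the raw class-dict entries (unbound functions, properties, etc.), so the descriptor call still has to happen after the lookup — which is why the win is only "5 dict lookups plus 2 method calls into 2 dict lookups plus 2 method calls."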
So, if the globals->builtins optimization is worth doing, don't tie it to another optimization that's much more complicated and less useful like this, or we'll never get your simple and useful idea.
Re: [Python-Dev] Idea: Dictionary references
On Thu, Dec 17, 2015 at 6:41 PM, Franklin? Lee wrote: > Then Descriptors are in the dict, so MIGHT benefit from refcells. The > memory cost might be higher, though. Might be worse than the savings, I mean.
Re: [Python-Dev] Idea: Dictionary references
On Dec 17, 2015, at 15:41, Franklin? Lee wrote: > > I already know that we can't use recursion, because it bypasses MRO. > I'm also not yet sure whether it makes sense to use refs for classes > in the first place. > > As I understood it, an attribute will resolve in this order: > - __getattribute__ up the MRO. (raises AttributeError) > - __dict__ up the MRO. (raises KeyError) > - __getattr__ up the MRO. (raises AttributeError) > > > My new understanding: > - __getattribute__. (raises AttributeError) > - (default implementation:) __dict__.__getitem__. (raises KeyError) > - __getattr__ up the MRO. (raises AttributeError) No, still completely wrong. If __getattribute__ raises an AttributeError (or isn't found, but that only happens in special cases like somehow calling a method on a type that hasn't been constructed), that's the end of the line; there's no fallback, and everything else (IIRC: searching MRO dicts for data descriptors, searching the instance dict, searching MRO dicts for non-data descriptors or non-descriptors, special-method-lookup-and-call __getattr__, raise AttributeError... and then doing the appropriate descriptor call at the end if needed) happens inside the default __getattribute__. I was going to say that the only custom __getattribute__ you'll find in builtins or stdlib is on type, which does the exact same thing except when it calls a descriptor it does __get__(None, cls) instead of __get__(obj, type(obj)), and if you find any third-party __getattribute__ you should just assume it's going to do something crazy and don't bother trying to help it. But then I remembered that super must have a custom __getattribute__, so... you'd probably need to search the code for others. > If this is the case, then (the default) __getattribute__ will be > making the repeated lookups, and might be the one requesting the > refcells (for the ones it wants). Yes, the default and type __getattribute__ are what you'd want to optimize, if anything. And maybe special-method lookup.
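The order Andrew describes — data descriptors on the type beat the instance dict, which beats non-data descriptors and plain class attributes — can be approximated in pure Python. This is a simplified sketch (it skips metaclasses, __getattr__, and several edge cases the real object.__getattribute__ handles):

```python
_MISSING = object()

def lookup(obj, name):
    """Simplified model of the default __getattribute__ search order."""
    # Find the first binding for `name` along the type's MRO.
    meta_attr = _MISSING
    for klass in type(obj).__mro__:
        if name in vars(klass):
            meta_attr = vars(klass)[name]
            break
    meta_type = type(meta_attr)
    # 1. Data descriptors (define __set__ or __delete__) win outright.
    if (meta_attr is not _MISSING
            and hasattr(meta_type, "__get__")
            and (hasattr(meta_type, "__set__") or hasattr(meta_type, "__delete__"))):
        return meta_type.__get__(meta_attr, obj, type(obj))
    # 2. Then the instance dict.
    inst_dict = getattr(obj, "__dict__", {})
    if name in inst_dict:
        return inst_dict[name]
    # 3. Then non-data descriptors (e.g. functions) and plain attributes.
    if meta_attr is not _MISSING:
        if hasattr(meta_type, "__get__"):
            return meta_type.__get__(meta_attr, obj, type(obj))
        return meta_attr
    raise AttributeError(name)

class Demo:
    @property
    def p(self):
        return "from-property"
    def m(self):
        return "method"

d = Demo()
d.__dict__["p"] = "instance"          # data descriptor still wins
assert lookup(d, "p") == "from-property"
d.__dict__["m"] = lambda: "instance"  # instance dict beats non-data descriptor
assert lookup(d, "m")() == "instance"
assert lookup(Demo(), "m")() == "method"   # bound via function.__get__
```

This is why a refcell into any single dict can't shortcut the whole search: a later write to the instance dict, or a data descriptor appearing anywhere along the MRO, legitimately changes the result.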
> Descriptors seem to be implemented as: >Store a Descriptor object as an attribute. When a Descriptor is > accessed, if it is being accessed from its owner, then unbox it and > use its methods. Otherwise, it's a normal attribute. Depending on what you mean by "owner", I think you have that backward. If the instance itself stores a descriptor, it's just used as itself; if the instance's _type_ (or a supertype) stores one, it's called to get the instance attribute. > Then Descriptors are in the dict, so MIGHT benefit from refcells. The > memory cost might be higher, though. Same memory cost. They're just objects whose type's dicts happen to have a __get__ method (just like iterables are just objects whose type's dicts happen to have an __iter__ method). The point is that you can't cache the result of the descriptor call, you can cache the descriptor itself but it will rarely help, and the builtin method cache probably already takes care of 99% of the cases where it would help, so I don't see what you're going to get here. >> On Thu, Dec 17, 2015 at 5:17 PM, Andrew Barnert wrote: >>> On Dec 17, 2015, at 13:37, Andrew Barnert via Python-Dev >>> wrote: >>> >>> On Thursday, December 17, 2015 11:19 AM, Franklin? Lee >>> wrote: >>> >>> ... as soon as I figure out how descriptors actually work... >>> >>> >>> I think you need to learn what LOAD_ATTR and the machinery around it >>> actually does before I can explain why trying to optimize it like >>> globals-vs.-builtins doesn't make sense. Maybe someone who's better at >>> explaining than me can come up with something clearer than the existing >>> documentation, but I can't. >> >> I take that back. First, it was harsher than I intended. Second, I think I >> can explain things. > > I appreciate it! Tracking function definitions in the source can make > one want to do something else. The documentation is pretty good for this stuff (and getting better every year). 
You mainly want the data model chapter of the reference and the descriptor howto guide; the dis and inspect docs in the library can also be helpful. Together they'll answer most of what you need. If they don't, maybe I will try to write up an explanation as a blog post, but I don't think it needs to get sent to the list (except for the benefit of core devs calling me out of I screw up, but they have better things to do with their time). >> First, for non-attribute lookups: >> >> (Non-shared) locals just load and save from an array. >> >> Free variables and shared locals load and save by going through an extra >> dereference on a cell object in an array. > > In retrospect, of course they do. > > It sounds like the idea is what's already used there, except the refs > are synced to the locals array instead of a hash table. Yes, which is already faster than what you want to do. More
Re: [Python-Dev] Idea: Dictionary references
On Fri, Dec 18, 2015 at 2:37 PM, Andrew Barnert via Python-Dev wrote: > If __getattribute__ raises an AttributeError (or isn't found, but that only > happens in special cases like somehow calling a method on a type that hasn't > been constructed) Wow. How do you do that? Is it possible with pure Python? ChrisA
Re: [Python-Dev] Idea: Dictionary references
2015-12-17 11:54 GMT+01:00 Franklin? Lee: > My suggestion should improve *all* function calls which refer to > outside names. Ok, I now think that you should stop hijacking the FAT Python thread. I'm starting a new thread. IMHO your dictionary reference idea is completely unrelated to FAT Python. FAT Python is about guards and specialized bytecode. > Each function keeps an indirect, automagically updated > reference to the current value of the names they use, and will never > need to look things up again.[*] Indirections, nested dictionaries, creation of new "reference" objects... IMHO you are going to have major implementation issues :-/ The design looks *very* complex. I'm quite sure that you are going to make namespace lookups *slower*. It reminds me of Python before the with statement and the PyPy garbage collector. Many applications relied on the exact behaviour of CPython's garbage collector. For example, they expected that a file is written to disk when the last reference to the file object is destroyed. In PyPy, it wasn't (it isn't) true; the write can be delayed. I guess that with all your complex machinery for dict lookups, you will have similar issues of object lifetime. It's unclear to me when and how "reference" objects are destroyed, nor when dict values are destroyed. What happens if a dict key is removed and a reference object is still alive? Is the dict value immediately destroyed? Does the reference object contain a strong or a weak reference to the value? Victor