[issue22084] Mutating while iterating
New submission from Aaron Brady: Hi, I asked about the inconsistency of the RuntimeError being raised when mutating a container while iterating over it here [1], set and dict iteration on Aug 16, 2012. [1] http://www.gossamer-threads.com/lists/python/python/1004659 I posted a patch on the ML but never submitted it. People's reaction seemed ambivalent. Now I have an idea for a different implementation. I'd like to take another shot at it. It's one of the worst silent errors, since there's an error in the *iterator* when we call a *set* method. We're going to add something to make it safer, at least in the sense of getting a clear failure, if the programmer does something that's always been ill-advised. We have a number of options for the implementation. We still have the option to introduce IterationError, possibly a subclass of RuntimeError. These options are still applicable: 1) Collection of iterators . Invalidate all open iterators on mutating . a) Linked list .. i) Uncounted references .. ii) Counted references .. iii) Weak references . b) Weak set 2) Version index / timestamp / memo . Iterators check whether the container has been mutated since they were created . a) No overflow - Python longs .. i) Reset index if no iterators left . b) Overflow - C ints / shorts (silent error) 3) Iterator count . Raise exception on mutation, not iteration The new option is: 2d) Use a dedicated empty *object* for a timestamp or memo. A new memo is created on every mutation. Before advancing, the iterator checks whether the current memo is a different object than it was when it was created. Costs: The existing silent error is fairly rare. The container gains a pointer to its current memo. The iterator loses the cached length but gains a pointer to a memo. The memos are blank objects: a Py ssize t and a pointer with certain flags at time of writing. Speed is the same: comparing the lengths is replaced with comparing the memos. Some caveats: The memory manager is used to obtain perpetually unique IDs. A unique algorithm could be used instead of the memory manager, though the memo needs to contain a reference count more or less regardless. There can at most be one memo per iterator. The approach is outlined in pseudocode here [2]. Implementation could be optimized slightly by only creating new memos if iterators have been opened, shown here [3]. [2] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%20markup.html [3] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%202%20markup.html -- components: Library (Lib) messages: 224071 nosy: castironpi priority: normal severity: normal status: open title: Mutating while iterating type: behavior versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22084 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22084] Mutating while iterating
Serhiy Storchaka added the comment: It would be better discuss such ideas on python-ideas mailing list (http://mail.python.org/mailman/listinfo/python-ideas). Option 3 breaks existing code such as for k, v in d.items(): if pred(k, v): d[k] = newvalue break Option 1 is memory inefficient. It requires a list of iterators in every dict (well, in almost every dict). And it doesn't look more time efficient than option 2. Implementation of option 2 was provided and rejected in issue19332. -- nosy: +rhettinger, serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22084 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22084] Mutating while iterating
Nick Coghlan added the comment: Raymond's answer at http://bugs.python.org/issue19332#msg202287 still stands: the checks for mutation while iterating are there primarily to *protect the interpreter itself*, rather than to help detect bugs where the user code misses some items because it mutated the dict during iteration. The fact it helps detect user errors is a helpful side effect of the interpreter protecting itself, rather than the *purpose* of the change. Mutations that don't change the mapping size don't do the interpreter any harm, even if they're not what the user intended: d = dict(a=1, b=2, c=3) for k, v in d.items(): ... del d[k] ... d[v] = k ... d {1: 'a', 2: 'b', 3: 'c'} for k, v in d.items(): ... del d[k] ... d[v] = k ... d {'a': 1, 'b': 2, 'c': 3} There are *lots* of ways to write buggy code that are significantly less obscure than this, and we don't change the way core data types work to prevent them. For example, list iterators don't care if the underlying list is mutated at all, as again, such mutation poses no threat to the interpreter: seq = [x for x in range(10)] for i, x in enumerate(seq): ... print(i, x) ... del seq[i] ... 0 0 1 2 2 4 3 6 4 8 This kind of quirky behaviour is why be careful when mutating containers you're iterating over is sound advice. It *isn't* a reason to change the behaviour of the language to make currently legal (albeit odd) code a runtime error. -- nosy: +ncoghlan resolution: - wont fix stage: - resolved status: open - closed versions: -Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22084 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com