Hello, It seems that we both agree that freezing is cool (-; . We disagree on whether a copy-on-write behaviour is desired. Your arguments agains copy-on-write are: 1. It's not needed. 2. It's complicated to implement.
But first of all, you didn't like my use cases. I want to argue with that. > In reading your description of a 'table of values', I can't help but be > reminded of the wxPython (and wxWidget) wx.Grid and its semantics. It > offers arbitrary tables of values (whose editors and viewers you can > change at will), which offers a mechanism by which you can "listen" to > changes that occur to the contents of a cell. I can't help but think > that if you offered a protocol by which a user can signal that a cell > has been changed, perhaps by writing the value to the table itself > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 > freeze), etc., that both you and the users of your table would be much > happier. Perhaps I didn't make it clear. The difference between wxPython's Grid and my table is that in the table, most values are *computed*. This means that there's no point in changing the values themselves. They are also used frequently as set members (I can describe why, but it's a bit complicated.) I want to say that even if sets weren't used, the objects in the table should have been frozen. The fact the sets (and dicts) only allow immutable objects as members/keys is just for protecting the user. They could have declared, "you shouldn't change anything you insert - as long as you don't, we'll function properly." The only reason why you can't compute hash values of mutable objects is that you don't want your user to make mistakes, and make the data structure inconsistent. > As for the graph issue, you've got a bigger problem than users just > being able to edit edge lists, users can clear the entire dictionary of > vertices (outgoing.clear()). It seems to me that a more reasonable > method to handle this particular case is to tell your users "don't > modify the dictionaries or the edge lists", and/or store your edge lists > as tuples instead of lists or dictionaries, and/or use an immutable > dictionary (as offered by Barry in the PEP). As I wrote before, telling my users "don't modify the edge lists" is just like making lists hashable, and telling all Python users, "dont modify lists that are dictionary keys." There's no way to tell the users that - there's no convention for objects which should not be changed. You can write it in the documentation, but who'll bother looking there? I don't think that your other suggestions will work: the data structure of the graph itself can't be made of immutable objects, because of the fact that the graph is a mutable object - you can change it. It can be made of immutable objects, but this means copying all the data every time the graph changes. Now, about copy-on-write: > There's also this little issue of "copy on write" semantics with Python. > Anyone who tells you that "copy on write" is easy, is probably hanging > out with the same kind of people who say that "threading is easy". Of > course both are easy if you limit your uses to some small subset of > interesting interactions, but "copy on write" gets far harder when you > start thinking of dictionaries, lists, StringIOs, arrays, and all the > possible user-defined classes, which may be mutated beyond obj[key] = > value and/or obj.attr = value (some have obj.method() which mutates the > object). As such, offering a callback mechanism similar to weak > references is probably pretty close to impossible with CPython. Let's limit ourselves to copy-on-write of objects which do not contain nonfrozen objects. Perhaps it's enough - the table, the graph, and strings, are perfect examples of these. Implementation doesn't seem to complicated to me - whenever the object is about to change, and there is a connected frozen copy, you make a shallow copy of the object, point the frozen copy to it, release the reference to the frozen copy, and continue as usual. That's all. I really think that this kind of copy-on-write is "correct". The temporary freezing of sets in order to check if they are members of other sets is a not-very-nice way of implementing it. This kind of copy-on-write would allow, in principle, for Python strings to become mutable, with almost no speed penalty. It would allow my table, and other containers, to automatically freeze the objects that get into it, without having to trust the user on not changing the objects - and remember that there's no way of *telling* him not to change the objects. Now, the computer scientist in me wants to explain (and think about) freezing containers of nonfrozen objects. What I actually want is that as long as an object doesn't change after it's freezed, the cost of freezing would be nothing - that is, O(1). Think about a mutable string object, which is used in the same way as the current, immutable strings. It is constructed once, and then may be used as a key in a dictionary many times. I want to claim that it's a common pattern - create an object, it doesn't matter how, and then use it without changing it. If that is the case, it's obvious that all the frozen() calls would take O(1) each. How can we accomplish this (freezing costs O(1) as long as the object doesn't change) with containers of nonfrozen objects? It seems impossible - no matter what, on the first time the container is freezed, you would have to call frozen() for every object it contains! The answer is that in an amortized analysis, it is still an O(1) operation. The reason is that as long as frozen() takes O(1) (amortized), all those calls to frozen() can be considered a part of the object construction, since they are made only once - on the next call to frozen(), the already-created frozen object would be returned. This analysis is correct as long as the object doesn't change after it's freezed. The problem is that we have to keep the created frozen object as long as the original object stays alive. So we have to know if it has changed. This is where those callbacks get in. As long as what is done with them is correct, there should be no problems. They are used only to disengage the frozen copies from their original objects. The action they should trigger is simply that: def on_contained_object_change(self): self._frozen_copy = None while self._callbacks: self._callbacks.pop()() What's also interesting is that this freezing mechanism can be provided automatically for user-created classes, since those are simply containers of other objects, which behave exactly like dicts, for this matter. It allows everything in Python to be both mutable and hashable, without changing the O() complexity! Wow! Ok, I'm going to sleep now. If you find something wrong with this idea, please tell me. Have a good day, Noam _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com