Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Josiah Carlson

Noam Raphael [EMAIL PROTECTED] wrote:
 On 10/31/05, Josiah Carlson [EMAIL PROTECTED] wrote:

   About the users-changing-my-internal-data issue:
 ...
  You can have a printout before it dies:
  I'm crashing your program because something attempted to modify a data
  structure (here's the traceback), and you were told not to.
 
  Then again, you can even raise an exception when people try to change
  the object, as imdict does, as tuples do, etc.
 
 Both solutions would solve the problem, but would require me to wrap
 the built-in set with something which doesn't allow changes. This is a
 lot of work - but it's quite similiar to what my solution would
 actually do, in a single built-in function.

I am an advocate for PEP 351.  However, I am against your proposed
implementation/variant of PEP 351 because I don't believe it ads enough
to warrant the additional complication and overhead necessary for every
object (even tuples would need to get a .frozen_cache member).

Give me a recursive freeze from PEP 351 (which handles objects that are
duplicated, but errors out on circular references), and I'll be happy.


   You suggest two ways for solving the problem. The first is by copying
   my mutable objects to immutable copies:
 
  And by caching those results, then invalidating them when they are
  updated by your application.  This is the same as what you would like to
  do, except that I do not rely on copy-on-write semantics, which aren't
  any faster than freeze+cache by your application.
 
 This isn't correct - freezing a set won't require a single copy to be
 performed, as long as the frozen copy isn't saved after the original
 is changed. Copy+cache always requires one copy.

You are wrong, and you even say you are wrong...freezing a set doesn't
require a COPY, IF the frozen COPY isn't saved after the original is
CHANGED. Creating an immutable set IS CREATING A COPY, so it ALSO
copies, and you admit as much, but then say the equivalent of copying
isn't copying because I say so.


  In any case, whether you choose to use freeze, or use a different API,
  this particular problem is solvable without copy-on-write semantics.
 
 Right. But I think that a significant simplification of the API is a
 nice bonus for my solution. And about those copy-on-write semantics -
 it should be proven how complex they are. Remember that we are talking
 about frozen-copy-on-write, which I think would simplify matters
 considerably - for example, there are at most two instances sharing
 the same data, since the frozen copy can be returned again and again.

I think that adding an additional attribute to literally every single
object to handle the caching of 'frozen' objects, as well as a list to
every object to handle callbacks which should be called on object
mutation, along with a _call_stuff_when_mutated() method that handles
these callback calls, IN ADDITION TO the __freeze__ method which is
necessary to support this, is a little much, AND IS CERTAINLY NOT A
SIMPLIFICATION!

Let us pause for a second and consider:
Original PEP proposed 1 new method: __freeze__, which could be
implemented as a subclass of the original object (now), and integrated
into the original classes as time goes on.  One could /register/
__freeze__ functions/methods a'la Pickle, at which point objects
wouldn't even need a native freeze method.

Your suggestion offers 2 new methods along with 2 new instance variables. 
Let's see, a callback handler, __freeze__, the cache, and the callback
list.  Doesn't that seem a little excessive to you to support freezing?
It does to me.  If Guido were to offer your implementation of freeze, or
no freeze at all, I would opt for no freeze, as implementing your freeze
on user-defined classes would be a pain in the ass, not to mention
implementing them in C code would be more than I would care to do, and
more than I would ask any of the core developers to work on.


  Even without validation, there are examples that force a high number of
  calls, which are not O(1), ammortized or otherwise.
 
 [Snap - a very interesting example]
 
  Now, the actual time analysis on repeated freezings and such gets ugly.
  There are actually O(k) objects, which take up O(k**2) space.  When you
  modify object b[i][j] (which has just been frozen), you get O(k)
  callbacks, and when you call freeze(b), it actually results in O(k**2)
  time to re-copy the O(k**2) pointers to the O(k) objects.  It should be
  obvious that this IS NOT AMMORTIZABLE to original object creation time.
 
 That's absolutely right. My ammortized analysis is correct only if you
 limit yourself to cases in which the original object doesn't change
 after a frozen() call was made. In that case, it's ok to count the
 O(k**2) copy with the O(k**2) object creation, because it's made only
 once.

But here's the crucial observation which you are missing.  You yourself
have stated that in both your table and graph examples you want your
application to continue to modify 

Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Noam Raphael
On 11/1/05, Josiah Carlson [EMAIL PROTECTED] wrote:
...

 I am an advocate for PEP 351.  However, I am against your proposed
 implementation/variant of PEP 351 because I don't believe it ads enough
 to warrant the additional complication and overhead necessary for every
 object (even tuples would need to get a .frozen_cache member).

 Give me a recursive freeze from PEP 351 (which handles objects that are
 duplicated, but errors out on circular references), and I'll be happy.

That's fine - but it doesn't mean that I must be happy with it.

...
 
  This isn't correct - freezing a set won't require a single copy to be
  performed, as long as the frozen copy isn't saved after the original
  is changed. Copy+cache always requires one copy.

 You are wrong, and you even say you are wrong...freezing a set doesn't
 require a COPY, IF the frozen COPY isn't saved after the original is
 CHANGED. Creating an immutable set IS CREATING A COPY, so it ALSO
 copies, and you admit as much, but then say the equivalent of copying
 isn't copying because I say so.

No, I am not wrong. I am just using misleading terms. I will call a
frozen copy a frozen image. Here it goes: freezing a set doesn't
require a COPY, IF the frozen IMAGE isn't saved after the original is
CHANGED. I suggest that there would be a way to create a frozenset
without COPYING an O(n) amount of MEMORY. When a frozen set is created
by a call frozen(x), it would not copy all the data, but would rather
reference the existing data, which was created by the non-frozen set.
Only if the original set changes, when there's a frozen set
referencing the data, the MEMORY would be actually copied.

I call it a frozen copy because it behaves as a frozen copy, even
though not all the memory is being copied. When you call the COPY
function in the COPY module with a string, it doesn't really copy
memory - the same string is returned. When you copy a file inside
subversion, it doesn't actually copy all the data associated with it,
but does something smarter, which takes O(1). The point is, for the
user, it's a copy. Whether or not memory is actually being copied, is
an implementation detail.

...

 I think that adding an additional attribute to literally every single
 object to handle the caching of 'frozen' objects, as well as a list to
 every object to handle callbacks which should be called on object
 mutation, along with a _call_stuff_when_mutated() method that handles
 these callback calls, IN ADDITION TO the __freeze__ method which is
 necessary to support this, is a little much, AND IS CERTAINLY NOT A
 SIMPLIFICATION!

I don't agree. You don't need to add a list to every object, since you
can store all those relations in one place, with a standard function
for registering them. Anyway, code written in Python (which is the
language we are discussing) WON'T BE COMPLICATED! The frozen
mechanism, along with two new protocols (__frozen__ and __changed__),
would be added automatically! The internal state of a class written in
Python can be automatically frozen, since it's basically a dict. Now
let's see if it's a simplification:

1. No Python code would have to be made more complicated because of the change.
2. There would be no need to find workarounds, like cStringIO, for the
fact that strings and tuples are immutable.
3. You would be able to put any kind of object into a set, or use it
as a dict key.
4. Classes (like the graph example) would be able to give users things
without having to make a choice between risking their users with
strange bugs, making a complicated interface, making very inefficient
methods, and writing complicated wrapper classes.

I will ask you: Is this a complication?
The answer is: it requires a significent change of the CPython
implementation. But about the Python language: it's definitely a
simplification.

 Let us pause for a second and consider:
 Original PEP proposed 1 new method: __freeze__, which could be
 implemented as a subclass of the original object (now), and integrated
 into the original classes as time goes on.  One could /register/
 __freeze__ functions/methods a'la Pickle, at which point objects
 wouldn't even need a native freeze method.

 Your suggestion offers 2 new methods along with 2 new instance variables.
 Let's see, a callback handler, __freeze__, the cache, and the callback
 list.  Doesn't that seem a little excessive to you to support freezing?
 It does to me.  If Guido were to offer your implementation of freeze, or
 no freeze at all, I would opt for no freeze, as implementing your freeze
 on user-defined classes would be a pain in the ass, not to mention
 implementing them in C code would be more than I would care to do, and
 more than I would ask any of the core developers to work on.

As I said above: this suggestion would certainly require more change
in the Python implementation than your suggestion. But the Python
language would gain a lot more. Implementing my frozen on user-defined
classes would not be a pain 

Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Josiah Carlson
 That's fine. I wish that you read my answer, think about it a little,
 and just tell me in a yes or a no if you still consider it dead. I
 think that I have answered all your questions, and I hope that at
 least others would be convinced by them, and that at the end my
 suggestion would be accepted.

I still consider it dead.
If the implementation is hard to explain, it's a bad idea.

Also, not all user-defined classes have a __dict__, and not all
user-defined classes can have arbitrary attributes added to them.

c class foo(object):
... __slots__ = ['lst']
... def __init__(self):
... self.lst = []
...
 a = foo()
 a.bar = 1
Traceback (most recent call last):
  File stdin, line 1, in ?
AttributeError: 'foo' object has no attribute 'bar'
 

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Noam Raphael
On 11/1/05, Josiah Carlson [EMAIL PROTECTED] wrote:
...

 I still consider it dead.
If the implementation is hard to explain, it's a bad idea.

It is sometimes true, but not always. It may mean two other things:
1. The one trying to explain is not talented enough.
2. The implementation is really not very simple. A hash table, used so
widely in Python, is really not a simple idea, and it's not that easy
to explain.

 Also, not all user-defined classes have a __dict__, and not all
 user-defined classes can have arbitrary attributes added to them.

 c class foo(object):
 ... __slots__ = ['lst']
 ... def __init__(self):
 ... self.lst = []
 ...
  a = foo()
  a.bar = 1
 Traceback (most recent call last):
  File stdin, line 1, in ?
 AttributeError: 'foo' object has no attribute 'bar'
 
It doesn't matter. It only means that the implementation would have to
make frozen copies also of __slots__ items, when freezing a
user-defined class.

I am afraid that this question proves that I didn't convey my idea to
you. If you like, please forgive my inability to explain it clearly,
and try again to understand my idea, by going over what I wrote again,
and thinking on it. You can also wait for the PEP that I intend to
write. And you can also forget about it, if you don't want to bother
with it - you've already helped a lot.

Noam
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com