Re: [Python-Dev] New methods for weakref.Weak*Dictionary types

2006-05-10 Thread Armin Rigo
Hi Tim,

On Mon, May 01, 2006 at 04:57:06PM -0400, Tim Peters wrote:
 
 # Return a list of weakrefs to all the objects in the collection.
 # Because a weak dict is used internally, iteration is dicey (the
 # underlying dict may change size during iteration, due to gc or
 # activity from other threads).

But then, isn't the real problem the fact that applications cannot
safely iterate over weak dicts?  This fact could be viewed as a bug, and
fixed without API changes.  For example, I can imagine returning to the
client an iterator that locks the dictionary.  Upon exhaustion, or via
the __del__ of the iterator, or even in the 'finally:' part of the
generator if that's how iteration is implemented, the dict is unlocked.
Here locking means that weakrefs going away during this time are not
eagerly removed from the dict; they will be removed only when the dict
is unlocked.
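
A minimal sketch of the "locking" idea described above, written for current Python 3. `DeferredRemovalDict`, `Node`, and every attribute name are hypothetical and exist only for illustration; this is not how any released weakref class is claimed to work, and the plain integer counter is not thread-safe. It relies on CPython running weakref callbacks as soon as the last strong reference disappears.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name


class DeferredRemovalDict:
    """Hypothetical weak-valued mapping; not thread-safe."""

    def __init__(self):
        self._data = {}       # key -> weakref to value
        self._pending = []    # (key, weakref) pairs that died while locked
        self._iterators = 0   # number of iterators currently running

    def __setitem__(self, key, value):
        owner = weakref.ref(self)  # avoid a strong cycle through the callback

        def on_death(wr):
            d = owner()
            if d is None:
                return
            if d._iterators:
                d._pending.append((key, wr))   # locked: remember for later
            else:
                d._discard(key, wr)

        self._data[key] = weakref.ref(value, on_death)

    def _discard(self, key, wr):
        # Only delete if the slot still holds this exact dead weakref.
        if self._data.get(key) is wr:
            del self._data[key]

    def __len__(self):
        return len(self._data)

    def itervalues(self):
        self._iterators += 1           # "lock" (taken on the first next())
        try:
            for wr in self._data.values():
                obj = wr()
                if obj is not None:
                    yield obj
        finally:
            self._iterators -= 1       # "unlock" on exhaustion or close()
            if self._iterators == 0:
                while self._pending:
                    self._discard(*self._pending.pop())


d = DeferredRemovalDict()
a, b, c = Node("a"), Node("b"), Node("c")
d["a"], d["b"], d["c"] = a, b, c

it = d.itervalues()
names = [next(it).name]     # iteration has started, so the dict is locked
del b                       # "b" dies mid-iteration
gc.collect()
size_during = len(d)        # the dead entry is still present
names += [node.name for node in it]
size_after = len(d)         # purged once the iterator finished
```

The underlying dict keeps its size while the generator is suspended, so the iterator never sees a resize caused by a dying referent; the dead entry is skipped during iteration and removed in the `finally:` clause.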


A bientot,

Armin.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] New methods for weakref.Weak*Dictionary types

2006-05-10 Thread Tim Peters
[Tim Peters]
 
 # Return a list of weakrefs to all the objects in the collection.
 # Because a weak dict is used internally, iteration is dicey (the
 # underlying dict may change size during iteration, due to gc or
 # activity from other threads).

[Armin Rigo]
 But then, isn't the real problem the fact that applications cannot
 safely iterate over weak dicts?

Well, in the presence of threads, iterating over any dict (weak or
not) may blow up:  while some thread is iterating over the dict, other
threads can change the size of the dict, and then the dict iterator
will blow up the next time it's invoked.  In the context from which
that comment was extracted, that's a potential problem (which is why
the comment mentions activity from other threads :-)).

 This fact could be viewed as a bug, and fixed without API changes.
 For example, I can imagine returning to the client an iterator that
 locks the dictionary.  Upon exhaustion, or via the __del__ of the
 iterator, or even in the 'finally:' part of the generator if that's
 how iteration is implemented, the dict is unlocked.

 Here locking means that weakrefs going away during this time are not
 eagerly removed from the dict; they will be removed only when the dict
 is unlocked.

That could remove one source of potential iteration surprises unique
to weak dicts, due to magical removal of dict entries (note that it
would probably need a thread-safe count of outstanding iterators, and
not unlock the dict until the count fell to 0).  Other threads could
still change the dict's size _seemingly_ by magic (from the dict
iterator's POV).  I don't know whether fixing magical weak-dict
removal without fixing seemingly magical weak-dict removal or
addition via other threads would be worth the bother.  Anyone burned
by either now has learned to avoid the iter{keys,values,items}()
methods.

Without more support in the dict implementation (and support that
would probably be difficult to add), the only thoroughly safe strategy
is to atomically materialize a hidden collection of the keys, values,
or items to be iterated over, and have the iterator march over those. 
In effect, apps do that themselves now by iterating over
.keys()/values()/items() instead of their .iterXYZ() versions.  Many
apps can get away without that, though, so there's value in keeping
the current obvious dict iterators.
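
To illustrate the snapshot strategy in current Python 3 terms: `keys()`, `values()`, and `items()` now return live views, so the equivalent of the old list-returning methods is an explicit `list(...)`. The sketch below uses a plain dict and a same-thread mutation as a stand-in for a change made by another thread or by gc; the names `registry`, `live_error`, and `visited` are made up for the example.

```python
# Iterating the live dict while its size changes fails on the next step.
registry = {"a": 1, "b": 2, "c": 3}
try:
    for key in registry:
        registry.pop("c", None)      # stand-in for another thread's change
    live_error = None
except RuntimeError as exc:
    live_error = str(exc)

# Iterating a materialized snapshot of the keys is unaffected.
registry = {"a": 1, "b": 2, "c": 3}
visited = []
for key in list(registry):           # snapshot taken before the loop starts
    visited.append(key)
    registry.pop("c", None)
```

The snapshot loop still visits "c" even though it has been removed from `registry` by then, so code using this pattern has to tolerate stale keys.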


[Python-Dev] New methods for weakref.Weak*Dictionary types

2006-05-01 Thread Fred L. Drake, Jr.
I'd like to commit this for Python 2.5:

http://www.python.org/sf/1479988

The WeakKeyDictionary and WeakValueDictionary don't
provide any API to get just the weakrefs out, instead
of the usual mapping API. This can be desirable when
you want to get a list of everything without creating
new references to the underlying objects at that moment.

This patch adds methods to make the references
themselves accessible using the API, avoiding requiring
client code to have to depend on the implementation.
The WeakKeyDictionary gains the .iterkeyrefs() and
.keyrefs() methods, and the WeakValueDictionary gains
the .itervaluerefs() and .valuerefs() methods.
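
For readers on current Python 3, where `WeakValueDictionary.valuerefs()` and `WeakKeyDictionary.keyrefs()` are available, a short usage sketch follows. The `Record` class and all variable names are assumptions made for the example, and the immediate cleanup it shows depends on CPython's reference counting.

```python
import gc
import weakref


class Record:
    def __init__(self, name):
        self.name = name


strong = [Record("a"), Record("b"), Record("c")]

by_name = weakref.WeakValueDictionary()
for rec in strong:
    by_name[rec.name] = rec
del rec  # the loop variable would otherwise keep the last Record alive

owners = weakref.WeakKeyDictionary()
owners[strong[0]] = "alice"

value_refs = by_name.valuerefs()   # weakrefs to the values
key_refs = owners.keyrefs()        # weakrefs to the keys

strong.pop()      # drop the only strong reference to Record "c"
gc.collect()

# A referent may be gone by the time a ref is used, so check each call.
live_names = sorted(
    obj.name for obj in (ref() for ref in value_refs) if obj is not None
)
first_key = key_refs[0]()
```

Holding `value_refs` does not keep Record "c" alive: its entry leaves `by_name`, and the corresponding weakref in the list simply returns `None` when called.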

The patch includes tests and docs.


  -Fred

-- 
Fred L. Drake, Jr.   fdrake at acm.org


Re: [Python-Dev] New methods for weakref.Weak*Dictionary types

2006-05-01 Thread Tim Peters
[Fred L. Drake, Jr.]
 I'd like to commit this for Python 2.5:

 http://www.python.org/sf/1479988

 The WeakKeyDictionary and WeakValueDictionary don't
 provide any API to get just the weakrefs out, instead
 of the usual mapping API. This can be desirable when
 you want to get a list of everything without creating
 new references to the underlying objects at that moment.

 This patch adds methods to make the references
 themselves accessible using the API, avoiding requiring
 client code to have to depend on the implementation.
 The WeakKeyDictionary gains the .iterkeyrefs() and
 .keyrefs() methods, and the WeakValueDictionary gains
 the .itervaluerefs() and .valuerefs() methods.

 The patch includes tests and docs.

+1.  A real need for this is explained in ZODB's ZODB/util.py's
WeakSet class, which contains a WeakValueDictionary:


# Return a list of weakrefs to all the objects in the collection.
# Because a weak dict is used internally, iteration is dicey (the
# underlying dict may change size during iteration, due to gc or
# activity from other threads).  as_weakref_list() is safe.
#
# Something like this should really be a method of Python's weak dicts.
# If we invoke self.data.values() instead, we get back a list of live
# objects instead of weakrefs.  If gc occurs while this list is alive,
# all the objects move to an older generation (because they're strongly
# referenced by the list!).  They can't get collected then, until a
# less frequent collection of the older generation.  Before then, if we
# invoke self.data.values() again, they're still alive, and if gc occurs
# while that list is alive they're all moved to yet an older generation.
# And so on.  Stress tests showed that it was easy to get into a state
# where a WeakSet grows without bounds, despite that almost all its
# elements are actually trash.  By returning a list of weakrefs instead,
# we avoid that, although the decision to use weakrefs is now very
# visible to our clients.
def as_weakref_list(self):
    # We're cheating by breaking into the internals of Python's
    # WeakValueDictionary here (accessing its .data attribute).
    return self.data.data.values()


As that implementation suggests, though, I'm not sure there's real
payback for the extra time taken in the patch's `valuerefs`
implementation to weed out weakrefs whose referents are already gone: 
the caller has to make this check anyway when it iterates over the
returned list of weakrefs.  Iterating inside the implementation, to
build the list via itervalues(), also creates that much more
vulnerability to "dict changed size during iteration" multi-threading
surprises.  For that last reason, if the patch went in as-is, I expect
ZODB would still need to cheat; obtaining the list of weakrefs
directly via plain .data.values() is atomic, and so immune to these
multi-threading surprises.
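
The pinning effect that the ZODB comment describes can be reproduced in a few lines of current Python 3. This is only a sketch: `Item`, `pool`, and the other names are invented for the example, and it shows the strong-reference side of the problem, not the generational promotion itself.

```python
import gc
import weakref


class Item:
    pass


pool = weakref.WeakValueDictionary()
owners = [Item() for _ in range(3)]
for index, item in enumerate(owners):
    pool[index] = item
del item                      # the loop variable would keep the last Item alive

strong_list = list(pool.values())   # live objects: new strong references
weak_list = pool.valuerefs()        # weakrefs only

owners.clear()                # the application is done with the objects
gc.collect()
pinned = len(pool)            # still populated: strong_list pins every Item

del strong_list
gc.collect()
released = len(pool)          # weak_list alone keeps nothing alive
dead = sum(1 for ref in weak_list if ref() is None)
```

While `strong_list` exists, every entry survives a collection; once it is dropped, the weak dict empties even though `weak_list` is still held, and each weakref in it reports a dead referent.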


Re: [Python-Dev] New methods for weakref.Weak*Dictionary types

2006-05-01 Thread Fred L. Drake, Jr.
On Monday 01 May 2006 16:57, Tim Peters wrote:
  +1.  A real need for this is explained in ZODB's ZODB/util.py's
  WeakSet class, which contains a WeakValueDictionary:
...
  As that implementation suggests, though, I'm not sure there's real
  payback for the extra time taken in the patch's `valuerefs`
  implementation to weed out weakrefs whose referents are already gone:
  the caller has to make this check anyway when it iterates over the

Good point; I've updated the patch accordingly.


  -Fred

-- 
Fred L. Drake, Jr.   fdrake at acm.org