On Monday, August 27, 2012 2:17:45 PM UTC-5, Ian wrote:
> On Thu, Aug 23, 2012 at 10:49 AM, Aaron Brady <[email protected]> wrote:
>
> > The patch for the above is only 40-60 lines. However it introduces two new
> > concepts.
>
>
>
> Is there a link to the patch?
Please see below. It grew somewhat during development.
> > The first is a "linked list".
SNIP.
> > The second is "uncounted references". The uncounted references are
> > references to "set iterators" exclusively, exist only internally to "set"
> > objects, and are invisible to the rest of the program. The reason for the
> > exception is that iterators are unique in the Python Data Model; iterators
> > consist of a single immutable reference, unlike both immutable types such
> > as strings and numbers, as well as container types. Counted references
> > could be used instead, but would be consistently wasted work for the
> > garbage collector, though the benefit to programmers' peace of mind could
> > be significant.
>
> >
>
> > Please share your opinion! Do you agree that the internal list resolves
> > the inconsistency? Do you agree with the strategy? Do you agree that
> > uncounted references are justified to introduce, or are counted references
> > preferable?
>
>
>
> This feature is a hard sell as it is; I think that adding uncounted
>
> references into the mix is only going to make that worse. May I
>
> suggest an alternate approach? Internally tag each set or dict with a
>
> "version", which is just a C int. Every time the hash table is
>
> modified, increment the version. When an iterator is created, store
>
> the current version on the iterator. When the iterator is advanced,
>
> check that the iterator version matches the dict/set version. If
>
> they're not equal, raise an error.
>
>
>
> This should add less overhead than the linked list without any
>
> concerns about reference counting. It does introduce a small bug in
>
> that an error condition could be "missed", if the version is
>
> incremented a multiple of 2**32 or 2**64 times between iterations --
>
> but how often is that really likely to occur? Bearing in mind that
>
> this error is meant for debugging and not production error handling,
>
> you could even make the version a single byte and I'd still be fine
>
> with that.
>
>
>
> Cheers,
>
> Ian
Hi Ian,
We could use a Python long object for the version index to prevent overflow.
Combined with P. Rubin's idea to count the number of open iterators, most use
cases still wouldn't exceed a single word comparison; we could reset the
counter when there weren't any. Using the linked list collection, modification
operations are expensive in rare cases. Using the version index, iteration is
expensive in rare cases.
I was more interested in the linked list for conceptual reasons, so I developed
it further. Changelog, diff file, test suite, and links are below. The devs
should be aware that a competing patch might be developed. I would be pleased
to hear what everybody thinks of it!
Linked list with uncounted references implementation for Win32.
Added:
- 'set_clear_ex' and 'set_clear_internal_ex' methods, differ in invalidation
and conditional invalidation behavior and return type.. The 'set.clear()'
method and 'tp_clear' type field both called the same method.
- 'set_invalidate_iter_linked' method. Iterate over the iterators of a set,
mark them invalid, and clear the list.
- 'setiter_unlink_internal' method. Remove the iterator from the set's linked
list of iterators.
- 'IterationError', global.
- New fields:
-- PySetObject: setiterobject *iter_linked. Pointer to the first element of
the linked list of the iterators of the set.
-- setiterobject: setiterobject *linked_pred, *linked_succ. Predecessor and
successor nodes in the linked list of iterators of the same set.
-- setiterobject: char ob_valid. Validation status of the iterator.
- Result is compared with original in 'set_intersection_update' and '_multi' to
determine whether to invalidate the list of iterators. Asymptotic running time
is unchanged.
- Pending: add 'tp_clear' field to 'PySetIter_Type'?
- Test script included, 'large numbers' test pending.
6 files changed: { setobject.h, setobject.c, exceptions.c, pyerrors.h,
python3.def, python33stub.def }. Test script 'set_iterator_test.py' new.
Linked list interface and pseudocode 'patch_pseudocode.txt'.
Zip file:
http://home.comcast.net/~castironpi-misc/clpy-0062-set_iterator_patch.zip
Diff file of 3.3.0b2:
http://home.comcast.net/~castironpi-misc/clpy-0062-set_iterator_diff.txt
--
http://mail.python.org/mailman/listinfo/python-list