Raymond Hettinger <raymond.hettin...@gmail.com> added the comment:

> In former examples (list, ordered dict) objects are iterated and deleted 
>in order of creation. But in the set of strings items are deleted in
> non-predicable random order. I suppose the non-linear effect is 
> related to memory management.

That makes sense.  We're likely seeing an artifact of a favorable 
correspondence between the container pointer order and the order that the 
referred-to objects were created in memory in this somewhat non-representative 
test case.  That is why lists lose the favorable timing when the data is 
shuffled.

If the test bench creates all the referred-to objects in consecutive memory 
locations, then any container accessing those objects consecutively will 
benefit from spatial cache locality.  (i.e. If there is more than one datum per 
cache line, the second read is virtually free.  Likewise, memory controller 
read-ahead makes a long series of consecutive accesses cheaper.)

It would be nice is the timing were run on strings instead of integers.  Ints 
are somewhat weird in that the hashes of consecutive integers are themselves 
consecutive, you can fit two ints in one cache line. 

I'm not sure whether it applies here, but there may also be a branch-prediction 
effect as well:  
https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array

FWIW, the deallocation routine for sets isn't doing anything special.  It just 
calls Py_DECREF in a loop:

static void
set_dealloc(PySetObject *so)
{
    setentry *entry;
    Py_ssize_t used = so->used;

    /* bpo-31095: UnTrack is needed before calling any callbacks */
    PyObject_GC_UnTrack(so);
    Py_TRASHCAN_SAFE_BEGIN(so)
    if (so->weakreflist != NULL)
        PyObject_ClearWeakRefs((PyObject *) so);

    for (entry = so->table; used > 0; entry++) {
        if (entry->key && entry->key != dummy) {
                used--;
                Py_DECREF(entry->key);
        }
    }
    if (so->table != so->smalltable)
        PyMem_DEL(so->table);
    Py_TYPE(so)->tp_free(so);
    Py_TRASHCAN_SAFE_END(so)
}

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32846>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to