Tim Peters <t...@python.org> added the comment:

> Surprisingly, deleting a very large set takes much longer than creating it.
Luis, that's not surprising ;-)  When you create it, it's mostly the case that there's a vast chunk of raw memory from which many pieces are passed out in address order (to hold all the newly created Python objects).  Memory access is thus mostly sequential.

But when you delete it, that vast chunk of once-raw memory is visited in essentially random order (string hashes impose a pseudo-random order on where (pointers to) string objects are stored within a set's vector), defeating all the hardware features that greatly benefit from sequential access.

More precisely, the set's internal vector is visited sequentially during deletion, but the string objects the pointers point _at_ are all over the place.  Even if nothing is swapped to disk, it's likely that visiting a string object during deletion will miss on all cache levels, falling back to (much slower) RAM.  Note that all the string objects must be visited during set deletion, in order to decrement their reference counts.
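For anyone who wants to see the effect, here's a minimal timing sketch (not from the original report - the size, the helper name, and the choice of str(i) keys are all arbitrary, and at this size it will use on the order of a gigabyte of RAM):

    import time

    def build_and_drop(n=10_000_000):
        t0 = time.perf_counter()
        # The strings come out of the allocator roughly in address order.
        s = {str(i) for i in range(n)}
        t1 = time.perf_counter()
        # Teardown walks the set's table and decrefs (and so frees) each
        # string in hash-slot order, not allocation order.
        del s
        t2 = time.perf_counter()
        print(f"create: {t1 - t0:.2f}s   delete: {t2 - t1:.2f}s")

    build_and_drop()

How the two times compare depends on the machine, but the deletion side tends to look relatively worse once the set has outgrown the CPU caches, and far worse still if part of it has been pushed out to swap.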