Serhiy Storchaka added the comment:
If you want to decrease the number of executed instructions, I suggest
duplicating the lookup functions (actually, adding a boolean flag and switching
between two branches). On reads the lookup function should ignore dummy
entries; on modification it should behave as it does now.
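[Editor's note: the suggested split can be sketched in pure Python. This is purely illustrative — the real lookkey routines live in C in Objects/setobject.c, and the names, linear probing, and table layout here are invented for the sketch.]

```python
# Sketch of the proposed split: a read-only lookup that skips dummy
# (deleted) entries entirely, and a write lookup that remembers the
# first dummy slot so it can be reused. All names are invented.

DUMMY = object()   # marker left behind by deletions
EMPTY = None

def lookup_read(table, key):
    """Find key for membership tests; dummies are simply skipped."""
    mask = len(table) - 1
    i = hash(key) & mask
    while True:
        entry = table[i]
        if entry is EMPTY:
            return None                    # not present
        if entry is not DUMMY and entry == key:
            return i                       # found at slot i
        i = (i + 1) & mask                 # linear probe for simplicity

def lookup_write(table, key):
    """Find a slot for insertion; reuse the first dummy seen, as now."""
    mask = len(table) - 1
    i = hash(key) & mask
    freeslot = None
    while True:
        entry = table[i]
        if entry is EMPTY:
            return i if freeslot is None else freeslot
        if entry is DUMMY:
            if freeslot is None:
                freeslot = i               # remember first reusable slot
        elif entry == key:
            return i                       # key already present
        i = (i + 1) & mask
```

The point of the split is that the read path carries no `freeslot` bookkeeping at all, so membership tests pay nothing for deletion support.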
Serhiy Storchaka added the comment:
In some cases the slowdown is over 100x.
$ ./python -m timeit -s "s = set(range(10**4))" "s.discard(0); s.add(0)"
Unpatched: 100 loops, best of 3: 1.68 usec per loop
Patched: 1 loops, best of 3: 183 usec per loop
--
Raymond Hettinger added the comment:
I'll just close this one. There seems to be little interest in it.
Originally, I characterized freeslot tracking as an optimization that mildly
benefited an uncommon case (adds/deletes/more-adds) at the expense of the
common cases (uniquification, ...).
Changes by Raymond Hettinger <raymond.hettin...@gmail.com>:
--
Removed message: http://bugs.python.org/msg234210
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23259
___
Serhiy Storchaka added the comment:
This can be a common case in the following algorithm (mesh optimization):

    while elems:
        elem = elems.pop()
        changed = optimize(elem)
        if changed:
            elems.update(neighbors(elem))
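[Editor's note: a self-contained toy version of that worklist pattern, with an invented graph and an invented stand-in for optimize(). Popping an element deletes it from the set, and re-adding its neighbors is exactly the delete-then-add churn discussed in this issue.]

```python
# Toy worklist: pop() leaves a dummy entry in the set's table, and
# update() may re-add a key that was just deleted, reusing that slot.
# The graph and the "optimize" rule below are invented for this sketch.

graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
values = {n: n * 10 for n in graph}

elems = set(graph)
while elems:
    elem = elems.pop()                 # delete: leaves a dummy entry
    changed = values[elem] > elem      # stand-in for optimize(elem)
    if changed:
        values[elem] = elem
        elems.update(graph[elem])      # re-add: may reuse dummy slots
```

Each element "changes" at most once here, so the loop terminates; what matters for this issue is only the discard/add traffic on `elems`.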
--
nosy: +serhiy.storchaka
New submission from Raymond Hettinger:
The lookkey routines in Objects/setobject.c have logic to track the first open
freeslot during a search. The benefit is that new keys can reuse previously
deleted slots. The benefit only occurs in cases where keys are added, then
some removed, and then more keys added.
Raymond Hettinger added the comment:
Even in the mesh algorithm, we let resizing periodically clean up the dummies.
The idea is to not pay the freeslot tracking cost on every lookup and instead
only clean up periodically (which would likely give better performance for the
mesh algorithm as well).
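[Editor's note: that cleanup-on-resize idea can be sketched in pure Python. The class, thresholds, and probing scheme are invented for illustration; CPython's actual resize logic is in C. Adds here never reuse dummy slots, and a periodic rebuild drops every dummy in one pass.]

```python
# Illustration of cleanup-on-resize: deletions leave dummies, lookups do
# no freeslot bookkeeping, and a rebuild (triggered by a 2/3 fill check)
# drops all dummies at once. Names and thresholds are invented.

DUMMY = object()

class ToyTable:
    def __init__(self, size=8):
        self.slots = [None] * size
        self.used = 0        # live keys
        self.fill = 0        # live keys + dummies

    def _index(self, slots, key):
        # Probe past dummies; stop at an empty slot or the key itself.
        mask = len(slots) - 1
        i = hash(key) & mask
        while slots[i] is not None and (slots[i] is DUMMY or slots[i] != key):
            i = (i + 1) & mask
        return i

    def add(self, key):
        i = self._index(self.slots, key)   # never reuses a dummy slot
        if self.slots[i] is None:
            self.slots[i] = key
            self.used += 1
            self.fill += 1
            if self.fill * 3 >= len(self.slots) * 2:
                self._resize()

    def discard(self, key):
        mask = len(self.slots) - 1
        i = hash(key) & mask
        while self.slots[i] is not None:
            if self.slots[i] is not DUMMY and self.slots[i] == key:
                self.slots[i] = DUMMY      # leave a dummy; fill unchanged
                self.used -= 1
                return
            i = (i + 1) & mask

    def _resize(self):
        # One pass drops every dummy: this is the periodic cleanup.
        new = [None] * (len(self.slots) * 2)
        for entry in self.slots:
            if entry is not None and entry is not DUMMY:
                new[self._index(new, entry)] = entry
        self.slots = new
        self.fill = self.used
```

Under delete/add churn the table accumulates dummies until the fill check fires, then a single rebuild restores it — amortizing the cleanup instead of paying for freeslot tracking on every probe.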
Antoine Pitrou added the comment:
In short, it looks like the freeslot idea was a net negative -- it
optimized an uncommon case at the cost of slowing and complicating the
common cases.
Do you have a benchmark showing the slowdown?
--
nosy: +pitrou
Raymond Hettinger added the comment:
I've observed that the generated assembly has fewer instructions on the
critical path and that a register was freed up. That's enough for me in this
case (it's too easy to get trapped in local minima when timing small changes
like this).
Do either of you have
Antoine Pitrou added the comment:
Fewer instructions don't necessarily translate into better performance. The
bottleneck in superscalar, pipelined processors is often the data-dependency
chain between instructions. Adding instructions may well not slow anything
down, if those instructions don't lengthen that dependency chain.