[issue28183] Clean up and speed up dict iteration

2017-03-31 Thread Donald Stufft

Changes by Donald Stufft :


--
pull_requests: +852

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-10-09 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Few weeks I tried to rewrite loops in more fast way, but I have not found 
anything significantly faster or clearer. Some forms of was a few percent 
faster, but I would be difficult to explain this effect. Most likely it was an 
artifact of the compilation.

Removed unrelated changes from dict_iter8.patch and added additional check in 
_PyDict_Next() before committing.

--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-10-09 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 112714f3745d by Serhiy Storchaka in branch '3.6':
Issue #28183: Optimize and cleanup dict iteration.
https://hg.python.org/cpython/rev/112714f3745d

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-10-01 Thread INADA Naoki

INADA Naoki added the comment:

dict_iter8.patch is based on dict_iter3.patch.  Added some comments and fixing 
indents.
No change about _PyDict_Next API.

--
Added file: http://bugs.python.org/file44921/dict_iter8.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-30 Thread INADA Naoki

INADA Naoki added the comment:

Refactoring includes replacing PyDict_Next with _PyDict_Next.
It can improve performance by skipping some checks.
But it would be too small to see difference from benchmark.

I'll reduce diff size, hopefully in this weekend.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-29 Thread STINNER Victor

STINNER Victor added the comment:

Serhiy Storchaka added the comment:
> It looks to me that the patch contain unneeded changes. Some changes don't 
> affect performance, others don't look well justified. Are there benchmarks 
> that confirm the benefit of these changes?

Hum, since I expect disagreements on the changes, I propose to split
the patch into two parts: minimum change to optimize, and a second
change (later, once the first one is merged) to cleanup further the
code.

"Some changes don't affect performance"

Well, a cleanup is supposed to not affect performances at all :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

It looks to me that the patch contain unneeded changes. Some changes don't 
affect performance, others don't look well justified. Are there benchmarks that 
confirm the benefit of these changes?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Please wait until I merge my changes.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-29 Thread STINNER Victor

STINNER Victor added the comment:

dict_iter7.patch LGTM, go ahead. Just dont forget to mention Serhiy Storchaka 
as the co-author ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-29 Thread INADA Naoki

Changes by INADA Naoki :


Added file: http://bugs.python.org/file44879/dict_iter7.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-29 Thread INADA Naoki

INADA Naoki added the comment:

recreate patch with different option, since Rietveld doesn't accept 
dict_iter5.patch

--
Added file: http://bugs.python.org/file44876/dict_iter6.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-28 Thread INADA Naoki

Changes by INADA Naoki :


Added file: http://bugs.python.org/file44874/dict_iter5.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-28 Thread STINNER Victor

STINNER Victor added the comment:

I reviewed dict_iter4.patch on Rietveld.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-28 Thread INADA Naoki

INADA Naoki added the comment:

Draft Misc/NEWS entry (and commit message) is:

 Core and Builtins
 -

+- Issue #28183: Optimize dict iteration.
+

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-28 Thread INADA Naoki

Changes by INADA Naoki :


--
versions: +Python 3.7
Added file: http://bugs.python.org/file44864/dict_iter4.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-28 Thread INADA Naoki

INADA Naoki added the comment:

Updated with small refactoring.
Confirmed passes quicktest and no warning from clang on OS X.

--
Added file: http://bugs.python.org/file44857/dict_iter3.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-27 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Feel free to do this. I'm not such busy, I'm experimenting with other variants.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-27 Thread INADA Naoki

INADA Naoki added the comment:

Serhiy, may I update your patch, if you're busy in this week?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Moved dictresize() discussion to http://bugs.python.org/issue28199

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Raymond, I think this is different issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Here is a rough (very rough and not compileable) sketch of the direction that 
resizing should take:

static void
insert_indices_clean(PyDictObject *mp, Py_hash_t hash)
{
size_t i, perturb;
PyDictKeysObject *k = mp->ma_keys;
size_t mask = (size_t)DK_SIZE(k)-1;

i = hash & mask;
for (perturb = hash; dk_get_index(k, i) != DKIX_EMPTY;
 perturb >>= PERTURB_SHIFT) {
i = mask & ((i << 2) + i + perturb + 1);
}
dk_set_index(k, i, k->dk_nentries);
}

static int
dictresize(PyDictObject *mp, Py_ssize_t minused)
{
Py_ssize_t i, newsize;
PyDictKeyEntry *ep0;

/* Find the smallest table size > minused. */
for (newsize = PyDict_MINSIZE;
 newsize <= minused && newsize > 0;
 newsize <<= 1)
;
if (newsize <= 0) {
PyErr_NoMemory();
return -1;
}

realloc(dk->dk_indicies, es * newsize);
memset(>dk_indices.as_1[0], 0xff, es * size);
dk->dk_size = size;

for (i = 0; i < mp->dk_nentries; i++) {
PyDictKeyEntry *ep = [i];
if (ep->me_value != NULL) {
insert_indices_clean(mp, ep->me_hash);
}
}
return 0;
}

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Actually most of optimization is not specific for new dict implementation.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

We can still clean this up for Python 3.6.  We're in feature freeze, not 
development freeze.  The compact dict patch was very rough when it landed and 
is expected to continue to be polished so that the expected benefits are 
realized.

--
versions: +Python 3.6 -Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
assignee:  -> serhiy.storchaka
versions:  -Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

> Raymond: With such implementation keys, values and hashes are all
> organized together and there seems no _resize operation can only 
> adjust hashes without breaking the entire layout.

None of those needs to change during a resize.  Only indices array needs to be 
rebuilt.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread INADA Naoki

INADA Naoki added the comment:

LGTM, thanks.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-18 Thread Xiang Zhang

Xiang Zhang added the comment:

Serhiy: Patch LGTM except two trivial comments on Rietveld.

Raymond: With such implementation keys, values and hashes are all organized 
together and there seems no _resize operation can only adjust hashes without 
breaking the entire layout.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Also, please take a look at resizing.  It looks like it is doing way too much 
work.  The original keys, values, and hashes shouldn't move at all, only the 
indices array needs to updated.

Here is the pure python version from the original proof-of-concept at 
https://code.activestate.com/recipes/578375

def _resize(self, n):
'''Reindex the existing hash/key/value entries.
   Entries do not get moved, they only get new indices.
   No calls are made to hash() or __eq__().

'''
n = 2 ** n.bit_length() # round-up to power-of-two
self.indices = self._make_index(n)
for index, hashvalue in enumerate(self.hashlist):
for i in Dict._gen_probes(hashvalue, n-1):
if self.indices[i] == FREE:
break
self.indices[i] = index
self.filled = self.used

Likewise, it looks like there is room for improvement in dict_copy().  It can 
memcpy() all four arrays and then incref all the keys.  That should be 
considerably faster than zeroing new arrays and applying re-insertion logic.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Yes, I missed seeing that.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Dict iterators already have __length_hint__. Dict views have __len__.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-17 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Also consider adding __length_hint__ to the various mapping views.  That would 
help with a common case of listing the view.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Good catch Xiang Zhang! I missed this point. Here is fixed patch.

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)"
Python 3.5:Median +- std dev: 33.8 ms +- 0.7 ms
Python 3.6 unpatched:  Median +- std dev: 37.8 ms +- 0.5 ms
Python 3.6 patched:Median +- std dev: 34.0 ms +- 0.6 ms

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6)); v = d.values()" 
-- "list(v)"
Python 3.5:Median +- std dev: 26.2 ms +- 0.7 ms
Python 3.6 unpatched:  Median +- std dev: 28.0 ms +- 0.6 ms
Python 3.6 patched:Median +- std dev: 26.1 ms +- 0.8 ms

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6)); v = d.items()" 
-- "list(v)"
Python 3.5:Median +- std dev: 232 ms +- 6 ms
Python 3.6 unpatched:  Median +- std dev: 259 ms +- 6 ms
Python 3.6 patched:Median +- std dev: 249 ms +- 6 ms

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "set(d)"
Python 3.5:Median +- std dev: 68.3 ms +- 1.8 ms
Python 3.6 unpatched:  Median +- std dev: 68.1 ms +- 2.5 ms
Python 3.6 patched:Median +- std dev: 63.7 ms +- 1.5 ms

$ ./python -m perf timeit -s "from _testcapi import test_dict_iteration as t" 
-- "t()"
Python 3.5:Median +- std dev: 3.31 ms +- 0.10 ms
Python 3.6 unpatched:  Median +- std dev: 3.51 ms +- 0.09 ms
Python 3.6 patched:Median +- std dev: 3.43 ms +- 0.05 ms

--
Added file: http://bugs.python.org/file44719/dict_iter2.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-17 Thread Xiang Zhang

Xiang Zhang added the comment:

Serhiy, I don't understand your patch. You delete the logic about split table 
and then iterating split table actually fails.

>>> class C:
... pass
... 
>>> a, b = C(), C()
>>> a.a, a.b = 1, 2
>>> list(b.__dict__)
['a', 'b']
>>> 

Without attributes now b also get entries when iterating.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-16 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +methane, xiang.zhang

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28183] Clean up and speed up dict iteration

2016-09-16 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

In some circumstances iterating dict under 3.6 can be 20% slower than under 3.5.

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "list(d)"

Python 3.5: Median +- std dev: 33.8 ms +- 0.7 ms
Python 3.6: Median +- std dev: 37.8 ms +- 0.5 ms

Seems this is compiler and platform specific, it is reproducible only with GCC 
on 32 bit.
 
Proposed patch restores 3.5 performance and simplifies the code.

Python 3.6 patched: Median +- std dev: 33.7 ms +- 0.7 ms

Other types of iteration:

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6)); v = d.values()" 
-- "list(v)"
Python 3.5:Median +- std dev: 26.2 ms +- 0.7 ms
Python 3.6 unpatched:  Median +- std dev: 28.0 ms +- 0.6 ms
Python 3.6 patched:Median +- std dev: 26.3 ms +- 1.1 ms

$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6)); v = d.items()" 
-- "list(v)"
Python 3.5:Median +- std dev: 232 ms +- 6 ms
Python 3.6 unpatched:  Median +- std dev: 259 ms +- 6 ms
Python 3.6 patched:Median +- std dev: 243 ms +- 9 ms

_PyDict_Next():
$ ./python -m perf timeit -s "d = dict.fromkeys(range(10**6))" -- "set(d)"
Python 3.5:Median +- std dev: 68.3 ms +- 1.8 ms
Python 3.6 unpatched:  Median +- std dev: 68.1 ms +- 2.5 ms
Python 3.6 patched:Median +- std dev: 66.0 ms +- 1.2 ms

PyDict_Next():
$ ./python -m perf timeit -s "from _testcapi import test_dict_iteration as t" 
-- "t()"
Python 3.5:Median +- std dev: 3.31 ms +- 0.10 ms
Python 3.6 unpatched:  Median +- std dev: 3.51 ms +- 0.09 ms
Python 3.6 patched:Median +- std dev: 3.43 ms +- 0.09 ms

--
components: Interpreter Core
files: dict_iter.patch
keywords: patch
messages: 276755
nosy: haypo, ned.deily, rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Clean up and speed up dict iteration
type: performance
versions: Python 3.6, Python 3.7
Added file: http://bugs.python.org/file44697/dict_iter.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com