Raymond Hettinger added the comment:
Maybe add the unicode specialization right in PyObject_RichCompareBool?
That's a possibility but it has much broader ramifications and might or might
not be the right thing to do. I'll leave that for someone else to pursue and
keep my sights on sets for
Changes by Raymond Hettinger raymond.hettin...@gmail.com:
--
resolution: -> fixed
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
Roundup Robot added the comment:
New changeset 20f54cdf351d by Raymond Hettinger in branch 'default':
Issue #23119: Simplify setobject by inlining the special case for unicode
equality testing.
https://hg.python.org/cpython/rev/20f54cdf351d
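
For readers unfamiliar with what "inlining the special case for unicode equality testing" means, here is a rough illustration in Python. It is a toy model only, not the C code from the changeset: the names entry_matches, toy_build and toy_contains are hypothetical, the table uses simple linear probing, and it assumes a power-of-two table size with fewer keys than slots. The point it tries to show is a single lookup routine with an exact-str fast path, instead of a separate lookup function selected through a pointer.

```python
def entry_matches(entry_key, entry_hash, key, key_hash):
    """Toy model of one probe step (hypothetical helper, not CPython code)."""
    if entry_key is key:
        return True                      # identity shortcut
    if entry_hash != key_hash:
        return False                     # different hashes cannot be equal keys
    # Inline special case: both sides are exact str objects, so compare
    # them directly instead of going through the generic comparison.
    if type(entry_key) is str and type(key) is str:
        return str.__eq__(entry_key, key)
    return entry_key == key              # generic fallback


def toy_build(keys, size=8):
    """Build a linear-probe table; assumes size is a power of two
    and len(keys) < size so an empty slot always exists."""
    table = [None] * size
    mask = size - 1
    for k in keys:
        h = hash(k)
        i = h & mask
        while table[i] is not None:
            i = (i + 1) & mask
        table[i] = (k, h)
    return table


def toy_contains(table, key):
    """Membership test over a table produced by toy_build."""
    key_hash = hash(key)
    mask = len(table) - 1
    i = key_hash & mask
    for _ in range(len(table)):
        slot = table[i]
        if slot is None:
            return False
        if entry_matches(slot[0], slot[1], key, key_hash):
            return True
        i = (i + 1) & mask
    return False
```

Calling toy_contains(toy_build(["html", "body"]), "html") exercises the string fast path, while a non-string key such as 42 falls through to the generic comparison.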
--
nosy: +python-dev
Serhiy Storchaka added the comment:
Maybe add the unicode specialization right in PyObject_RichCompareBool?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
Raymond Hettinger added the comment:
Thanks Mark. I appreciate the effort looking at use cases.
I'm reopening this one because I have an alternative patch that simplifies the
code but keeps the unicode specialization. It replaces the lookkey indirection
with a fast and predictable inline
Marc-Andre Lemburg added the comment:
On 09.01.2015 09:33, Raymond Hettinger wrote:
I'm withdrawing this one. After more work trying many timings on multiple
compilers and various sizes and kinds of datasets, it appears that the
unicode specialization is still worth it.
The cost of
Raymond Hettinger added the comment:
I'm withdrawing this one. After more work trying many timings on multiple
compilers and various sizes and kinds of datasets, it appears that the unicode
specialization is still worth it.
The cost of the lookup indirection appears to be completely
Serhiy Storchaka added the comment:
+1 for removing unicode specialization. Dictionaries with string keys are a part
of the language, but sets of strings are not special.
--
nosy: +serhiy.storchaka
___
Python tracker rep...@bugs.python.org
Marc-Andre Lemburg added the comment:
On 08.01.2015 15:46, Serhiy Storchaka wrote:
Sets of strings are very common when trying to create a unique set of
strings or optimizing name in set_of_names lookups.
This is not nearly so common as attributes or globals access, or passing
keyword
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
nosy: +ezio.melotti
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
Marc-Andre Lemburg added the comment:
I'm not sure I follow:
Sets of strings are very common when trying to create a unique set of strings
or optimizing name in set_of_names lookups.
Regarding your benchmark numbers: I have a hard time following how they work. A
simple word in
Serhiy Storchaka added the comment:
Sets of strings are very common when trying to create a unique set of strings
or optimizing name in set_of_names lookups.
This is not nearly so common as attributes or globals access, or passing
keyword arguments.
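
For context, the two set-of-strings use cases being weighed here look roughly like the sketch below. The function names unique_names and filter_known are made up for illustration and do not come from the thread.

```python
def unique_names(names):
    """Collapse an iterable of strings to its distinct values."""
    return set(names)


def filter_known(candidates, known_names):
    """Keep only candidates present in known_names.

    Passing a set for known_names makes each 'name in known_names'
    test a hash lookup rather than a linear scan.
    """
    return [name for name in candidates if name in known_names]
```

Both functions spend their time in set insertion and set membership on string keys, which is the path the unicode specialization targets.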
--
Ezio Melotti added the comment:
Without changeset information (not included in the git format) Rietveld will
try to apply the patch on default, and if it applies cleanly it will work, so
creating the patch against an up-to-date py3 clone should work even with the
git format.
--
STINNER Victor added the comment:
@Raymond: Please disable git format for patches, because Rietveld doesn't
support such patches and so we don't get the review button.
--
nosy: +haypo
___
Python tracker rep...@bugs.python.org
Raymond Hettinger added the comment:
Attaching an alternative patch that handles the unicode specific case with far
less code and less overhead. It seems to speed up all the timings I've tried.
It keeps the unicode_eq() specific path which bypasses several unneeded steps:
* an incref/decref
Raymond Hettinger added the comment:
Timings for no_special_hash.diff:
$ ~/cpython/python.exe -m timeit -r7 -s 's={"html"}' '"html" in s'
1000 loops, best of 7: 0.0315 usec per loop
$ ~/nounicode/python.exe -m timeit -r7 -s 's={"html"}' '"html" in s'
1000 loops, best of 7: 0.0336 usec per loop
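
A similar measurement can be scripted with the standard timeit module, which may be handy when comparing builds in a loop. This is only a sketch: it assumes the intended statement is a membership test of the string "html" in a one-element set, the helper name time_membership is hypothetical, and it times whichever interpreter runs it, so it has to be run once per build to compare them.

```python
import timeit


def time_membership(repeat=7, number=100_000):
    """Return the best per-loop time, in microseconds, of a string
    membership test against a one-element set."""
    timer = timeit.Timer('"html" in s', setup='s = {"html"}')
    # timer.repeat() returns one total time (in seconds) per run.
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e6


if __name__ == "__main__":
    print("best of 7: %.4f usec per loop" % time_membership())
```

Taking the minimum over the repeats mirrors the "best of 7" figure that the command-line tool reports.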
Changes by Raymond Hettinger raymond.hettin...@gmail.com:
Added file: http://bugs.python.org/file37557/measure_build_set.py
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
Changes by Raymond Hettinger raymond.hettin...@gmail.com:
Added file: http://bugs.python.org/file37558/build_set_timings.txt
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
New submission from Raymond Hettinger:
This tracker item is to record experiments with removing unicode specialization
code from set objects and run timings to determine the performance benefits or
losses from those specializations.
* Removes the set_lookkey_unicode() function and the
Changes by Raymond Hettinger raymond.hettin...@gmail.com:
Added file: http://bugs.python.org/file37548/no_special_hash.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
Changes by Raymond Hettinger raymond.hettin...@gmail.com:
Added file: http://bugs.python.org/file37549/time_suite.sh
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23119
___
Antoine Pitrou added the comment:
+1 for this. In my experience sets of unicode keys are not as common as dicts
with unicode keys, and the posted numbers make the simplification a no-brainer.
--
nosy: +pitrou
___
Python tracker