[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-10 Thread Christian Heimes
Christian Heimes added the comment: I've done as you said and committed the changes in r59449. Next time I won't try to add optimizations without consulting you in the first place. :] Thanks for your advice. -- resolution: -> fixed status: open -> closed

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-10 Thread Raymond Hettinger
Raymond Hettinger added the comment: I had looked at version 2 instead of version 3. Version 3 is much closer. A couple of comments. Don't change the brace opening/closing convention in the file -- stick with the K&R style -- mixing two different styles makes the file harder to read and

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Christian Heimes added the comment: The latest patch does *NOT* add new macros, functions or other stuff. I simply replaced PyString_* with PyUnicode_* in setobject.c where appropriate. The only function I had to factor out is unicode_eq(). It's now in a new file stringlib/eq.h which is included

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Raymond Hettinger
Raymond Hettinger added the comment: Okay, then simply go back to the unaltered existing code and replace references to PyString with PyUnicode. Skip all the macros, new functions, factorings, etc. Just change strings to unicode and be done with it. IIRC, that was what was done for dicts. J

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Changes by Christian Heimes: Added file: http://bugs.python.org/file8905/py3k_optimize_set_unicode3.patch

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Christian Heimes added the comment: Raymond Hettinger wrote:
> Which is the common case in Py3k, to have strings or unicode? By trying to catch both, you slow down the optimization. Also, the new "hash_fast" introduces function call overhead in previously inlined code.
PyUnicode is t

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Raymond Hettinger
Raymond Hettinger added the comment: Which is the common case in Py3k, to have strings or unicode? By trying to catch both, you slow down the optimization. Also, the new "hash_fast" introduces function call overhead in previously inlined code. My preference is to knock out the optimization

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Changes by Christian Heimes: Removed file: http://bugs.python.org/file8899/py3k_optimize_set_unicode.patch

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Christian Heimes added the comment: Updates:
* Moved dictobject.c:unicode_eq() to unicodeobject.c:_PyUnicode_Eq()
* Added another optimization step to _PyUnicode_Eq(). The hash is required later anyway and comparing two hashes is much faster than memcmp-ing the unicode objects. if (unicode_h
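The "compare hashes before memcmp" step described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch (the snippet above is cut off, and this variant was later dropped in favor of mirroring dictobject.c). The `ustr` struct, `ustr_hash()`, and `ustr_eq()` are hypothetical stand-ins, not CPython's real object layout or API, and the hash function is an arbitrary FNV-1a-style choice made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a unicode object; NOT CPython's real layout. */
typedef struct {
    size_t length;      /* number of bytes in data */
    long hash;          /* cached hash, or -1 if not yet computed */
    const char *data;   /* character data (need not be NUL-terminated) */
} ustr;

/* Illustrative FNV-1a-style hash (an assumption, not CPython's algorithm).
 * The result is cached in s->hash so later lookups can reuse it. */
static long ustr_hash(ustr *s)
{
    assert(s != NULL);
    if (s->hash != -1)
        return s->hash;                 /* already cached */
    unsigned long h = 2166136261UL;
    for (size_t i = 0; i < s->length; i++) {
        h ^= (unsigned char)s->data[i];
        h *= 16777619UL;
    }
    /* Mask to a non-negative value so that -1 stays reserved for "unset". */
    s->hash = (long)(h & 0x7fffffffUL);
    return s->hash;
}

/* Equality check that tries the cheap tests first:
 * identity, then length, then hash, and only then a full memcmp. */
static int ustr_eq(ustr *a, ustr *b)
{
    assert(a != NULL && b != NULL);
    if (a == b)
        return 1;
    if (a->length != b->length)
        return 0;
    if (ustr_hash(a) != ustr_hash(b))
        return 0;                       /* different hashes: cannot be equal */
    return memcmp(a->data, b->data, a->length) == 0;
}
```

Differing hashes prove inequality, but equal hashes do not prove equality, so the final memcmp is still required. Whether the extra hash comparison pays off depends on how often the hashes are already cached at the call site.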

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Raymond Hettinger
Raymond Hettinger added the comment: The patch doesn't parallel what was done for dicts. The code in dictobject.c does not use a macro. It special-cases PyUnicode but not PyString. Please submit a patch that mirrors what was done for dicts.

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Christian Heimes added the comment: I fixed a bug in the last patch. It now works with mixed sets of str and bytes, but it no longer optimizes bytes in set_lookup(). Added file: http://bugs.python.org/file8899/py3k_optimize_set_unicode.patch

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-09 Thread Christian Heimes
Changes by Christian Heimes: Removed file: http://bugs.python.org/file8896/py3k_optimize_set_unicode.patch

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-08 Thread Christian Heimes
Christian Heimes added the comment: I've created a patch that adds optimization for PyUnicode while keeping the existing optimization for PyString. The patch moves the optimization trick for PyObject_Hash() into a macro and adds an optimized _PyUnicode_Eq() to unicodeobject.c -- assignee

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-06 Thread Raymond Hettinger
Raymond Hettinger added the comment: P.S. There is a way to factor out some common code, but it would entail introducing a bunch of macros that hide the differences between the two. I don't think it would be worth it.

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-06 Thread Raymond Hettinger
Raymond Hettinger added the comment: Not really, the implementations are different enough that it would be *really* hard to keep common code. The two parallel each other in a way that is visually easy to translate but hard to do through real refactoring. For the most part, both code bases have

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-06 Thread Guido van Rossum
Guido van Rossum added the comment: Would it make any sense at all to refactor the code so that code reuse is automatic?

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-06 Thread Raymond Hettinger
New submission from Raymond Hettinger: Much of the code in setobject.c exactly parallels that in dictobject.c. Ideally, we should keep that parallelism by scanning all of the changes to dictobject.c and applying substantially similar changes to setobject.c (just the changes that touch the hash t

[issue1564] The set implementation should special-case PyUnicode instead of PyString

2007-12-06 Thread Guido van Rossum
Changes by Guido van Rossum: -- keywords: py3k nosy: gvanrossum severity: normal status: open title: The set implementation should special-case PyUnicode instead of PyString versions: Python 3.0