[issue16286] Optimize a==b and a!=b for bytes and str
STINNER Victor added the comment:

> P.S. I rather like the optimization and don't want to discourage it. I'm just curious about what the current optimizations are missing.

I'm too lazy to produce more statistics or run other benchmarks. I just saw an interesting optimization opportunity, and I don't understand why it was not done before. If you think that compare_hash.patch might slow down Python, that's fine: I will just close the issue as rejected.

--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16286
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16286] Optimize a==b and a!=b for bytes and str
Changes by Meador Inge mead...@gmail.com:
nosy: +meador.inge
[issue16286] Optimize a==b and a!=b for bytes and str
Changes by Ezio Melotti ezio.melo...@gmail.com:
nosy: +ezio.melotti
stage: -> patch review
[issue16286] Optimize a==b and a!=b for bytes and str
STINNER Victor added the comment:

Oh, I forgot this issue when I did the following commit:

changeset: 79902:b68be1025c42
user: Victor Stinner victor.stin...@gmail.com
date: Tue Oct 23 02:48:49 2012 +0200
files: Objects/unicodeobject.c
description: Optimize PyUnicode_RichCompare() for Py_EQ and Py_NE: always use memcmp()

I will benchmark the overhead of memcmp() on short strings. We may check the first and last characters before calling memcmp() to limit the overhead of calling a function. I also read that GCC uses its builtin memcmp(), which is slower than the memcmp() of the GNU libc.
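The idea of checking the first and last characters before calling memcmp() can be sketched roughly as follows. This is only an illustration, not the committed CPython code: the function name `bytes_equal` and its signature are hypothetical, and it works on raw byte buffers rather than on Python objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CPython): returns 1 if both buffers
   hold the same bytes, 0 otherwise. */
int
bytes_equal(const unsigned char *a, size_t alen,
            const unsigned char *b, size_t blen)
{
    assert(a != NULL && b != NULL);

    /* Buffers of different lengths can never be equal. */
    if (alen != blen)
        return 0;

    /* Two empty buffers are equal; this also protects the index below. */
    if (alen == 0)
        return 1;

    /* Cheap inline checks: reject on the first or last byte before
       paying for a function call. */
    if (a[0] != b[0] || a[alen - 1] != b[alen - 1])
        return 0;

    /* The edge bytes match, so compare the full content. */
    return memcmp(a, b, alen) == 0;
}
```

Whether the two extra comparisons pay off on short strings is the open question here; it depends on the compiler, the libc and the data, so it has to be measured.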
[issue16286] Optimize a==b and a!=b for bytes and str
Raymond Hettinger added the comment:

Rather than see statistics, I'm curious about the circumstances in which the optimization would kick in. Interned strings are pre-hashed, but they already benefit from an identity-implies-equality check. Dicts and sets already incorporate a check-hash-before-equality check. That raises the question: which strings ever have their hash already computed if they haven't been interned or used in a dict or set?

P.S. I rather like the optimization and don't want to discourage it. I'm just curious about what the current optimizations are missing.

nosy: +rhettinger
[issue16286] Optimize a==b and a!=b for bytes and str
Gregory P. Smith added the comment:

Something to include in your statistics is the lengths of the already hashed data being compared. I expect there to be a minimum length before this optimization is useful.

nosy: +gregory.p.smith
[issue16286] Optimize a==b and a!=b for bytes and str
New submission from STINNER Victor:

The attached patch optimizes the a==b and a!=b operators for the bytes and str types in Python 3.4.

For str, memcmp() is now always used, instead of a loop using PyUnicode_READ() (which is slow) for kinds other than 1.

For bytes, the patch compares the first and also the last byte before calling memcmp(), instead of comparing just the first byte. A similar optimization was implemented in Py_UNICODE_MATCH():

changeset: 38242:0de9a789de39
branch: legacy-trunk
user: Fredrik Lundh fred...@pythonware.com
date: Tue May 23 10:10:57 2006 +
files: Include/unicodeobject.h
description: needforspeed: check first *and* last character before doing a full memcmp

Initially I only wrote the patch to check the hash values before comparing the content of the strings.

I ran some statistical tests. In a fresh Python interpreter, the hash values are known in only 7% of cases (but when hashes are compared, they are almost always different, so the optimization is useful). When running ./python -m test test_os, hashes are known and different in 41.4% of cases. After running 70 tests, hashes are known and different in 80% of cases.

files: compare_hash.patch
keywords: patch
messages: 173332
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Optimize a==b and a!=b for bytes and str
type: performance
versions: Python 3.4

Added file: http://bugs.python.org/file27623/compare_hash.patch
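The hash check described in this submission can be sketched as below. This is a minimal illustration under stated assumptions, not the content of compare_hash.patch: the `cached_str` struct, the `HASH_UNKNOWN` marker and the `cached_str_equal` function are hypothetical stand-ins for a string object that caches its hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker meaning "hash not computed yet". */
#define HASH_UNKNOWN (-1L)

/* Hypothetical string type that caches its hash value. */
typedef struct {
    const char *data;
    size_t len;
    long hash;          /* cached hash, or HASH_UNKNOWN */
} cached_str;

/* Returns 1 if both strings have the same content, 0 otherwise. */
int
cached_str_equal(const cached_str *a, const cached_str *b)
{
    assert(a != NULL && b != NULL);

    /* Identity implies equality. */
    if (a == b)
        return 1;

    if (a->len != b->len)
        return 0;

    /* If both hashes are already known and differ, the contents must
       differ too, so the content comparison can be skipped. */
    if (a->hash != HASH_UNKNOWN && b->hash != HASH_UNKNOWN
        && a->hash != b->hash)
        return 0;

    /* Equal or unknown hashes prove nothing: compare the content. */
    return memcmp(a->data, b->data, a->len) == 0;
}
```

The shortcut only helps when both hashes are already cached and different, which is what the percentages above try to measure. It never computes a hash just for the comparison.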
[issue16286] Optimize a==b and a!=b for bytes and str
Serhiy Storchaka added the comment:

Good. I would like to see similar statistics for a real application.
[issue16286] Optimize a==b and a!=b for bytes and str
Changes by Dirkjan Ochtman dirk...@ochtman.nl:
nosy: +djc
[issue16286] Optimize a==b and a!=b for bytes and str
Changes by Christian Heimes li...@cheimes.de:
nosy: +christian.heimes