[issue16286] Optimize a==b and a!=b for bytes and str

2013-01-02 Thread STINNER Victor

STINNER Victor added the comment:

> P.S.  I rather like the optimization and don't want to discourage it.  I'm
> just curious about what the current optimizations are missing.

I'm too lazy to produce more statistics or run other benchmarks. I just saw an 
interesting optimization opportunity. I don't understand why it was not done 
before.

If you think that compare_hash.patch might slow down Python, then OK, I will 
just close the issue as rejected.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue16286] Optimize a==b and a!=b for bytes and str

2012-12-28 Thread Meador Inge

Changes by Meador Inge mead...@gmail.com:


--
nosy: +meador.inge




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-25 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti
stage:  -> patch review




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-23 Thread STINNER Victor

STINNER Victor added the comment:

Oh, I forgot this issue when I did the following commit:
--
changeset:   79902:b68be1025c42
user:        Victor Stinner victor.stin...@gmail.com
date:        Tue Oct 23 02:48:49 2012 +0200
files:       Objects/unicodeobject.c
description:
Optimize PyUnicode_RichCompare() for Py_EQ and Py_NE: always use memcmp()
--
I will benchmark the overhead of memcmp() on short strings. We may
check the first and last characters before calling memcmp() to limit
the overhead of calling a function.
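
The benchmark mentioned above could be approximated from Python with a sketch like the following. It is only an assumption about how one might measure it, not the benchmark actually run for this issue; the helper names time_equal_compare and benchmark are hypothetical, and timings taken from Python include interpreter overhead on top of the C-level comparison.

```python
import timeit


def time_equal_compare(length, repeat=5, number=100_000):
    """Return the best observed time, in seconds, for one a == b on equal bytes."""
    a = b"x" * length
    # Build a second object with equal content. For length >= 2 this is a
    # distinct object in CPython, so the identity shortcut cannot hide the
    # cost of the content comparison.
    b = bytes(bytearray(a))
    timer = timeit.Timer("a == b", globals={"a": a, "b": b})
    return min(timer.repeat(repeat=repeat, number=number)) / number


def benchmark(lengths=(2, 8, 32, 128, 1024, 4096)):
    """Map each length to its per-comparison time in seconds."""
    return {n: time_equal_compare(n) for n in lengths}
```

Calling benchmark() and comparing the short lengths against the long ones gives a rough idea of how much of the cost is fixed per-call overhead rather than the byte-by-byte work.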

I also read that GCC uses its builtin memcmp() which is slower than
the memcmp() of the GNU libc.

--




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-22 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Rather than see statistics, I'm curious about the circumstances in which the 
optimization would kick in.   Interned strings are pre-hashed but they already 
benefit from an identity-implies-equality check.  Dicts and sets already 
incorporate a check-hash-before-equality check.
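
The two existing shortcuts described above can be observed from Python. The following is a minimal sketch that assumes CPython's dict behaviour; the Key class and its eq_calls counter are hypothetical names used only for this illustration.

```python
class Key:
    """Hypothetical key type that counts how often __eq__ is called."""

    eq_calls = 0

    def __init__(self, name, hash_value):
        self.name = name
        self.hash_value = hash_value

    def __hash__(self):
        return self.hash_value

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.name == other.name


stored = Key("a", 1)
table = {stored: "value"}

# Same object: identity implies equality, so __eq__ is not consulted.
found_by_identity = stored in table
calls_after_identity = Key.eq_calls

# Different hash: the dict rejects the candidate without calling __eq__.
found_other_hash = Key("a", 2) in table
calls_after_hash_mismatch = Key.eq_calls

# Same hash but a different object: only now does the dict call __eq__.
found_same_hash = Key("a", 1) in table
calls_after_hash_match = Key.eq_calls
```

In this sketch, __eq__ runs only for the last lookup, which is why a hash check inside the string comparison itself would add nothing for dict and set lookups.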

That raises the question of which strings would ever have their hash already 
computed if the string hasn't been interned or used in a dict or set.

P.S.  I rather like the optimization and don't want to discourage it.  I'm just 
curious about what the current optimizations are missing.

--
nosy: +rhettinger




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-20 Thread Gregory P. Smith

Gregory P. Smith added the comment:

Something to include in your statistics is the lengths of the already hashed 
data being compared.

I expect there to be a minimum length before this optimization is useful.

--
nosy: +gregory.p.smith




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-19 Thread STINNER Victor

New submission from STINNER Victor:

The attached patch optimizes the a==b and a!=b operators for the bytes and str 
types in Python 3.4. For str, memcmp() is now always used, instead of a loop 
using PyUnicode_READ() (which is slow) for kinds other than 1. For bytes, it 
compares the first and also the last byte before calling memcmp(), instead of 
comparing only the first byte. A similar optimization was implemented in 
Py_UNICODE_MATCH():

changeset:   38242:0de9a789de39
branch:      legacy-trunk
user:        Fredrik Lundh fred...@pythonware.com
date:        Tue May 23 10:10:57 2006 +
files:       Include/unicodeobject.h
description:
needforspeed: check first *and* last character before doing a full memcmp

Initially, I wrote the patch only to check the hash values before comparing the 
content of the strings.
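
The order of checks described in this message can be modelled in Python as follows. This is a toy sketch, not the code in compare_hash.patch (which is written in C); the CachedHashBytes class and the fast_eq function are hypothetical names, and the final data comparison stands in for memcmp().

```python
class CachedHashBytes:
    """Toy model of a bytes object that caches its hash once computed."""

    def __init__(self, data: bytes):
        self.data = data
        self.cached_hash = None  # None means the hash was never computed

    def __hash__(self):
        if self.cached_hash is None:
            self.cached_hash = hash(self.data)
        return self.cached_hash


def fast_eq(a: CachedHashBytes, b: CachedHashBytes) -> bool:
    """Compare two objects, trying the cheap checks before the full scan."""
    if a is b:
        return True
    if len(a.data) != len(b.data):
        return False
    # Hash shortcut: different cached hashes prove inequality. Equal hashes
    # prove nothing, and a missing hash is never computed just for this.
    if (a.cached_hash is not None and b.cached_hash is not None
            and a.cached_hash != b.cached_hash):
        return False
    # Check the first and the last byte before scanning everything.
    if a.data and (a.data[0] != b.data[0] or a.data[-1] != b.data[-1]):
        return False
    return a.data == b.data  # stands in for memcmp()
```

The hash shortcut can only ever answer "not equal", so it helps when both hashes are already known and differ, which is the case counted in the statistics in this message.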

--

I ran some statistical tests. For a fresh Python interpreter, the hash values 
are known in only 7% of cases (but when hashes are compared, they are almost 
always different, so the optimization is useful). When running ./python -m test 
test_os, hashes are known and different in 41.4% of cases. After running 70 
tests, hashes are known and different in 80% of cases.

--
files: compare_hash.patch
keywords: patch
messages: 173332
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Optimize a==b and a!=b for bytes and str
type: performance
versions: Python 3.4
Added file: http://bugs.python.org/file27623/compare_hash.patch




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-19 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Good. I would like to see similar statistics for a real application.

--




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-19 Thread Dirkjan Ochtman

Changes by Dirkjan Ochtman dirk...@ochtman.nl:


--
nosy: +djc




[issue16286] Optimize a==b and a!=b for bytes and str

2012-10-19 Thread Christian Heimes

Changes by Christian Heimes li...@cheimes.de:


--
nosy: +christian.heimes
