Raymond Hettinger wrote:
Wouldn't it help a lot more if the compiler would detect that
(1,2,3) is immutable and convert it into a constant at
compile time ?!
Yes. We've already gotten it to that point:
Python 2.5a0 (#46, Feb 15 2005, 19:11:35) [MSC v.1200 32 bit (Intel)] on
win32
import dis
dis.dis(compile('x in ("xml", "html", "css")', '', 'eval'))
0 0 LOAD_NAME 0 (x)
3 LOAD_CONST 3 (('xml', 'html', 'css'))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
Cool. Does that work for all tuples in the program ?
The question is whether to go a step further to replace the linear
search with a single hashed lookup:
0 0 LOAD_NAME 0 (x)
3 LOAD_CONST 3 (searchset(['xml', 'html',
'css']))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
This situation seems to arise often in source code. You can see the
cases in the standard library with: grep 'in ("' *.py
I did a search on our code and Python's std lib. It turns
out that by far most such usages use either 2 or 3
values in the tuple. If you look at the types of the
values, the most common usages are strings and integers.
I'd assume that you'll get somewhat different results
from your benchmark if you had integers in the tuple.
The transformation is easy to make at compile time. The part holding me
back is the introduction of searchset as a frozenset subtype and
teaching marshal how to put it a pyc file.
Hmm, what if you'd teach tuples to do faster contains lookups for
string or integer only content, e.g. by introducing sub-types for
string-only and integer-only tuples ?!
FWIW, some sample timings are included below (using frozenset to
approximate what searchset would do). The summary is that the tuple
search takes .49usec plus .12usec for each item searched until a match
is found. The frozenset lookup takes a constant .53 usec.
Raymond
------------------------------------------------------------------------
C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='xml'"
"x in s"
1000000 loops, best of 9: 0.49 usec per loop
C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='css'"
"x in s"
1000000 loops, best of 9: 0.621 usec per loop
C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='html'"
"x in s"
1000000 loops, best of 9: 0.747 usec per loop
C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='pdf'"
"x in s"
100000 loops, best of 9: 0.851 usec per loop
C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
"x='xml'" "x in s"
1000000 loops, best of 9: 0.529 usec per loop
C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
"x='css'" "x in s"
1000000 loops, best of 9: 0.522 usec per loop
C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
"x='html'" "x in s"
1000000 loops, best of 9: 0.53 usec per loop
C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
"x='pdf'" "x in s"
1000000 loops, best of 9: 0.523 usec per loop
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Feb 19 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com