> >>Wouldn't it help a lot more if the compiler would detect that
> >>(1,2,3) is immutable and convert it into a constant at
> >>compile time ?!
> >
> >
> > Yes.  We've already gotten it to that point:
 . . .
> 
> Cool. Does that work for all tuples in the program ?

It is limited to just tuples of constants (strings, ints, floats,
complex, None, and other tuples).  Also, it is limited in its ability to
detect a nesting like: a=((1,2),(3,4)).  One other limitation is that
floats like -0.23 are not recognized as constants because the initial
compilation still produces a UNARY_NEGATIVE operation:

>>> dis.dis(compile('-0.23', '', 'eval'))
  0           0 LOAD_CONST               0 (0.23000000000000001)
              3 UNARY_NEGATIVE      
              4 RETURN_VALUE



> I did a search on our code and Python's std lib. It turns
> out that by far most such usages use either 2 or 3
> values in the tuple. If you look at the types of the
> values, the most common usages are strings and integers.

Right, those are the most common cases.  The linear searches are
ubiquitous.  Here's a small selection:

    if comptype not in ('NONE', 'ULAW', 'ALAW', 'G722')
    return tail.lower() in (".py", ".pyw")
    assert n in (2, 3, 4, 5)
    if value[2] in ('F','n','N')
    if sectName in ("temp", "cdata", "ignore", "include", "rcdata")
    if not decode or encoding in ('', '7bit', '8bit', 'binary'):
    if (code in (301, 302, 303, 307) and m in ("GET", "HEAD")

Unfortunately, there are several common patterns that are skipped
because rarely changed globals/builtins cannot be treated as constants:
    if isinstance(x, (int, float, complex)):  # types are not constants
    if op in (ROT_TWO, POP_TOP, LOAD_FAST):   # global consts from
opcode.py
    except (TypeError, KeyError, IndexError): # builtins are not
constant



> I'd assume that you'll get somewhat different results
> from your benchmark if you had integers in the tuple.

Nope, the results are substantially the same give or take 2usec.


 
> Hmm, what if you'd teach tuples to do faster contains lookups for
> string or integer only content, e.g. by introducing sub-types for
> string-only and integer-only tuples ?!

For a linear search, tuples are already pretty darned good and leave
room for only microscopic O(n) improvements.  The bigger win comes from
using a better algorithm and data structure -- hashing beats linear
search hands-down.  The constant search time is faster for all n>1,
resulting in much improved scalability.  No tweaking of
tuple.__contains__() can match it.

Sets are the right data structure for fast membership testing.  I would
love for sets to be used internally while letting users continue to
write the clean looking code shown above.



Raymond
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to