Re: [Python-Dev] Prospective Peephole Transformation

2005-02-18 Thread M.-A. Lemburg
Raymond Hettinger wrote:
Based on some ideas from Skip, I had tried transforming the likes of x
in (1,2,3) into x in frozenset([1,2,3]).  When applicable, it
substantially simplified the generated code and converted the O(n)
lookup into an O(1) step.  There were substantial savings even if the
set contained only a single entry.  When disassembled, the bytecode is
not only much shorter, it is also much more readable (corresponding
almost directly to the original source).
The problem with the transformation was that it didn't handle the case
where x was non-hashable and it would raise a TypeError instead of
returning False as it should.  That situation arose once in the email
module's test suite.
To get it to work, I would have to introduce a frozenset subtype:
class Searchset(frozenset):
def __contains__(self, element):
try:
return frozenset.__contains__(self, element)
except TypeError:
return False
Then, the transformation would be x in Searchset([1, 2, 3]).  Since
the new Searchset object goes in the constant table, marshal would have
to be taught how to save and restore the object.
This is a more complicated than the original frozenset version of the
patch, so I would like to get feedback on whether you guys think it is
worth it.
Wouldn't it help a lot more if the compiler would detect that
(1,2,3) is immutable and convert it into a constant at
compile time ?!
The next step would then be to have Python roll out these
loops (in -O mode).
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source  (#1, Feb 18 2005)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] Prospective Peephole Transformation

2005-02-18 Thread Raymond Hettinger
 Wouldn't it help a lot more if the compiler would detect that
 (1,2,3) is immutable and convert it into a constant at
 compile time ?!
 
 
  Yes.  We've already gotten it to that point:
 . . .
 
 Cool. Does that work for all tuples in the program ?

It is limited to just tuples of constants (strings, ints, floats,
complex, None, and other tuples).  Also, it is limited in its ability to
detect a nesting like: a=((1,2),(3,4)).  One other limitation is that
floats like -0.23 are not recognized as constants because the initial
compilation still produces a UNARY_NEGATIVE operation:

 dis.dis(compile('-0.23', '', 'eval'))
  0   0 LOAD_CONST   0 (0.23001)
  3 UNARY_NEGATIVE  
  4 RETURN_VALUE



 I did a search on our code and Python's std lib. It turns
 out that by far most such usages use either 2 or 3
 values in the tuple. If you look at the types of the
 values, the most common usages are strings and integers.

Right, those are the most common cases.  The linear searches are
ubiquitous.  Here's a small selection:

if comptype not in ('NONE', 'ULAW', 'ALAW', 'G722')
return tail.lower() in (.py, .pyw)
assert n in (2, 3, 4, 5)
if value[2] in ('F','n','N')
if sectName in (temp, cdata, ignore, include, rcdata)
if not decode or encoding in ('', '7bit', '8bit', 'binary'):
if (code in (301, 302, 303, 307) and m in (GET, HEAD)

Unfortunately, there are several common patterns that are skipped
because rarely changed globals/builtins cannot be treated as constants:
if isinstance(x, (int, float, complex)):  # types are not constants
if op in (ROT_TWO, POP_TOP, LOAD_FAST):   # global consts from
opcode.py
except (TypeError, KeyError, IndexError): # builtins are not
constant



 I'd assume that you'll get somewhat different results
 from your benchmark if you had integers in the tuple.

Nope, the results are substantially the same give or take 2usec.


 
 Hmm, what if you'd teach tuples to do faster contains lookups for
 string or integer only content, e.g. by introducing sub-types for
 string-only and integer-only tuples ?!

For a linear search, tuples are already pretty darned good and leave
room for only microscopic O(n) improvements.  The bigger win comes from
using a better algorithm and data structure -- hashing beats linear
search hands-down.  The constant search time is faster for all n1,
resulting in much improved scalability.  No tweaking of
tuple.__contains__() can match it.

Sets are the right data structure for fast membership testing.  I would
love for sets to be used internally while letting users continue to
write the clean looking code shown above.



Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Prospective Peephole Transformation

2005-02-18 Thread Tim Peters
[Raymond Hettinger]
 ...
 The problem with the transformation was that it didn't handle the case
 where x was non-hashable and it would raise a TypeError instead of
 returning False as it should.

I'm very glad you introduced the optimization of building small
constant tuples at compile-time.  IMO, that was a pure win.

I don't like this one, though.  The meaning of x in (c1, c2, ...,
c_n) is x == c1 or x == c2  or ... or x == c_n, and a
transformation that doesn't behave exactly like the latter in all
cases is simply wrong.  Even if x isn't hashable, it could still be of
a type that implements __eq__, and where x.__eq__(c_i) returned True
for some i, and then False is plainly the wrong result.  It could also
be that x is of a type that is hashable, but where x.__hash__() raises
TypeError at this point in the code.  That could be for good or bad
(bug) reasons, but suppressing the TypeError and converting into False
would be a bad thing regardless.

 That situation arose once in the email module's test suite.

I don't even care if no code in the standard library triggered a
problem here:  the transformation isn't semantically correct on the
face of it.  If we knew the type of x at compile-time, then sure, in
most (almost all) cases we could know it was a safe transformation
(and even without the hack to turn TypeError into False).  But we
don't know now, so the worst case has to be assumed:  can't do this
one now.  Maybe someday, though.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


RE: [Python-Dev] Prospective Peephole Transformation

2005-02-18 Thread Raymond Hettinger
 I'm very glad you introduced the optimization of building small
 constant tuples at compile-time.  IMO, that was a pure win.

It's been out in the wild for a while now with no issues.  I'm somewhat
happy with it.



  the transformation isn't semantically correct on the
 face of it.

Well that's the end of that.

What we really need is a clean syntax for specifying a constant
frozenset without compiler transformations of tuples.  That would have
the further advantage of letting builtins and globals be used as element
values.

   if isinstance(x, {int, float, complex}):
   if opcode in {REPEAT, MIN_REPEAT, MAX_REPEAT}:
   if (code in {301, 302, 303, 307} and m in {GET, HEAD}:
   if op in (ROT_TWO, POP_TOP, LOAD_FAST)

Perhaps something other notation would be better but the idea is
basically the same.



Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com