Re: [Python-Dev] Prospective Peephole Transformation
Raymond Hettinger wrote: Based on some ideas from Skip, I had tried transforming the likes of x in (1,2,3) into x in frozenset([1,2,3]). When applicable, it substantially simplified the generated code and converted the O(n) lookup into an O(1) step. There were substantial savings even if the set contained only a single entry. When disassembled, the bytecode is not only much shorter, it is also much more readable (corresponding almost directly to the original source). The problem with the transformation was that it didn't handle the case where x was non-hashable and it would raise a TypeError instead of returning False as it should. That situation arose once in the email module's test suite. To get it to work, I would have to introduce a frozenset subtype: class Searchset(frozenset): def __contains__(self, element): try: return frozenset.__contains__(self, element) except TypeError: return False Then, the transformation would be x in Searchset([1, 2, 3]). Since the new Searchset object goes in the constant table, marshal would have to be taught how to save and restore the object. This is a more complicated than the original frozenset version of the patch, so I would like to get feedback on whether you guys think it is worth it. Wouldn't it help a lot more if the compiler would detect that (1,2,3) is immutable and convert it into a constant at compile time ?! The next step would then be to have Python roll out these loops (in -O mode). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 18 2005) Python/Zope Consulting and Support ...http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] Prospective Peephole Transformation
Wouldn't it help a lot more if the compiler would detect that (1,2,3) is immutable and convert it into a constant at compile time ?! Yes. We've already gotten it to that point: . . . Cool. Does that work for all tuples in the program ? It is limited to just tuples of constants (strings, ints, floats, complex, None, and other tuples). Also, it is limited in its ability to detect a nesting like: a=((1,2),(3,4)). One other limitation is that floats like -0.23 are not recognized as constants because the initial compilation still produces a UNARY_NEGATIVE operation: dis.dis(compile('-0.23', '', 'eval')) 0 0 LOAD_CONST 0 (0.23001) 3 UNARY_NEGATIVE 4 RETURN_VALUE I did a search on our code and Python's std lib. It turns out that by far most such usages use either 2 or 3 values in the tuple. If you look at the types of the values, the most common usages are strings and integers. Right, those are the most common cases. The linear searches are ubiquitous. Here's a small selection: if comptype not in ('NONE', 'ULAW', 'ALAW', 'G722') return tail.lower() in (.py, .pyw) assert n in (2, 3, 4, 5) if value[2] in ('F','n','N') if sectName in (temp, cdata, ignore, include, rcdata) if not decode or encoding in ('', '7bit', '8bit', 'binary'): if (code in (301, 302, 303, 307) and m in (GET, HEAD) Unfortunately, there are several common patterns that are skipped because rarely changed globals/builtins cannot be treated as constants: if isinstance(x, (int, float, complex)): # types are not constants if op in (ROT_TWO, POP_TOP, LOAD_FAST): # global consts from opcode.py except (TypeError, KeyError, IndexError): # builtins are not constant I'd assume that you'll get somewhat different results from your benchmark if you had integers in the tuple. Nope, the results are substantially the same give or take 2usec. Hmm, what if you'd teach tuples to do faster contains lookups for string or integer only content, e.g. by introducing sub-types for string-only and integer-only tuples ?! For a linear search, tuples are already pretty darned good and leave room for only microscopic O(n) improvements. The bigger win comes from using a better algorithm and data structure -- hashing beats linear search hands-down. The constant search time is faster for all n1, resulting in much improved scalability. No tweaking of tuple.__contains__() can match it. Sets are the right data structure for fast membership testing. I would love for sets to be used internally while letting users continue to write the clean looking code shown above. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Prospective Peephole Transformation
[Raymond Hettinger] ... The problem with the transformation was that it didn't handle the case where x was non-hashable and it would raise a TypeError instead of returning False as it should. I'm very glad you introduced the optimization of building small constant tuples at compile-time. IMO, that was a pure win. I don't like this one, though. The meaning of x in (c1, c2, ..., c_n) is x == c1 or x == c2 or ... or x == c_n, and a transformation that doesn't behave exactly like the latter in all cases is simply wrong. Even if x isn't hashable, it could still be of a type that implements __eq__, and where x.__eq__(c_i) returned True for some i, and then False is plainly the wrong result. It could also be that x is of a type that is hashable, but where x.__hash__() raises TypeError at this point in the code. That could be for good or bad (bug) reasons, but suppressing the TypeError and converting into False would be a bad thing regardless. That situation arose once in the email module's test suite. I don't even care if no code in the standard library triggered a problem here: the transformation isn't semantically correct on the face of it. If we knew the type of x at compile-time, then sure, in most (almost all) cases we could know it was a safe transformation (and even without the hack to turn TypeError into False). But we don't know now, so the worst case has to be assumed: can't do this one now. Maybe someday, though. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
RE: [Python-Dev] Prospective Peephole Transformation
I'm very glad you introduced the optimization of building small constant tuples at compile-time. IMO, that was a pure win. It's been out in the wild for a while now with no issues. I'm somewhat happy with it. the transformation isn't semantically correct on the face of it. Well that's the end of that. What we really need is a clean syntax for specifying a constant frozenset without compiler transformations of tuples. That would have the further advantage of letting builtins and globals be used as element values. if isinstance(x, {int, float, complex}): if opcode in {REPEAT, MIN_REPEAT, MAX_REPEAT}: if (code in {301, 302, 303, 307} and m in {GET, HEAD}: if op in (ROT_TWO, POP_TOP, LOAD_FAST) Perhaps something other notation would be better but the idea is basically the same. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com