> The Wiki entry seems to reinforce the impression that bugged Guido to > begin with. It provides a bunch of "but ..." explanations about why > Python's speed isn't that important. Python is slow, but "speed of > development is far more important."
I felt the same way when reading it. Also, it seemed to embody the political outlook that optimization is sinful. The document could be much more positive, fact based, and informative. Also, the wording seems somewhat outdated. A draft for a new entry is included below. Review and feedback are requested. Raymond Techniques for Improving Performance and Scalability ==================================================== Here are some coding guidelines for applications that demand peek performance (in terms of memory utilization, speed, or scalability). Use the best algorithms and fastest tools ----------------------------------------- . Membership testing with sets and dictionaries is much faster, O(1), than searching sequences, O(n). When testing "a in b", b should be a set or dictionary instead of a list or tuple. . String concatenation is best done with ''.join(seq) which is an O(n) process. In contrast, using the '+' or '+=' operators can result in an O(n**2) process because new strings may be built for each intermediate step. Py2.4 mitigates this issue somewhat; however, ''.join(seq) remains the best practice. . Many tools come in both list form and iterator form (map and itertools.imap, list comprehension and generator expressions, dict.items and dict.iteritems). In general, the iterator forms are more memory friendly and more scalable. They are preferred whenever a real list is not required. . Many core building blocks are coded in optimized C. Applications that take advantage of them can make substantial performance gains. The building blocks include all of the builtin datatypes (lists, tuples, sets, and dictionaries) and extension modules like array, itertools, and collections.deque. . Likewise, the builtin functions run faster than hand-built equivalents. For example, map(operator.add, v1, v2) is faster than map(lambda x,y: x+y, v1, v2). . List comprehensions run a bit faster than equivalent for-loops. . Lists perform well as fixed length arrays and as stacks. For queue applications using pop(0) or insert(0,v)), collections.deque() offers superior O(1) performance because it avoids rebuilding a full list for each insertion or deletion, an O(n) process. . Custom sort ordering is best performed with Py2.4's key= option or with the traditional decorate-sort-undecorate technique. Both approaches call the key function just once per element. In contrast, sort's cmp= option is called many times per element during a sort. For example, sort(key=str.lower) is faster than sort(cmp=lambda a,b: cmp(a.lower(), b.lower())). Take advantage of interpreter optimizations ------------------------------------------- . In functions, local variables are accessed more quickly than global variables, builtins, and attribute lookups. So, it is sometimes worth localizing variable access in inner-loops. For example, the code for random.shuffle() localizes access with the line, random=self.random. That saves the shuffling loop from having to repeatedly lookup self.random. Outside of loops, the gain is minimal and rarely worth it. . The previous recommendation is a generalization of the rule to factor constant expressions out of loops. Also, constant folding should be done manually. Inside loops, write "x=3" instead of "x=1+2". . Function call overhead is large compared to other instructions. Accordingly, it is sometimes worth in-lining code in the middle of a few time-critical loops. . Starting with Py2.3, the interpreter optimizes "while 1" to just a single jump. In contrast "while True" takes several more steps. While the latter is preferred for clarity, time critical code should use the first form. . Multiple assignment is slower than individual assignment. For example "x,y=a,b" is slower than "x=a; y=b". However, multiple assignment is faster for variable swaps. For example, "x,y=y,x" is faster than "t=x; x=y; y=t". . Chained comparisons are faster than using the "and" operator. Write "x < y < z" instead of "x < y and y < z". . A few fast approaches should be considered hacks and reserved for only the most demanding applications. For example, "x and True or False" is faster than "bool(x)". Take advantage of diagnostic tools ---------------------------------- . The hotshot and profile modules help identify performance bottlenecks. . The timeit module offers immediate performance comparisons between alternative approaches to writing individual lines of code. Performance can dictate overall strategy ---------------------------------------- . SAX is typically faster and more memory efficient than DOM approaches to XML. . Threading can improve the response time in applications that would otherwise waste time waiting for I/O. . Select can help minimize the overhead for polling multiple sockets. _______________________________________________ Python-Dev mailing list [EMAIL PROTECTED] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com