On 7/7/2010 4:11 PM, Tres Seaver wrote:

Antoine Pitrou wrote:
On Wed, 7 Jul 2010 19:44:31 +0200
Eli Bendersky<eli...@gmail.com>  wrote:
For what it's worth, my benchmarking showed that modifying the
heuristic to only kick in when there are more than 100 kinds of
elements (Terry's option A) didn't affect the runtime of matching
whatsoever, even when the heuristic *does* kick in. All it adds,
really, is the overhead of a single 'if' statement. So it wouldn't be
right to assume that somehow modifying the heuristic or allowing to
turn it off will negatively affect performance in the special case Tim
originally optimized for.

Just because it doesn't affect performance in your tests doesn't mean it
won't do so in the general case. Consider a case where Tim's junk
optimization kicked in and helped improve performance a lot, but where
there are still less than 100 alphabet symbols. The new heuristic will
ruin this use case.

That would describe pretty much every C program ever written, for
instance, and nearly as high a percentage of all Python modules /
scripts ever written:  the 'string.printable' set, less formfeed and
vertical tab, is 98 characters long.

In the primary use case, programs are compared by line, not characters, and there are more than 100 different lines in any sensible program of at least 200 lines.


--
Terry Jan Reedy

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to