On Thu, 2010-10-07 at 14:59 -0700, David Borowitz wrote:
> I have some benchmark results for namedtuples vs. tuples. First, the
> microbenchmark results:
>
> $ python -m timeit -s 'from dulwich.objects import TreeEntry; name = "foo/bar"; mode = 0100644; sha = "a" * 20' 'x = TreeEntry(name, mode, sha)'
> 1000000 loops, best of 3: 0.583 usec per loop
> $ python -m timeit -s 'name = "foo/bar"; mode = 0100644; sha = "a" * 20' 'x = (name, mode, sha)'
> 10000000 loops, best of 3: 0.0753 usec per loop
>
> Obviously the tuple constructor should win over the TreeEntry constructor,
> since the latter is a wrapper around the former, and there's
> significant Python function call overhead. But hey, 0.5us is still
> pretty fast.
>
> Then I ran a much bigger macrobenchmark (attached). Basically, I
> cloned git.git, ran git unpack-objects to explode the repo into loose
> files, found all the tree SHAs (without parsing the objects), then
> measured the time to parse all those trees and iterate all their
> entries. In the inner loop I also assigned all of the tuple/namedtuple
> values to locals to check for overhead there.

Thanks very much for doing those benchmarks. With these in mind, I would
be fine with either solution.

If we really need to improve on this in the future, we could always look
into writing a C version of TreeEntry and having the C implementation of
sorted_tree_items return it directly. That said, it doesn't look like
there's a need for that (and micro-optimization is bad).

Cheers,

Jelmer
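
For anyone who wants to repeat the microbenchmark without a dulwich checkout, here is a rough sketch using the timeit module from a script. It assumes Python 3.5 or later, and the TreeEntry defined here is only a plain namedtuple stand-in with assumed field names, not the actual dulwich class, so the absolute numbers will differ from the ones quoted above.

```python
import timeit
from collections import namedtuple

# Stand-in for dulwich.objects.TreeEntry; the field names are an assumption
# made for illustration, not taken from the dulwich source.
TreeEntry = namedtuple("TreeEntry", ["path", "mode", "sha"])


def time_constructors(number=100000):
    """Return the best per-call construction time, in microseconds."""
    env = {
        "TreeEntry": TreeEntry,
        "name": b"foo/bar",
        "mode": 0o100644,
        "sha": b"a" * 20,
    }
    statements = (
        ("tuple", "x = (name, mode, sha)"),
        ("namedtuple", "x = TreeEntry(name, mode, sha)"),
    )
    results = {}
    for label, stmt in statements:
        # Take the fastest of three runs, as the timeit command line does.
        best = min(timeit.repeat(stmt, globals=env, number=number, repeat=3))
        results[label] = best / number * 1e6
    return results


if __name__ == "__main__":
    for label, usec in time_constructors().items():
        print("%s: %.4f usec per loop" % (label, usec))
```

The namedtuple still compares equal to a plain tuple with the same values, so the two can be swapped in calling code while measuring.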

