So part of WP06 is improving the dictionary implementation. To do
this, it seemed like a good idea to find out what Python code does
with dictionaries, which is what I've been working on this week.
If you activate the "objspace.std.withmultidict" option, make
MEASURE_DICT in pypy/objspace/std/dictmultiobject.py true and build
yourself a pypy-c, you'll find that this pypy-c will create a
dictinfo.txt file that summarizes how every dictionary in the program
has been used.
The benchmark programs I have been using are: pystone, richards,
"rst2html coding-guide.txt" and "translate.py --backendopt
--no-compile --batch --text targetrpystonedalone.py", and the
(compressed) results can be found in:
http://codespeak.net/~mwh/dictinfo/
The file dictinfos.tar.bz2 contains the dictinfo.txt files created by
the above runs, and the RData files are binary files suitable for
loading into R:
http://www.r-project.org/
What I'd like to get some input on is stuff like: what aspects of this
data should I analyse? Is there any data you think I should collect?
Something that I don't measure at all is the order things happen in,
which might be interesting: it's easy to believe many dictionaries go
through a phase of being written to before a longer phase of being
read from. But I'm not sure how to measure that...
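One rough way to get at the ordering question, sketched here at application level rather than inside dictmultiobject.py: keep a running operation counter per dictionary and remember how many reads had happened when the last write occurred. The class name PhaseTrackingDict and all of its attributes are made up for illustration and are not part of PyPy. A dictionary with a "write phase, then read phase" pattern would show few reads before its last write and many after it.

```python
class PhaseTrackingDict(dict):
    """Hypothetical illustration only, not PyPy code.

    Counts item reads and writes and records how many reads had
    happened at the moment of the most recent write. Only d[key] and
    d[key] = value are tracked; get(), update(), etc. are ignored.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.ops = 0                      # total tracked operations
        self.reads = 0                    # number of d[key] lookups
        self.writes = 0                   # number of d[key] = value stores
        self.last_write_op = 0            # value of self.ops at the last write
        self.reads_before_last_write = 0  # reads seen before the last write

    def __getitem__(self, key):
        # Counted even if the lookup ends up raising KeyError.
        self.ops += 1
        self.reads += 1
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.ops += 1
        self.writes += 1
        self.last_write_op = self.ops
        self.reads_before_last_write = self.reads
        super().__setitem__(key, value)

    def reads_after_last_write(self):
        # Reads in the trailing "read-only" phase of the dictionary's life.
        return self.reads - self.reads_before_last_write


d = PhaseTrackingDict()
for i in range(3):
    d[i] = i * i                          # write phase
total = sum(d[i] for i in range(3))       # read phase
print(d.writes, d.reads_before_last_write, d.reads_after_last_write())
```

This only needs a few extra integers per dictionary, so it would probably be cheap enough to add to the existing per-dictionary summary, at the cost of losing any finer-grained interleaving information.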
Cheers,
mwh
--
<glyph> I am *not* a PSU agent.
-- from Twisted.Quotes