Profiling Chandler is tricky because there are lots of moving parts. Here are
a few tips and tricks that come to mind:
1. To compare numbers, always use the same data.
Recreating your repository everytime changes the data in subtle ways
because you get new UUIDs everytime and the order of things, random in
mappings, is conditioned by the hash of their keys, very often UUIDs.
Instead, load and prepare your data, backup the repository and always
restore it when re-running the testcase.
rc -r ~/Desktop/__repository__.001 (or whatever your path may be)
2. Preload all items, that way the vaguaries of item loading are out of the
way, unless that's what you're interested in. Running check() loads all
items into the UI view.
rc -r ~/Desktop/__repository__.001 --undo check
--undo check runs check() and undoes bogus versions until check() passes
at startup time. If check() passes right away, as I'd expect from a
clean restore, --undo will undo nothing and you will have started your
chandler will all items pre-loaded.
3. The -P flag using the perf tests produces profiles that crash hotshot (at
least, for me, on my Mac). Also, with -P you cannot be sure how much of
the perf test cruft you're profiling along with it. Know what you're
measuring, and add the profiling hook yourself.
Find the block of code or call you're interested in and wrap it
with a prof.runcall() call, creating a local function for the block of
code if need be. For example, to profile import, I just change
ImportExport.py around line 267 as follows:
import hotshot
prof = hotshot.Profile('import.log')
collection = prof.runcall(importICalendarFile,
fullpath, self.view, coll,
filterAttributes, activity,
None, logger)
prof.close()
Then process the resulting import.log file with the hotshot APIs as is
done, for example, in my t82.py file included below. Really, how to
process the hotshot output is up to you, whatever works best for the kind
of output you're looking for. t82.py, when run the first time on a .log
file, produces a .stats file from it which is vastly faster to reload.
This is can be a huge timesaver when profiling things like a large
import, for example.
4. Remove commit() calls. Currently, our code includes commits() in less
than ideal places and this has the effect of slowing things down
considerably. commit() is an expensive operation that we'd like to do
*between* user actions instead of *during* user actions. Measuring it
during a user action is pointless unless it is critical to the action.
For example, the criticality of calling commit() at the end of
CalendarCanvas.CreateEmptyEvent() is dubious.
John has offered to review all such commit calls and remove them if
possible, leaving it to the user event that started the action to do the
commit, if needed.
5. Instead of only looking at the amount of time spent doing things,
consider how many times they're done. Currently, I sense that there are a
lot of perf gains to be made in the app by doing things less instead of
faster. Why are we loading 41 images when creating an event in the
calendar, for example ? This leads to the next point.
6. The app loads things lazily. Items are loaded lazily. Images are loaded
lazily. If you see 41 images loaded, the next thing is to profile
creating a second event instead. Indeed, the calendar doesn't load 41
images the second time. Run the action multiple times so that all the
init cases are out of the way. But, we're still calling
MenusAndToolbars.appendDynamicBlocks 336 times nonetheless... something
to look into there even if we're only spending 0.007s there, the fact
it's called that often may lead you onto something where significant
time is spent overall.
7. When communicating about profiles, don't send the .log files around, they
can be huge. Use a .stats file instead. As said before, the t82.py file
below creates a .stats file from a .log file whenever it is invoked with
a .log file. Don't send a text version out either, the .stats file can be
opened with hotshot APIs and queried in various ways to look for callers,
callees, different sort orders, etc, etc. The Stats class has several
interesting APIs to poke around a .stats file:
http://docs.python.org/lib/profile-stats.html
That's all I can think of for now.
See my t82.py script below.
Andi..
----------------------------------------------------
import os
import hotshot, hotshot.stats, pstats
def cum(stats, length=50):
stats.sort_stats('cum')
stats.print_stats(length)
def calls(stats, length=50):
stats.sort_stats('time', 'calls')
stats.print_stats(length)
def show(file, call='cum', length=50):
if file.endswith('stats'):
stats = pstats.Stats(file)
else:
stats = hotshot.stats.load(file)
stats.strip_dirs()
stats.dump_stats('.'.join((os.path.splitext(file)[0], 'stats')))
if call == 'cum':
cum(stats, int(length))
elif call == 'tot':
calls(stats, int(length))
else:
stats.sort_stats(call)
stats.print_stats(int(length))
return stats
if __name__ == "__main__":
from sys import argv
show(*argv[1:])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev