Profiling Chandler is tricky because there are lots of moving parts. Here are a few tips and tricks that come to mind:

  1. To compare numbers, always use the same data.
     Recreating your repository everytime changes the data in subtle ways
     because you get new UUIDs everytime and the order of things, random in
     mappings, is conditioned by the hash of their keys, very often UUIDs.
     Instead, load and prepare your data, backup the repository and always
     restore it when re-running the testcase.

     rc -r ~/Desktop/__repository__.001  (or whatever your path may be)

  2. Preload all items, that way the vaguaries of item loading are out of the
     way, unless that's what you're interested in. Running check() loads all
     items into the UI view.

     rc -r ~/Desktop/__repository__.001 --undo check

     --undo check runs check() and undoes bogus versions until check() passes
     at startup time. If check() passes right away, as I'd expect from a
     clean restore, --undo will undo nothing and you will have started your
     chandler will all items pre-loaded.

  3. The -P flag using the perf tests produces profiles that crash hotshot (at
     least, for me, on my Mac). Also, with -P you cannot be sure how much of
     the perf test cruft you're profiling along with it. Know what you're
     measuring, and add the profiling hook yourself.
     Find the block of code or call you're interested in and wrap it
     with a prof.runcall() call, creating a local function for the block of
     code if need be. For example, to profile import, I just change
     ImportExport.py around line 267 as follows:
            import hotshot
            prof = hotshot.Profile('import.log')
            collection = prof.runcall(importICalendarFile,
                                      fullpath, self.view, coll,
                                      filterAttributes, activity,
                                      None, logger)
            prof.close()
     Then process the resulting import.log file with the hotshot APIs as is
     done, for example, in my t82.py file included below. Really, how to
     process the hotshot output is up to you, whatever works best for the kind
     of output you're looking for. t82.py, when run the first time on a .log
     file, produces a .stats file from it which is vastly faster to reload.
     This is can be a huge timesaver when profiling things like a large
     import, for example.

  4. Remove commit() calls. Currently, our code includes commits() in less
     than ideal places and this has the effect of slowing things down
     considerably. commit() is an expensive operation that we'd like to do
     *between* user actions instead of *during* user actions. Measuring it
     during a user action is pointless unless it is critical to the action.
     For example, the criticality of calling commit() at the end of
     CalendarCanvas.CreateEmptyEvent() is dubious.
     John has offered to review all such commit calls and remove them if
     possible, leaving it to the user event that started the action to do the
     commit, if needed.

  5. Instead of only looking at the amount of time spent doing things,
     consider how many times they're done. Currently, I sense that there are a
     lot of perf gains to be made in the app by doing things less instead of
     faster. Why are we loading 41 images when creating an event in the
     calendar, for example ? This leads to the next point.

  6. The app loads things lazily. Items are loaded lazily. Images are loaded
     lazily. If you see 41 images loaded, the next thing is to profile
     creating a second event instead. Indeed, the calendar doesn't load 41
     images the second time. Run the action multiple times so that all the
     init cases are out of the way. But, we're still calling
     MenusAndToolbars.appendDynamicBlocks 336 times nonetheless... something
     to look into there even if we're only spending 0.007s there, the fact
     it's called that often may lead you onto something where significant
     time is spent overall.

  7. When communicating about profiles, don't send the .log files around, they
     can be huge. Use a .stats file instead. As said before, the t82.py file
     below creates a .stats file from a .log file whenever it is invoked with
     a .log file. Don't send a text version out either, the .stats file can be
     opened with hotshot APIs and queried in various ways to look for callers,
     callees, different sort orders, etc, etc. The Stats class has several
     interesting APIs to poke around a .stats file:
        http://docs.python.org/lib/profile-stats.html

That's all I can think of for now.
See my t82.py script below.

Andi..
----------------------------------------------------

import os
import hotshot, hotshot.stats, pstats

def cum(stats, length=50):
    stats.sort_stats('cum')
    stats.print_stats(length)

def calls(stats, length=50):
    stats.sort_stats('time', 'calls')
    stats.print_stats(length)

def show(file, call='cum', length=50):
    if file.endswith('stats'):
        stats = pstats.Stats(file)
    else:
        stats = hotshot.stats.load(file)
        stats.strip_dirs()
        stats.dump_stats('.'.join((os.path.splitext(file)[0], 'stats')))
    if call == 'cum':
        cum(stats, int(length))
    elif call == 'tot':
        calls(stats, int(length))
    else:
        stats.sort_stats(call)
        stats.print_stats(int(length))

    return stats

if __name__ == "__main__":
    from sys import argv
    show(*argv[1:])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev

Reply via email to