Answers inline.

Heikki Toivonen wrote:
1. Are regression bugs usable to you?
No.
2. Are trend graphs usable to you?
http://builds.osafoundation.org/perf_data/trends.html
Yes, to the extent that I can see change over time, and consistent variability between platforms. Also, this is the only place where it looks like we've made any progress.
3. Are the tables from which trend graphs are drawn usable to you?
For example
http://builds.osafoundation.org/perf_data/detail_20070318_20070417.html
No: too variable.
4. Are the full results for today usable to you? For example
http://builds.osafoundation.org/perf_data/detail_20070417.html
No: too variable, sample size too small.
5. Are the daily graphs usable to you? For example
http://builds.osafoundation.org/perf_data/detail_20070417.html
No: too variable, sample size too small.
6. Are historical daily reports usable to you? For example
http://builds.osafoundation.org/perf_data/detail_20070416.html
No: too variable, sample size too small.
7. Are the deltas and std.dev usable to you on
http://builds.osafoundation.org/perf_data/tbox.html
No: too variable, and the color coding of both the time and delta boxes is misleading. The std devs are ridiculously large (which is not a problem of presentation: it's a problem of measurement).
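To illustrate the measurement problem rather than the presentation: when the spread is this large relative to the mean, a modest regression disappears into the noise. A quick sketch (the sample timings below are made up for illustration, not taken from the perf data):

```python
import statistics

# Hypothetical timing samples (seconds) for one perf test run five times.
samples = [4.1, 5.8, 3.9, 7.2, 4.4]

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)  # sample standard deviation

# With a coefficient of variation near 30%, even a 10% regression
# is indistinguishable from noise at this sample size.
cv = stdev / mean
print("mean=%.2fs stdev=%.2fs cv=%.0f%%" % (mean, stdev, cv * 100))
```

That coefficient of variation is why the deltas on the tinderbox page aren't actionable regardless of how they're colored.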
8. If you would like to change the colors or format of
http://builds.osafoundation.org/perf_data/tbox.html please list the
changes here:
Until the perf tests produce more reliable timings, I don't think I'll need this for anything.

It's too depressing to look at in detail: I block it out when looking at the tinderbox page. I really only use it to get to the "trends" link in the top left corner.

9. Is rt.py -p usable to you?
I've only used it to verify that I haven't broken anything before checking in a change to a perf test. I don't run it before normal checkins.

10. Is rt.py -t usable to you?
Yes, when perf testing and investigating failures of normal tests.

When looking at performance, I use it to run a specific perf test, only to see what's being executed by that test; I then manually add hotshotting to get statistics about particular parts of the code.

I usually run it with --verbose, so I can capture the command line it's using to start Chandler. I then use that command line (with --indexer none) to get timings from the hotshot calls I've added. I never look at any numbers other than hotshot's (so absolute time doesn't mean anything: I'm not on typical hardware, I'm running with optimization off so that I can debug, etc.).
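For anyone unfamiliar with the workflow above: hotshot is Python 2's profiler, so the sketch below uses cProfile (its maintained replacement) to show the same pattern of wrapping a code path and dumping per-function statistics. The profiled function here is a hypothetical stand-in, not actual Chandler code:

```python
import cProfile
import io
import pstats


def code_under_test():
    # Hypothetical stand-in for the Chandler code path being profiled.
    total = 0
    for i in range(100000):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
code_under_test()
profiler.disable()

# Print the ten most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

The point of profiling this way, rather than trusting the wall-clock numbers, is that relative time spent per function is meaningful even on atypical hardware with optimization off.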

11. Is rt.py -P usable to you?
No.
12. Is rt.py --repeat usable to you?
No.
13. Any other things you can think of that either add value or are
irrelevant or you would like changed:



_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev
