proposal: replace talos with inline tests
For metrofx we’ve been working on getting omtc and apzc running in the browser. One of the things we need to be able to do is run performance tests that tell us whether or not the work we’re doing is having a positive effect on perf. We currently don’t have automated tests up and running for metrofx, and talos is even farther off. So to work around this I’ve been putting together some basic perf tests I can use to measure performance using the mochitest framework. I’m wondering if this might be a useful answer to our perf test problems long term.

Putting together talos tests is a real pain. You have to write a new test using the talos framework (which is a separate repo from m-c), test the test to be sure it’s working, file releng bugs on getting it integrated into talos test runs and populated in graph server, and test it via staging to be sure everything is working right. Overall the overhead here seems way too high.

Maybe we should consider changing this system so devs can write performance tests that suit their needs and that are integrated into our main repo? Basically:

1) Rework graph server to be open ended so that it can accept data from test runs within our normal test frameworks.
2) Develop a test module that can be included in tests and that allows test writers to post performance data to graph server.
3) Come up with a good way to manage the life cycle of active perf tests so graph server doesn’t become polluted.
4) Port existing talos tests over to the mochitest framework.
5) Drop talos.

Curious what people think of this idea.

Jim
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
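As a sketch of what point 2 might look like in practice, here is a minimal, hypothetical helper a mochitest-style test could use to record timings and produce a report. Every name here (PerfReporter, the payload shape) is invented for illustration; no such module exists in the tree:

```javascript
// Hypothetical sketch of the proposed "test module": a tiny helper a test
// could use to record timings and hand back a payload for a graph server.
class PerfReporter {
  constructor(testName) {
    this.testName = testName;
    this.samples = [];
  }
  // Time one iteration of a callback and record the duration in ms.
  measure(fn) {
    const start = Date.now();
    fn();
    this.samples.push(Date.now() - start);
  }
  // Summarize the samples. In automation this payload could be POSTed to
  // the graph server; on a local run it would just be dumped to the console.
  report() {
    const sorted = [...this.samples].sort((a, b) => a - b);
    const median = sorted[Math.floor(sorted.length / 2)];
    return { test: this.testName, samples: this.samples.length, medianMs: median };
  }
}

// Example: a fake "scroll" workload measured five times.
const perf = new PerfReporter("metrofx-scroll");
for (let i = 0; i < 5; i++) {
  perf.measure(() => { for (let j = 0; j < 1e5; j++) {} });
}
console.log(JSON.stringify(perf.report()));
```

The point is only that the per-test surface area could be this small; all the real work (where the payload goes, how it is keyed to a build) lives in the infrastructure behind it.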
Re: proposal: replace talos with inline tests
(CCing auto-to...@mozilla.com)

jmaher and jhammel will be able to comment more on the talos specifics, but a few thoughts off the top of my head. It seems like we're conflating multiple issues here:

1) [talos] is a separate repo from m-c
2) [it's a hassle to] test the test to be sure it’s working
3) [it's a hassle to get results] populated in graph server
4) [we need to] come up with a good way to manage the life cycle of active perf tests so graph server doesn’t become polluted

Switching from the talos harness to mochitest doesn't fix #2 (we still have to test, and I don't see how it magically becomes any easier without extra work that could have been applied to talos instead) or #3/#4 (orthogonal problems). It also seems like a brute-force way of fixing #1 (we could just check talos into mozilla-central).

Instead, I think we should be asking:

1) Is the best test framework for performance testing [a] talos (with improvements), [b] mochitest (with a significant amount of work to make it compatible), or [c] a brand new framework?
2) Regardless of framework used, would checking it into mozilla-central improve dev workflow enough to outweigh the downsides (see bug 787200 for history on that discussion)?
3) Regardless of framework used, how can we make the development/testing/staging cycle less painful?
4) Regardless of framework used, who should be responsible for ensuring we actively prune performance tests that are no longer relevant?

Note also that graphs.mozilla.org will be deprecated soon in favour of Datazilla, which afaik is less painful for adding new test suites (eg doesn't need manual database changes); jeads can say more on that front.

Best wishes,

Ed

On 04 March 2013 13:15:56, Jim Mathies wrote:
[quoted text snipped]
Re: proposal: replace talos with inline tests
Some thoughts on the subject. I would argue against running performance tests inside of mochitest. The main reason is that mochitest has a lot of profile stuff for testing, as well as many other tests bundled inside the same browser session. For a standalone metric unrelated to a user scenario, we could consider adding performance-style tests to mochitest.

In the process of creating Datazilla, we have found endless little quirks in the end-to-end system for how performance testing works. As time goes on we have continued to push forward with the goal of making a performance system that can detect regressions automatically when the test finishes. For the last few months we have had data going both to Datazilla and graph server, and have been refining our assumptions and tools along the way. When graph server is deprecated in the near future, it will be REALLY EASY to add new tests to the collection and reporting system. That doesn't solve the problem of making it easy to add or adjust a test in the test runners (buildbot scripts), but it solves half the problem.

Many of the talos tests are old and outdated, and while we have tried to find owners for the tests, it has been a failing effort. To that end, we have disabled some talos tests which nobody had interest in anymore. If there are tests which people feel are not useful, we should disable those ASAP to reduce the load on our infrastructure and work on creating tests which people care about.
-Joel

- Original Message -
From: Ed Morley emor...@mozilla.com
To: Jim Mathies jmath...@mozilla.com, auto-to...@mozilla.com
Cc: dev-platform@lists.mozilla.org
Sent: Monday, March 4, 2013 8:42:39 AM
Subject: Re: proposal: replace talos with inline tests
[quoted text snipped]
Re: proposal: replace talos with inline tests
Good points, comments below.

Ed Morley emor...@mozilla.com wrote in message news:mailman.1992.1362404580.24452.dev-platf...@lists.mozilla.org...

> Switching from the talos harness to mochitest doesn't fix #2 (we still
> have to test, and I don't see how it magically becomes any easier without
> extra work - that could have been applied to talos instead)

I disagree here; very few devs are familiar with the talos framework and what it takes to get a new test written. Everyone is very familiar with mochitest and other related test frameworks on m-c. I can write a mochitest to test perf in something simple like scrolling in about an hour. Putting together a talos scroll test would take much longer. If talos were on m-c it would help, but integrating into existing test frameworks we have and use on a regular basis seems like the simplest approach with the least amount of overhead.

> Instead, I think we should be asking:
> 1) Is the best test framework for performance testing: [a] talos (with
> improvements), [b] mochitest (with a significant amount of work to make
> it compatible), or [c] a brand new framework?

On [b] there might be a significant amount of work in getting infra pieces to work (like graph server or whatever we plan to replace it with), but not in writing an import module that devs would use to post data.

> 2) Regardless of framework used, would checking it into mozilla-central
> improve dev workflow enough to outweigh the downsides (see bug 787200
> for history on that discussion)?

Maybe we might want to keep talos around for big, important tests. But I think devs need a way to run perf tests on a smaller scale that doesn't involve infra changes. I think having this ability would be a big win for us.

Jim
Re: proposal: replace talos with inline tests
On 3/4/13 8:15 AM, Jim Mathies wrote:
> So to work around this I’ve been putting together some basic perf tests
> I can use to measure performance using the mochitest framework.

How are you dealing with the fact that mochitest runs on heterogeneous hardware (including VMs and the like, last I checked, which could have arbitrarily bad (or good!) performance characteristics depending on what else is happening with the host system)?

> Maybe we should consider changing this system so devs can write
> performance tests that suit their needs that are integrated into our
> main repo? Basically:
> 1) rework graphs server to be open ended so that it can accept data from
> test runs within our normal test frameworks.
> 2) develop a test module that can be included in tests that allows test
> writers to post performance data to graph server.
> 3) come up with a good way to manage the life cycle of active perf tests
> so graph server doesn’t become polluted.
> 4) port existing talos tests over to the mochitest framework.
> 5) drop talos.

This sounds plausible, modulo the inability to port Tp in its current state to a setup that involves the tests living in m-c, as long as the problem above is kept in mind. Basically, reusing something mochitest-like for developer familiarity may make sense, but it would need to be a separate test suite run on completely separate test slaves that are actually set up with performance testing in mind. A separate test suite which is like mochitest is not a problem per se (we have the ipcplugins, chrome, browser-chrome, a11y tests already).

So the main win would be making it easier to add new tests in terms of the number of actions to be taken (something it seems like we could improve with the current talos setup too), and easier for developers to add tests because the framework is already similar, right?

-Boris
Re: proposal: replace talos with inline tests
On 3/4/13 5:15 AM, Jim Mathies wrote:
[quoted text snipped]

Generally speaking, I think we should have a generic framework for declaring tests, i.e. test files for xpcshell, mochitest, talos, etc. would all look very similar from a JS perspective. I've been wanting to unify the in-test code for a while, and over the weekend I put together a very rough draft of what I think this should look like [1]. Please criticize it.

If all your tests are declared the same way, then presumably the test running code would be similar, and capturing performance data would require a single implementation affecting all test suites instead of N one-off solutions.

I'm of the opinion that we should generally collect tons of data from all of our testing frameworks and then sort out the meaning of that data later (e.g. ignore data from tests running on non-homogeneous or unreliable hardware). Maybe we don't care about things like rev X-Y comparison of CPU cycles on an individual mochitest. But we'd certainly be interested if we saw an individual mochitest's CPU cycle count or wall time double over the span of a month! You can't even raise eyebrows unless you have data. We don't have this data today. Even if we did, it would require separate implementations for each testing flavor (xpcshell, mochitest, etc.).

We should unify our test running code as much as possible. Then we should make decisions on whether it makes sense to collect and/or assess performance data in each execution context/test flavor.

[1] https://gist.github.com/indygreg/5073810
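The "collect everything and interpret later" idea can be sketched as a simple post-hoc check over per-test wall-time history: flag any test whose recent median has doubled relative to an older window. The threshold, window split, and data shape below are illustrative assumptions, not an existing tool:

```javascript
// Given per-test wall-time histories (oldest first), flag tests whose
// recent median run time has grown by `ratio` over the older half.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

function flagRegressions(history, ratio = 2.0) {
  const flagged = [];
  for (const [test, times] of Object.entries(history)) {
    const half = Math.floor(times.length / 2);
    const baseline = median(times.slice(0, half)); // older runs
    const recent = median(times.slice(half));      // newer runs
    if (baseline > 0 && recent / baseline >= ratio) {
      flagged.push(test);
    }
  }
  return flagged;
}

// Example: one test's wall time doubled over the month, another stayed flat.
const history = {
  "test_scroll.html": [100, 102, 98, 101, 205, 210, 198, 207],
  "test_startup.html": [50, 51, 49, 50, 52, 50, 48, 51],
};
console.log(flagRegressions(history)); // → ["test_scroll.html"]
```

Medians rather than means here is deliberate: a single noisy run on shared hardware would otherwise trip the flag.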
Re: proposal: replace talos with inline tests
Boris Zbarsky bzbar...@mit.edu wrote in message news:o7ydnyp6n66okqnmnz2dnuvz_uwdn...@mozilla.org...

> How are you dealing with the fact that mochitest runs on heterogeneous
> hardware (including VMs and the like last I checked, which could have
> arbitrarily bad (or good!) performance characteristics depending on what
> else is happening with the host system)?

That sounds like a releng problem that could be solved. I don't know enough about our test slaves to say for sure.

> This sounds plausible, modulo the inability to port Tp in its current
> state to a setup that involves the tests living in m-c, as long as the
> problem above is kept in mind. Basically, reusing something
> mochitest-like for developer familiarity may make sense, but it would
> need to be a separate test suite run on completely separate test slaves
> that are actually set up with performance testing in mind.

That's fine, I'm not married to mochitest, but something with similar run characteristics would be best.

> So the main win would be making it easier to add new tests in terms of
> number of actions to be taken (something it seems like we could improve
> with the current Talos setup too) and easier for developers to add tests
> because the framework is already similar, right?

Yes, basically:

1) Something checked into m-c anyone can easily author or run (for tracking down regressions) without having to check out a separate repo or set up and run a custom perf test framework.
2) Performance tests that generate data that is spit out to the console on local runs, or could be posted to a graph server in automation.
3) No releng overhead for setup of new perf tests; something that is built into the test framework / infrastructure we set up.

Jim
Re: proposal: replace talos with inline tests
> 1) something checked into mc anyone can easily author or run (for
> tracking down regressions) without having to checkout a separate repo,
> or setup and run a custom perf test framework.

I don't oppose the gist of what you're suggesting here, but please keep in mind that small perf changes are often very difficult to track down locally. Small changes in system and toolchain configuration can have large effects on average build speed and its variance. For example, I've found observable performance differences between Try and m-c/m-i builds in the past (bug 653961), despite their build configs being nearly identical. In my experience, we spend the majority of our time trying to track down small perf changes, so a change which makes it easier to track down the source of large perf changes might not have an outsize effect.

> 3) no releng overhead for setup of new perf tests. something that is
> built into the test framework / infrastructure we set up.

If we did this, we'd need to figure out how and when to promote benchmarks to "we care about them" status. We already don't back out changes for regressing a benchmark like we back them out for regressing tests. I think this is at least partially because of a general sentiment that not all of our benchmarks correlate strongly to what they're trying to measure. I suspect if anyone could check in a benchmark, the average quality of benchmarks would likely stay roughly the same, but the number of benchmarks would increase. In that case we'd have even more benchmarks with spurious regressions to deal with.

-Justin
Re: proposal: replace talos with inline tests
On 3/4/13 9:36 AM, Gregory Szorc wrote:
> If all your tests are declared the same way, then presumably the test
> running code would be similar and capturing performance data would
> require a single implementation affecting all test suites instead of N
> 1-off solutions.

We've talked about this before (perhaps in this very newsgroup) as a cheap (?) way to get extra perf measurements beyond our current limited set of tests, and to avoid having to add a new test suite/framework whenever someone wants a metric... E.g. measure the run time of each existing test, use scripts to figure out which ones are fairly stable over time, then watch for regressions. A chance to begin again in an orange land of opportunity and adventure! But I'd also take the general ability to add a new test as a microbenchmark.

> We should unify our test running code as much as possible.

Oh god yes please.

Justin
Re: proposal: replace talos with inline tests
On Monday, March 4, 2013 5:15:56 AM UTC-8, Jim Mathies wrote:
> For metrofx we’ve been working on getting omtc and apzc running in the
> browser. One of the things we need to be able to do is run performance
> tests that tell us whether or not the work we’re doing is having a
> positive effect on perf. We currently don’t have automated tests up and
> running for metrofx and talos is even farther off. So to work around
> this I’ve been putting together some basic perf tests I can use to
> measure performance using the mochitest framework. I’m wondering if this
> might be a useful answer to our perf test problems long term.

I think this is an incredibly interesting proposal, and I'd love to see something like it go forward. Detailed reactions below.

> Putting together talos tests is a real pain. You have to write a new
> test using the talos framework (which is a separate repo from mc), test
> the test to be sure it’s working, file releng bugs on getting it
> integrated into talos test runs, populated in graph server, and tested
> via staging to be sure everything is working right. Overall the overhead
> here seems way too high.

Yup. And that's a big problem. Not only does this make your life harder, it makes people not do as much performance testing as they otherwise might. The JS team has had the experience that making a new way of creating correctness tests incredibly easy (with *zero* overhead in the common case) really helped get more tests written and used. So I think it would be great to make it a lot easier to write perf tests.

> Maybe we should consider changing this system so devs can write
> performance tests that suit their needs that are integrated into our
> main repo? Basically:
> 1) rework graphs server to be open ended so that it can accept data from
> test runs within our normal test frameworks.

IIUC, something like this is a key requirement: letting any perf test feed into the reporting system. People have pointed out that the talos tests run on selected machines, so the perf tests should probably run on them as well, rather than on the correctness test machines. But that's only a small change to the basic idea, right?

> 2) develop a test module that can be included in tests that allows test
> writers to post performance data to graph server.

Does that mean a mochitest module? This part seems optional, although certainly useful. Some tests will require non-mochitest frameworks. I believe jmaher did some work to get in-browser standard JS benchmarks running automatically and reporting to graph server. I'm curious how that would fit in with this idea--would the test module help at all, or could there be some other kind of more general module maybe, so that even things like standard benchmarks can be self-serve?

> 3) come up with a good way to manage the life cycle of active perf tests
> so graph server doesn’t become polluted.

:-) How about getting an owner optionally listed for new tests, and then tests will be removed if no one is looking at them (according to web server logs) and there is no owner of record, or the owner doesn't say the tests are still needed?

> 4) port existing talos tests over to the mochitest framework.
> 5) drop talos.

This seems like it's in the line of "fix Talos". I'm not sure if this particular 4+5 is the right way to go, but the idea has some merit. To the extent that people don't pay attention to Talos, it seems we really don't need to do anything with it. If people are paying attention to and taking care of performance in their area, then we're covered. To take the example I happen to know best, the JS team uses AWFY to track JS performance on standard benchmarks and additional tests they've decided are useful. So Talos is not needed to track JS performance. Having all the features of the new graph server does sound pretty cool, though.

It appears that there are a few areas that are only covered by Talos for now, though. I think in that category we have warm startup time via Ts, and basic layout performance via Tp. I'm not sure about memory, because we do seem to detect increases via Talos, but we also have AWSY, and I don't know whether AWSY obviates the Talos memory measurements or not. For that kind of thing, I'm thinking maybe we should go with the "same teams take care of their own perf tests" idea. Performance is a natural owner for Ts. I'm not entirely sure about Tp, but it's probably Layout or DOM. Then those teams could decide if they wanted to switch from Talos to a different framework. If everything's working properly, and the difficulty of reproducing Talos tests locally caused enough problems to warrant it, the owning teams would notice and switch.

Dave
Re: proposal: replace talos with inline tests
Writing a lot of performance tests creates the problem that those tests will take a long time to run. The nature of performance tests is that each test must run for a relatively long time to get meaningful results. Therefore I doubt writing lots of different performance tests can scale. (Maybe we can find ways to eliminate noise in very short tests, but that might be research.)

One other thing to keep in mind, if we're going to start doing performance tests differently, is https://bugzilla.mozilla.org/show_bug.cgi?id=846166. Basically Chris suggests using Eideticker for performance tests a lot more.

Rob
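One standard (if imperfect) way to attack the noise-in-short-tests problem is adaptive sampling: keep re-running a measurement until the estimate stabilizes or a run cap is hit, so quiet tests stay cheap and noisy ones get more samples. This is a generic sketch with made-up thresholds, not Talos or mochitest code:

```javascript
// Repeatedly sample `run()` (which returns a timing in ms) until the
// standard error of the mean is small relative to the mean, up to a cap.
function sampleUntilStable(run, { maxRuns = 50, relErr = 0.05 } = {}) {
  const samples = [];
  while (samples.length < maxRuns) {
    samples.push(run());
    if (samples.length >= 5) {
      const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
      const variance = samples.reduce((a, b) => a + (b - mean) ** 2, 0) /
                       (samples.length - 1);
      const sem = Math.sqrt(variance / samples.length); // std error of mean
      if (sem / mean <= relErr) return { mean, runs: samples.length };
    }
  }
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  return { mean, runs: samples.length, unstable: true };
}

// Example with a deterministic "measurement": stabilizes after the minimum
// five runs, since the variance is zero.
const result = sampleUntilStable(() => 10);
console.log(result);
```

The trade-off Rob describes is still real: a genuinely noisy test will hit the cap and burn its full time budget, so this only helps tests that are quieter than the worst case.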
Re: proposal: replace talos with inline tests
I'll point out and really this is about all I have to say on this thread that while perf testing (that is, recording results) may bewell, not easy, but not too awful that rigorous analysis of what the data means and if there is a regression is often hard since it is often the case, as evidenced by Talos, that distributions are non-normal and may be multi-modal. While I have no love of Talos, despite/because of sinking a year's worth of effort into it, I fear that any replacement will be done with a loss of all wisdom harvested from legacy, and then relearned. If each team is responsible for perf testing, without a common basis and understanding of the stats analysis problem, I fear this will just multiply the problem. Frankly, one of the problems I've seen time and time again is the duplication of effort around a problem (which isn't a bad thing except...) and a lack of consolidation towards a (moz-)universal solution. On 03/04/2013 04:47 PM, Dave Mandelin wrote: On Monday, March 4, 2013 5:15:56 AM UTC-8, Jim Mathies wrote: For metrofx we’ve been working on getting omtc and apzc running in the browser. One of the things we need to be able to do is run performance tests that tell us whether or not the work we’re doing is having a positive effect on perf. We currently don’t have automated tests up and running for metrofx and talos is even farther off. So to work around this I’ve been putting together some basic perf tests I can use to measure performance using the mochitest framework. I’m wondering if this might be a useful answer to our perf tests problems long term. I think this is an incredibly interesting proposal, and I'd love to see something like it go forward. Detailed reactions below. Putting together talos tests is a real pain. 
You have to write a new test using the talos framework (which is a separate repo from mc), test the test to be sure it’s working, file rel eng bugs on getting it integrated into talos test runs, populated in graph server, and tested via staging to be sure everything is working right. Overall the overhead here seems way too high. Yup. And that's a big problem. Not only does this make your life harder, it makes people not do as much performance testing as they otherwise might. The JS team has had the experience that adding a new way of creating correctness tests incredibly easy (with *zero* overhead in the common case) really helped get more tests written and used. So I think it would be great to make it a lot easier to write perf tests. Maybe we should consider changing this system so devs can write performance tests that suit their needs that are integrated into our main repo? Basically: 1) rework graphs server to be open ended so that it can accept data from test runs within our normal test frameworks. IIUC, something like this is a key requirement: letting any perf test feed into the reporting system. People have pointed out that the Talos tests run on selected machines, so the perf tests should probably run on them as well, rather than on the correctness test machines. But that's only a small change to the basic idea, right? 2) develop of test module that can be included in tests that allows test writers to post performance data to graph server. Does that mean a mochitest module? This part seems optional, although certainly useful. Some tests will require non-mochitest frameworks. I believe jmaher did some work to get in-browser standard JS benchmarks running automatically and reporting to graph-server. I'm curious how that would fit in with this idea--would the test module help at all, or could there be some other kind of more general module maybe, so that even things like standard benchmarks can be self-serve? 
>> 3) come up with a good way to manage the life cycle of active perf
>> tests so graph server doesn’t become polluted.
>
> :-) How about getting an owner optionally listed for new tests, and then
> removing tests if no one is looking at them (according to web server
> logs) and there is no owner of record, or the owner doesn't say the
> tests are still needed?
>
>> 4) port existing talos tests over to the mochitest framework.
>>
>> 5) drop talos.
>
> This seems like it's in the line of "fix Talos". I'm not sure if this
> particular 4+5 is the right way to go, but the idea has some merit. To
> the extent that people don't pay attention to Talos, it seems we really
> don't need to do anything with it. If people are paying attention to and
> taking care of performance in their area, then we're covered. To take
> the example I happen to know best, the JS team uses AWFY to track JS
> performance on standard benchmarks and additional tests they've decided
> are useful. So Talos is not needed to track JS performance. Having all
> the features of the new graph server does sound pretty cool, though. It
> appears that there are a few areas that are only covered by Talos for
> now, though. I think in that category we have warm startup time
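[Editor's note: the statistics concern raised at the top of this message, that perf distributions are often non-normal and multi-modal, can be illustrated with a short sketch. All the numbers below are invented for illustration; they are not real Talos data.]

```python
import statistics

# Hypothetical frame-time samples in ms. The baseline is bimodal: most
# frames are fast, but a second mode (say, periodic GC pauses) pulls the
# mean upward. These numbers are made up purely for illustration.
baseline = [16, 17, 16, 18, 17, 16, 50, 52, 17, 16, 51, 18]
candidate = [16, 16, 17, 17, 16, 18, 17, 16, 17, 18, 16, 17]

def summarize(samples):
    # The mean is dragged toward the slow mode; the median tracks the
    # typical frame, so the two can tell very different stories.
    return statistics.mean(samples), statistics.median(samples)

base_mean, base_median = summarize(baseline)
cand_mean, cand_median = summarize(candidate)

# Comparing means suggests a dramatic improvement, while the medians show
# the typical frame barely changed: the real difference is in the slow
# tail. A single "the number went down" comparison hides this.
print("baseline:  mean=%.2f median=%.1f" % (base_mean, base_median))
print("candidate: mean=%.2f median=%.1f" % (cand_mean, cand_median))
```

This is exactly why a shared statistical basis matters: naive mean-to-mean comparisons on multi-modal data will both miss real regressions and flag phantom ones.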
Re: proposal: replace talos with inline tests
On Monday, March 4, 2013 5:42:39 AM UTC-8, Ed Morley wrote:
> (CCing auto-to...@mozilla.com)
>
> jmaher and jhammel will be able to comment more on the talos specifics,
> but a few thoughts off the top of my head: It seems like we're
> conflating multiple issues here:
>
> 1) [talos] is a separate repo from mc

And also 1a) Talos itself is a big pain for developers to use and debug
regressions in, not to mention add tests to, which they basically don't.
It seems that some of this may have changed recently, especially around
using the new framework--I haven't used it in a while. I think Talos
still does fail on creating tests, though, because lots of things just
don't fit its assumptions.

> 2) [it's a hassle to] test the test to be sure it’s working
>
> 3) [it's a hassle to get results] populated in graph server
>
> 4) [we need to] come up with a good way to manage the life cycle of
> active perf tests so graph server doesn’t become polluted
>
> Switching from the talos harness to mochitest doesn't fix #2 (we still
> have to test, and I don't see how it magically becomes any easier
> without extra work - that could have been applied to talos instead) or
> #3/#4 (orthogonal problem). It also seems like a brute force way of
> fixing #1 (we could just check talos into mozilla-central).

I think that part was mostly supposed to address (1a).

> Instead, I think we should be asking:
>
> 1) Is the best test framework for performance testing: [a] talos (with
> improvements), [b] mochitest (with a significant amount of work to make
> it compatible), or [c] a brand new framework?

I think that question doesn't have one answer. For JS, it's clearly
something else, but it's not even really a framework--it's just running
standard benchmarks. For other areas, there are likely different answers.
That's why I was so excited about the self-serve idea. (Interestingly, I
got schooled on this subject in a similar vein recently on bug tracking.
:-) )

> 2) Regardless of framework used, would checking it into mozilla-central
> improve dev workflow enough to outweigh the downsides (see bug 787200
> for history on that discussion)?

Thanks for the bug link. It seems like putting Talos itself into m-c has
significant disadvantages. I'm not sure what to do with other/new perf
tests.

> 3) Regardless of framework used, how can we make the
> development/testing/staging cycle less painful?

I liked the original proposal a lot for this.

> 4) Regardless of framework used, who should be responsible for ensuring
> we actively prune performance tests that are no longer relevant?

I gave an idea for how to do this in my reply to the original proposal. I
didn't say who would do it, but I was assuming the maintainers/operators
of graph-server, with the notion that they would be highly empowered to
remove anything that no one asked them to keep or that didn't otherwise
have a well-documented, easily understood rationale.

Dave

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
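[Editor's note: the "self-serve" reporting module from the original proposal's point 2 could be sketched as below. Everything here is hypothetical: the endpoint URL, payload fields, and function names are invented for illustration and do not reflect graph server's (or datazilla's) actual submission API.]

```python
import json
import urllib.request

# Hypothetical submission endpoint -- the real graph server / datazilla
# API almost certainly differs; this only sketches the shape a shared
# "post perf data" helper could take.
GRAPH_SERVER_URL = "https://graphs.example.org/api/collect"

def build_perf_payload(suite, test_name, value, revision):
    """Package a single measurement the way a perf test helper might."""
    return {
        "suite": suite,
        "test": test_name,
        "value": value,
        "revision": revision,
    }

def post_perf_result(payload, url=GRAPH_SERVER_URL):
    """POST the measurement as JSON (no auth or retries in this sketch)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

# A test would call something like this after measuring; suite and test
# names here are made up.
payload = build_perf_payload("metrofx-scroll", "apzc_fps", 58.3, "abc123")
print(json.dumps(payload, sort_keys=True))
```

The point of such a module is that any test, in any harness, can emit a measurement without rel-eng involvement; the server side just needs to accept an open-ended payload.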
Re: proposal: replace talos with inline tests
On Monday, March 4, 2013 5:17:29 PM UTC-8, Gregory Szorc wrote:
> On 3/4/13 5:09 PM, Dave Mandelin wrote:
>> We already don't back out changes for regressing a benchmark like we
>> back them out for regressing tests.
>
> I think this is at least partially because of a general sentiment that
> not all of our benchmarks correlate strongly to what they're trying to
> measure.

I know this has been a hot topic lately. I think getting more clarity on
this would be great, *if* of course we could have an answer that was both
operationally beneficial and clear, which seems to be incredibly
difficult. But this thread gives me a new idea. If each test run in
automation had an owner (as I suggested elsewhere), how about also making
the owners responsible for informing the sheriffs about what to do in
case of regression? If the owners know the test is reliable and measures
something important, they can ask for monitoring and presumptive backout.
If not, they can ask sheriffs to ignore the test, inform and coordinate
with the owning team, inform the landing person only, or take some other
action.

> This should be annotated in the tests themselves, IMO. We could even
> have said annotation influence the color on TBPL.

I like it. We would need to make sure the annotations reflect active
consideration by the test owners, but I suppose failures are likely to
self-correct.

> IMO we should be focusing on lessening the burden on the sheriffs and
> leaving them to focus on real problems.

Absolutely.

Dave
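[Editor's note: the in-test annotation idea above could look roughly like this sketch: a small metadata block naming an owner and a regression policy that sheriffs (or TBPL) could act on. The key names and policy values are invented for illustration; no such manifest format actually exists.]

```python
# Hypothetical regression policies a test owner could declare, matching
# the options discussed above: presumptive backout, owner notification,
# or "sheriffs should ignore this test".
PERF_POLICIES = {"backout", "notify-owner", "ignore"}

def parse_annotation(lines):
    """Parse simple 'key = value' annotation lines and validate the
    declared regression policy."""
    meta = {}
    for line in lines:
        key, _, value = line.partition("=")
        meta[key.strip()] = value.strip()
    policy = meta.get("on-regression")
    if policy not in PERF_POLICIES:
        raise ValueError("unknown regression policy: %r" % policy)
    return meta

# A made-up annotation block as it might sit at the top of a perf test.
annotation = [
    "owner = jmathies@example.com",
    "on-regression = notify-owner",
]
meta = parse_annotation(annotation)
print(meta["owner"], "->", meta["on-regression"])
```

Validating the policy at parse time is what would keep the annotations honest: a test with a missing or stale policy fails fast rather than silently defaulting to sheriff guesswork.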