Re: [racket-dev] Performance Lab
A few minutes ago, Robby Findler wrote:
> It would be fantastic to have this kind of help. Thank you for
> offering! As Neil said, you'll want to consult with Jay some to avoid
> duplication of work, but also Eli can help with sending out
> notifications when pushes to the repo happen.

We don't have something that sends out notifications -- instead, drdr
polls the push counter here:

  http://git.racket-lang.org/plt.git/push-counter

However, it's easy to configure github to send an http request (or an
email, or whatever) on each push we make to our repo.

> Performance testing is a place where I think our automated testing
> support is especially weak, too. Thanks again!

I think that drdr could use some work to be more robust in general (the
machine setup), and to work on other platforms (osx, windows). The
robustness is particularly important if it's going to send out alerts
when performance changes much. And BTW, for performance, it would be
nice to alert on all changes, not just when times go up...

-- 
((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
http://barzilay.org/                             Maze is Life!

_________________________________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev
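The polling setup Eli describes can be sketched in a few lines of
Racket. This is an illustration only: it assumes the counter URL serves
a bare integer, and the `on-new-push` callback is a hypothetical hook
standing in for whatever kicks off a build and test run.

```racket
#lang racket/base
;; Sketch of drdr-style polling: fetch the push counter, and when it
;; increases, invoke a callback.  `on-new-push` is a hypothetical hook.
(require net/url racket/port racket/string)

(define counter-url
  (string->url "http://git.racket-lang.org/plt.git/push-counter"))

;; Parse the counter body (assumed to be a bare integer, possibly
;; followed by a newline) into a number.
(define (parse-push-count s)
  (string->number (string-trim s)))

(define (current-push-count)
  (parse-push-count (port->string (get-pure-port counter-url))))

;; Poll forever; call on-new-push with the new count when it changes.
(define (poll-pushes on-new-push #:interval [secs 60])
  (let loop ([last (current-push-count)])
    (sleep secs)
    (define now (current-push-count))
    (when (> now last)
      (on-new-push now))
    (loop now)))

;; Example (runs forever, so not invoked here):
;; (poll-pushes (lambda (n) (printf "push ~a -- start a test run\n" n)))
```

A github-hosted repo could replace the polling loop entirely with a
push-triggered HTTP request, as Eli notes.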
Re: [racket-dev] Performance Lab
[Disclaimer: I'm not part of the core team, nor do I play one on TV.]

A few miscellaneous thoughts:

1. I love the idea of performance regression testing, with history
charts. (I read a great example of this a few months ago, but can't
dredge it up right now.)

2. I've wanted to do this on some of my own projects. It would be
awesome if the same tools used for Racket itself could be used by
individuals for their collections. (Maybe they'd have to provide their
own hardware or EC2 instances.)

2.5. Racket will be breaking up its distribution into packages. I think
this sorta relates to 2?

3. Recently I was impressed to see how Travis integrates with GitHub.
For example, if you make a pull request to the Linguist project, the
Travis tests run automatically and the pass/fail result is posted in
the pull request thread. (AFAIK Travis isn't doing _performance_
regression testing, but still.)

4. Similarly, it could be neat if Planet (1 and/or 2) could show a
Travis-like test-result badge for a package -- and even a
performance-test history chart. We could see at a glance whether a
package has been trending faster or slower; more information with which
to evaluate it.

On Tue, Jan 22, 2013 at 12:25 AM, Curtis Dutton curtd...@gmail.com wrote:
> I've been using racket now for about 4 years. I use it for everything
> that I can and I love it. It is really an awesome system, and I just
> can't say THANKS enough to all of you for racket.
>
> That being said, I'd like to become more active with the development
> process. In a past life, I worked for Microsoft as a development
> tools engineer. Most of what I did was develop and operate
> large-scale automated testing systems for very large teams. I once
> had a room full of 500 or so machines at my disposal, which I tried
> hard to keep busy. I've maintained rolling build systems that
> included acceptance tests, a performance testing system, a stress
> testing system, and a security fuzzing system.
> I'm not sure how people feel about automated systems like this; part
> of this email is just to see what people think. But used in the right
> way, they can shape and guide the directions in which a project
> evolves.
>
> An example of the type of system that I'd like to see for racket
> would be a performance measuring system that works in principle like
> so. Here's an example I'll use: I'm concerned about the
> racket/openssl transfer speeds.
>
> The test:
> * Create 2 places, one with a client and one with a server.
> * Establish an ssl session.
> * Output a start-time event.
> * Transfer 1MB of random data.
> * Output an end-time event.
>
> Now once I write that test and commit it, the performance system
> picks it up from the repository, and it runs that test for every
> commit that is made thereafter. That establishes a baseline for the
> performance of that test. If a commit is made and suddenly that test
> takes longer, it generates an alert. At which point, we either
> investigate to find out why the test slowed down and fix it, or --
> due to circumstances we can't control (which does happen) -- we tell
> the system that it's acceptable and to accept it as a new baseline.
> And of course if there is a marked improvement, we send out a pat on
> the back too!
>
> Now as a user of this system, I can monitor the performance
> characteristics of racket that I care about. People can write tests
> just to track racket's performance over time and catch unexpected
> regressions. They can also add these tests before they begin a
> campaign of improving their pet measurements. That is the gist of the
> type of system I wish I had with racket. I can go more into how a
> stress test works, and perhaps fuzzing tests, etc.
>
> Now I'm willing to build it, and I'm willing to host it with a number
> of machines. I have pieces and parts of code lying around, and I
> already have a decent harness implementation that collects statistics
> about a racket process as it runs.
>
> What do you think? If we could have something like this, would you
> want it?
> (Does something like this exist already?) What would it look like?
> How would it work, etc.?
>
> I'd like to collect a list of desired tests that this system would
> monitor for us. If you already have code that you run on your own,
> even better! Detailed examples would be welcome, as I need to gather
> some ideas about what people would want to do with this thing.
>
> Racket is so awesome! I'd like to help improve it, and I think this
> is something that I can offer to help get us there.
>
> Thanks,
> Curt
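The workflow Curtis describes above -- time a test, compare against a
stored baseline, alert on regressions, and celebrate improvements --
can be illustrated with a small Racket sketch. Everything here is
hypothetical: the 20% tolerance, the in-memory baseline table, and the
symbolic results are made-up illustrative choices, not part of any
existing system.

```racket
#lang racket/base
;; Sketch of a baseline-comparison harness: time a test thunk, compare
;; against the recorded baseline, and report a regression (or an
;; improvement) when the timing drifts past a tolerance.

(define baselines (make-hash))  ; test name -> baseline milliseconds

;; Run the thunk once and return elapsed wall-clock milliseconds.
(define (time-test thunk)
  (define start (current-inexact-milliseconds))
  (thunk)
  (- (current-inexact-milliseconds) start))

(define (check-against-baseline name thunk #:tolerance [tol 0.20])
  (define ms (time-test thunk))
  (define base (hash-ref baselines name #f))
  (cond
    [(not base)  ; first run: establish the baseline
     (hash-set! baselines name ms)
     'baseline-set]
    [(> ms (* base (+ 1 tol))) 'regression]   ; alert: slower
    [(< ms (* base (- 1 tol))) 'improvement]  ; pat on the back
    [else 'ok]))
```

A real system would persist baselines per commit and per machine, and
would average several runs to smooth out timing noise before deciding
to alert.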
Re: [racket-dev] Performance Lab
Hi Curtis, thanks for the offer. Setting up a performance test
framework would be fantastic. We may not be ready right now, but I am
sure that if someone builds it, they will come.

-- Matthias

On Jan 22, 2013, at 12:25 AM, Curtis Dutton wrote:
> [...Curtis's original message, quoted in full in the earlier reply,
> snipped here...]
Re: [racket-dev] Performance Lab
On 01/21/2013 10:25 PM, Curtis Dutton wrote:
> I've been using racket now for about 4 years. I use it for everything
> that I can and I love it. It is really an awesome system, and I just
> can't say THANKS enough to all of you for racket.
>
> [...]
>
> I'm not sure how people feel about automated systems like this; part
> of this email is just to see what people think. But used in the right
> way, they can shape and guide the directions in which a project
> evolves.

We feel very good about them, and we can always use a lot more. Every
push, a machine named DrDr (see drdr.racket-lang.org) rebuilds all of
Racket, including GUI stuff and tools, from the updated codebase, and
runs almost every file. Many of them are in a directory called tests
and in subdirectories of top-level collections like plot/tests. DrDr
emails the pusher and any responsible parties if tests fail or a
file's output changes.

Most of these tests are standard, deterministic tests. Some are
randomized tests (Typed Racket and the math library do this, and
probably Redex). A small number are stress tests and performance tests.

Here's where I can't speak for everybody, just myself. It seems to me
that most of the testing is ad hoc, depending on the whims of the
collection owners. It would be very nice to standardize most of it.

> Now I'm willing to build it, and I'm willing to host it with a number
> of machines. I have pieces and parts of code lying around, and I
> already have a decent harness implementation that collects statistics
> about a racket process as it runs.

Jay McCarthy owns DrDr and runs it from our research lab. You'll want
to talk to him about hardware.

> I'd like to collect a list of desired tests that this system would
> monitor for us. If you already have code that you run on your own,
> even better! Detailed examples would be welcome, as I need to gather
> some ideas about what people would want to do with this thing.

I've wanted more automated testing for `math/special-functions' and
`math/distributions' exports, and performance testing for `math/array'
and `math/matrix', but I haven't gotten around to doing it. I'm not
sure what I'd use as a baseline for the latter two. For
`math/distributions', I've got no idea how to reliably test a sampler,
since there's always a nonzero probability that any computed statistic
is wrong. Do you have experience with that?

Neil ⊥
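On Neil's question about testing a sampler: one standard trick is to
use a statistical acceptance test with a very wide margin, and retry
once on failure, so the false-alarm probability becomes the square of
an already-tiny number. Below is a hedged sketch for a sampler over
[0,1), using per-bin counts; the margin (6 standard deviations) and the
sample sizes are illustrative choices, and `sample-thunk` stands in for
whatever sampler is under test -- none of this is from the math
library's actual test suite.

```racket
#lang racket/base
;; Statistical acceptance test for a [0,1) sampler.  Each bin's count
;; is Binomial(n, 1/bins); a correct sampler keeps every count within
;; 6 standard deviations of its expectation except with negligible
;; probability, and the single retry squares that probability.
(require racket/math)

;; Draw n samples and count how many land in each of `bins` equal bins.
(define (bin-counts sample-thunk n bins)
  (define counts (make-vector bins 0))
  (for ([_ (in-range n)])
    (define i (min (sub1 bins)
                   (exact-floor (* (sample-thunk) bins))))
    (vector-set! counts i (add1 (vector-ref counts i))))
  counts)

(define (uniform-sampler-plausible? sample-thunk
                                    #:n [n 10000] #:bins [bins 20])
  (define p (/ 1.0 bins))
  (define expected (* n p))
  (define sd (sqrt (* n p (- 1 p))))
  (define (one-run-ok?)
    (for/and ([c (in-vector (bin-counts sample-thunk n bins))])
      (< (abs (- c expected)) (* 6 sd))))
  ;; Retry once: a correct sampler essentially never fails twice.
  (or (one-run-ok?) (one-run-ok?)))
```

For non-uniform distributions the same idea applies after pushing
samples through the distribution's cdf, which should yield uniform
values; a goodness-of-fit test (chi-squared or Kolmogorov-Smirnov) at a
tiny significance level plays the same role as the 6-sd margin here.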