Re: [racket-dev] Performance Lab

2013-01-22 Thread Eli Barzilay
A few minutes ago, Robby Findler wrote:
 It would be fantastic to have this kind of help. Thank you for
 offering! As Neil said, you'll want to consult with Jay some to
 avoid duplication of work, but also Eli can help with sending out
 notifications when pushes to the repo happen.

We don't have something that sends out notifications -- instead, drdr
polls the push counter here:

http://git.racket-lang.org/plt.git/push-counter

However, it's easy to configure github to send an HTTP request (or an
email, or whatever) on each push to our repo.


 Performance testing is a place where I think our automated testing
 support is especially weak, too. Thanks again!

I think that drdr could use some work to be more robust in general
(the machine setup), and to work on other platforms (osx, windows).
Robustness is particularly important if it's going to send out
alerts when performance changes significantly.

And BTW, for performance, it would be nice to alert on all changes,
not just when times go up...

-- 
  ((lambda (x) (x x)) (lambda (x) (x x)))  Eli Barzilay:
http://barzilay.org/   Maze is Life!
_
  Racket Developers list:
  http://lists.racket-lang.org/dev


Re: [racket-dev] Performance Lab

2013-01-22 Thread Greg Hendershott
[Disclaimer: I'm not part of the core team, nor do I play one on TV.]

A few miscellaneous thoughts:

1. I love the idea of performance regression testing, with history
charts. (I read a great example of this a few months ago, but can't
dredge it up right now.)

2. I've wanted to do this on some of my own projects. It would be
awesome if the same tools used for Racket itself could be used by
individuals for their collections. (Maybe they'd have to provide their
own hardware or EC2 instances.)

2.5 Racket will be breaking up its distribution into packages. I think
this sorta relates to 2?

3. Recently I was impressed to see how Travis integrates with GitHub.
For example, if you make a pull request to a Linguist project, Travis
tests run automatically and the pass/fail result is posted in the pull
request thread. (AFAIK Travis isn't doing _performance_ regression
testing, but still.)

4. Similarly, it could be neat if Planet (1 and/or 2) could show a
Travis-like test result badge for a package. And even a performance
test history chart. We could see at a glance that a package has been
trending faster or slower; more information to evaluate it.


On Tue, Jan 22, 2013 at 12:25 AM, Curtis Dutton curtd...@gmail.com wrote:
 I've been using racket for about 4 years now. I use it for everything
 that I can and I love it. It is really an awesome system, and I just can't
 say THANKS enough to all of you for racket.

 That being said, I'd like to become more active with the development
 process. In a past life, I worked for Microsoft as a development tools
 engineer. Most of what I did was develop and operate large-scale automated
 testing systems for very large teams. I once had a room full of 500 or so
 machines at my disposal, which I tried hard to keep busy. I've maintained
 rolling build systems that included acceptance tests, a performance testing
 system, a stress testing system and a security fuzzing system.

 I'm not sure how people feel about automated systems like this; part of this
 email is just to see what people think. But used in the right way, they can
 shape and control the directions in which a project evolves.


 An example of the type of system that I'd like to see for racket would be a
 performance measuring system that works, in principle, like so...

 I have an example I'll use: I'm concerned about the racket/openssl transfer
 speeds.

 The test:

  Create two places: one with a client, one with a server.
  Establish an SSL session.
  Output a start-time event.
  Transfer 1MB of random data.
  Output an end-time event.
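The timing part of the steps above could be sketched roughly like this; the names and the stand-in workload are illustrative only, not an existing harness (a real test would push the bytes through an ssl client/server pair running in two places):

```racket
#lang racket
;; Rough sketch of the timing harness described above.
(define (time-transfer thunk)
  (define start (current-inexact-milliseconds)) ; "start time event"
  (thunk)                                       ; the transfer itself
  (define end (current-inexact-milliseconds))   ; "end time event"
  (- end start))

;; Stand-in workload: generate 1MB of random bytes.  The real test would
;; establish an SSL session between two places and transfer the bytes.
(define (fake-transfer)
  (define buf (make-bytes (* 1024 1024)))
  (for ([i (in-range (bytes-length buf))])
    (bytes-set! buf i (random 256))))

(define elapsed-ms (time-transfer fake-transfer))
(printf "transfer took ~a ms\n" elapsed-ms)
```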


 Now once I write that test and commit it, the performance system picks it
 up from the repository and runs it for every commit made thereafter. That
 establishes a baseline for the performance of that test. If a commit is
 made and suddenly that test takes longer, the system generates an alert. At
 that point, we either investigate to find out why the test slowed down and
 fix it, or, due to circumstances we can't control (which does happen), we
 tell the system that it's acceptable and to accept it as a new baseline.
 And of course, if there is a marked improvement, we send out a pat on the
 back too!

 Now as a user of this system, I can monitor the performance characteristics
 of racket that I care about. People can write tests just to track racket's
 performance over time, and catch unexpected regressions. They can also add
 these tests before they begin on a campaign of improving their pet
 measurements.


 That is the gist of the type of system I wish I had with racket.

 I can go more into how a stress test works, and perhaps fuzzing tests,
 etc...


 Now I'm willing to build it and I'm willing to host it with a number of
 machines. I have pieces and parts of code lying around and I already have a
 decent harness implementation that collects statistics about a racket
 process as it runs.


 What do you think? If we could have something like this, would you want it?
 (Does something like this exist already?) What would it look like? How would
 it work, etc.?


 I'd like to collect a list of desired tests that this system would monitor
 for us. If you already have code that you run on your own, even better!
 Detailed examples would be welcome, as I need to gather some ideas about
 what people would want to do with this thing.

 Racket is so awesome! I'd like to help improve it, and I think this is
 something that I can offer to help get us there.

 Thanks,
 Curt




Re: [racket-dev] Performance Lab

2013-01-22 Thread Matthias Felleisen

Hi Curtis, thanks for the offer. Setting up a performance test framework would 
be fantastic. We may not be ready right now, but I am sure that if someone 
builds it, they will come. -- Matthias

On Jan 22, 2013, at 12:25 AM, Curtis Dutton wrote:

 [...]




Re: [racket-dev] Performance Lab

2013-01-21 Thread Neil Toronto

On 01/21/2013 10:25 PM, Curtis Dutton wrote:

I've been using racket for about 4 years now. I use it for
everything that I can and I love it. It is really an awesome system, and
I just can't say THANKS enough to all of you for racket.

That being said, I'd like to become more active with the development
process. In a past life, I worked for Microsoft as a development tools
engineer. Most of what I did was develop and operate large-scale
automated testing systems for very large teams. I once had a room full
of 500 or so machines at my disposal, which I tried hard to keep busy.
I've maintained rolling build systems that included acceptance tests, a
performance testing system, a stress testing system and a security
fuzzing system.

I'm not sure how people feel about automated systems like this; part of
this email is just to see what people think. But used in the right way,
they can shape and control the directions in which a project evolves.


We feel very good about them, and we can always use a lot more.

On every push, a machine named DrDr (see drdr.racket-lang.org) rebuilds
all of Racket, including GUI stuff and tools, from the updated codebase,
and runs almost every file. Many of the tests are in a directory called
`tests' or in subdirectories of top-level collections, like `plot/tests'.
DrDr emails the pusher and any responsible parties if tests fail or a
file's output changes.


Most of these tests are standard, deterministic tests. Some are 
randomized tests (Typed Racket and the math library do this, and 
probably Redex). A small number are stress tests and performance tests.


Here's where I can't speak for everybody, just myself. It seems to me 
that most of the testing is ad-hoc, depending on the whims of the 
collection owners. It could be very nice to standardize most of it.



Now I'm willing to build it and I'm willing to host it with a number of
machines. I have pieces and parts of code lying around and I already
have a decent harness implementation that collects statistics about a
racket process as it runs.


Jay McCarthy owns DrDr, and runs it from our research lab. You'll want 
to talk to him about hardware.



I'd like to collect a list of desired tests that this system would
monitor for us. If you already have code that you run on your own, even
better! Detailed examples would be welcome, as I need to gather some
ideas about what people would want to do with this thing.


I've wanted more automated testing for `math/special-functions' and 
`math/distributions' exports, and performance testing for `math/array' 
and `math/matrix', but I haven't gotten around to doing it. I'm not sure 
what I'd use as a baseline for the latter two. For `math/distributions', 
I've got no idea how to reliably test a sampler, since there's always a 
nonzero probability that any computed statistic is wrong. Do you have 
experience with that?
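For what it's worth, the usual compromise is to make that nonzero failure probability explicit and tiny: draw many samples, then require a statistic to land within k standard errors of its true value. A sketch (the uniform sampler, n, and the 6-sigma bound are illustrative choices, not a prescription for the math library):

```racket
#lang racket
;; Check a sampler by bounding the false-alarm probability explicitly.
;; For uniform [0,1) the true mean is 1/2 and the variance is 1/12.
(define n 100000)
(define samples (build-list n (lambda (_) (random))))
(define sample-mean (/ (for/sum ([x (in-list samples)]) x) n))

(define std-err (sqrt (/ (/ 1.0 12.0) n))) ; sigma / sqrt(n)
(define k 6)                               ; 6-sigma: false alarm ~1e-9
(define sampler-ok? (< (abs (- sample-mean 1/2)) (* k std-err)))
```

The test can still fail when the sampler is correct, but only with a probability small enough (and known in advance) that an alert is almost certainly a real bug.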


Neil ⊥
