On Sun, Jan 15, 2012 at 11:39 PM, Henry Robinson <[email protected]> wrote:
> Hi -
>
> The unit tests are taking longer and longer to run, particularly locally. I
> was poking about looking for some easy wins, and I noticed that a lot of
> the time is spent waiting for servers to come up, which is heavily
> dependent on the tick time. Lo and behold, dropping the tick time on (for
> example) QuorumPeerMainTest from 4s to 100ms made the test suite quicker by
> 30s.
>
> On builds.apache.org it's not a great idea to reduce the tick time too far
> because it generally runs on more contended hardware so timeouts get hit,
> but what if we just increase the session expiration time commensurately? We
> could set a 500ms tick time with a 30s (or more) max session expiration
> time. Latencies due to waiting for servers to start should be lower, but
> the tests should remain as stable.
>
> Any thoughts? Any other ways we can tighten up the test suite runtime?

I'd be concerned that we were testing with a different setting than most
users set. Would we be more or less likely to find issues by setting this
lower?

Re "other ways":

In the past I've found that test time reductions could be had by looking at
the longest running tests for flaws. Often a test will set a session timeout
of 30 seconds and wait for expiration, or sleep for some long/unnecessary
period of time. I'll typically refactor the test to improve the runtime. In
past releases I've made significant improvements using this method (perhaps
mined out?).

Another option is to restart the server(s) less frequently. This can be done
by starting the service once for all tests in a class, rather than for each
test method. (non-optimal though)

Others would probably point out that what we call "unit tests" are pretty
much system tests and should be moved out. That seems unlikely at this time
however.

Given that tests typically increase in scope (and time) and not decrease, we
might want to consider moving to the approach that Pig and some other
projects have. They have test targets that run a subset of the test suite.
For example, in Pig "test" takes 6-8 hrs, however they have a "test-commit"
which only takes 20 min or so. We could do similar for ZK. This is easy to do
using "exclude files" (see Pig's build.xml).

IMO long term we should categorize our tests, asking people to run
"test-commit" (short subset) prior to committing, whereas "test" (full suite)
would run as part of the patch testing, nightly testing, release testing,
etc.

Patrick
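
A note on the 500ms tick / 30s session idea from the quoted message: the server clamps a client's requested session timeout to the range [minSessionTimeout, maxSessionTimeout], and those default to 2x and 20x the tick time when not set. So a 500ms tick would cap sessions at 10s unless maxSessionTimeout is raised explicitly. Below is a minimal sketch of that arithmetic only. The class and method names are hypothetical, and treating -1 as "not configured" is an assumption of the sketch; it is not the server's actual code.

```java
// Hypothetical helper illustrating how tick time bounds the session timeout.
// Not ZooKeeper source; names and the -1 "unset" convention are assumptions.
class SessionTimeoutSketch {

    // Returns the session timeout a client would end up with, in milliseconds.
    static int negotiate(int requestedMs, int tickTimeMs,
                         int minSessionTimeoutMs, int maxSessionTimeoutMs) {
        // Unset bounds fall back to the documented defaults: 2x and 20x tickTime.
        int min = (minSessionTimeoutMs == -1) ? tickTimeMs * 2 : minSessionTimeoutMs;
        int max = (maxSessionTimeoutMs == -1) ? tickTimeMs * 20 : maxSessionTimeoutMs;
        // Clamp the client's request into [min, max].
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        // 500ms tick, default bounds: a 30s request is cut down to 20 * 500ms.
        System.out.println(negotiate(30000, 500, -1, -1));
        // 500ms tick, max raised to 30s: the 30s request is honoured.
        System.out.println(negotiate(30000, 500, -1, 30000));
        // 4s tick, default bounds: 30s already fits inside [8s, 80s].
        System.out.println(negotiate(30000, 4000, -1, -1));
    }
}
```

If that is right, any test that relies on a 30s session with a lowered tick time would also need maxSessionTimeout set in its server config, otherwise sessions could expire sooner than the test expects on slow build hosts.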
