On Sun, Jan 15, 2012 at 11:39 PM, Henry Robinson <[email protected]> wrote:
> Hi -
>
> The unit tests are taking longer and longer to run, particularly locally. I
> was poking about looking for some easy wins, and I noticed that a lot of
> the time is spent waiting for servers to come up, which is heavily
> dependent on the tick time. Lo and behold, dropping the tick time on (for
> example) QuorumPeerMainTest from 4s to 100ms made the test suite quicker by
> 30s.
>
> On builds.apache.org it's not a great idea to reduce the tick time too far
> because it generally runs on more contended hardware so timeouts get hit,
> but what if we just increase the session expiration time commensurately? We
> could set a 500ms tick time with a 30s (or more) max session expiration
> time. Latencies due to waiting for servers to start should be lower, but
> the tests should remain as stable.
>
> Any thoughts? Any other ways we can tighten up the test suite runtime?

I'd be concerned that we would be testing with a different setting
than most users run with. Would we be more or less likely to find
issues by setting this lower?
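
For reference, here is how the tick time bounds the session timeout. This is a minimal sketch, assuming the server defaults of minSessionTimeout = 2 * tickTime and maxSessionTimeout = 20 * tickTime, with the client's requested timeout clamped into that range. The class and method names are hypothetical and are not ZooKeeper APIs.

```java
// Hypothetical helper, not part of ZooKeeper: models how a requested
// session timeout is clamped to bounds derived from the tick time.
final class SessionTimeoutBounds {

    // Assumed default lower bound: 2 ticks.
    static int defaultMin(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    // Assumed default upper bound: 20 ticks.
    static int defaultMax(int tickTimeMs) {
        return 20 * tickTimeMs;
    }

    // Clamp the client's requested timeout into [minMs, maxMs].
    static int negotiate(int requestedMs, int minMs, int maxMs) {
        return Math.max(minMs, Math.min(requestedMs, maxMs));
    }

    public static void main(String[] args) {
        int tick = 500;
        // With default bounds, a 30s request is capped at 20 ticks.
        System.out.println(negotiate(30000, defaultMin(tick), defaultMax(tick)));
        // With the max raised explicitly to 30s, the request is honored.
        System.out.println(negotiate(30000, defaultMin(tick), 30000));
    }
}
```

So under these assumptions a 500ms tick time only permits a 30s session timeout if the max session timeout is raised explicitly, which is one more way the test config would diverge from a default install.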

re "other ways":

In the past I've found that test time reductions could be had by
looking at the longest running tests for flaws. Often a test will set
a session timeout of 30 seconds and wait for expiration, or sleep for
some long/unnecessary period of time. I'll typically refactor the test
to improve the runtime. In past releases I've made significant
improvements using this method (perhaps it's mined out by now?)
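
One common shape for that kind of refactoring is to replace a fixed sleep with a poll that returns as soon as the condition holds. A minimal sketch, assuming nothing about the existing test helpers; PollingWait and waitFor are hypothetical names, not classes in the ZK test tree.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper: wait for a condition with a deadline instead of
// sleeping for a fixed, worst-case period.
final class PollingWait {

    // Returns true as soon as the condition holds, or false once
    // timeoutMs has elapsed (or the thread is interrupted).
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Never sleep past the deadline; sleep at least 1ms.
            long sleepMs = Math.min(pollMs,
                    Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition that becomes true after roughly 50ms.
        boolean met = waitFor(() -> System.currentTimeMillis() - start >= 50, 5000, 10);
        System.out.println("condition met: " + met);
    }
}
```

The timeout stays generous for contended hardware, but a fast machine only pays for the time the condition actually takes.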

Another option is to restart the server(s) less frequently. This can
be done by starting the service once for all tests in a class, rather
than for each test method. (non-optimal though)

Others would probably point out that what we call "unit tests" are
pretty much system tests and should be moved out. That seems unlikely
at this time however.

Given that tests typically increase in scope (and time) and not
decrease we might want to consider moving to the approach that Pig and
some other projects have. They have test targets that run a subset of
the test suite. For example, in Pig "test" takes 6-8 hrs, however they
have a "test-commit" which only takes 20 min or so. We could do
something similar for ZK. This is easy to do using "exclude files".
(see pig build.xml)
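
To illustrate the idea only (in practice the build file would do this filtering, not Java code), here is a minimal sketch of selecting a commit subset from an exclude list. The file format (one test class name per line, '#' for comments), the CommitTestSelector class, and the test names in main are all assumptions for illustration; they are not taken from Pig's build.xml or from any ZK exclude file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: derive the "test-commit" subset by removing the
// tests named in an exclude list from the full suite.
final class CommitTestSelector {

    // Parse exclude-file lines, skipping blank lines and '#' comments.
    static Set<String> parseExcludes(List<String> lines) {
        Set<String> excludes = new HashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            excludes.add(trimmed);
        }
        return excludes;
    }

    // Keep every test that is not excluded, preserving suite order.
    static List<String> select(List<String> allTests, Set<String> excludes) {
        List<String> selected = new ArrayList<>();
        for (String test : allTests) {
            if (!excludes.contains(test)) {
                selected.add(test);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Illustrative names only; which tests count as "slow" is hypothetical.
        List<String> all = Arrays.asList("FastTest", "SlowExpirationTest", "OtherTest");
        Set<String> excludes = parseExcludes(
                Arrays.asList("# long-running tests", "SlowExpirationTest", ""));
        System.out.println(select(all, excludes));
    }
}
```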

IMO long term we should categorize our tests, asking people to run
"test-commit" (short subset) prior to committing, whereas "test" (full
suite) would run as part of the patch testing, nightly testing,
release testing, etc.

Patrick
