I've pushed back on use of sleep in non-deterministic ways in the past. I think we do a reasonable job there; just grepping for sleep doesn't tell the whole story.
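To make the distinction concrete: a sleep inside a bounded polling loop is a different thing from a bare sleep followed by a check. Here is a rough Java sketch of the polling form, with a latch as the alternative when the code under test can signal completion. The class name PollingWait, the helper waitFor, and all the timeouts are made up for illustration; none of this is copied from our test code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the ZooKeeper test code.
final class PollingWait {

    /**
     * Polls the condition until it holds or timeoutMs elapses.
     * Returns true as soon as the condition is observed, false on timeout
     * or if the waiting thread is interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // fast path: no time wasted once x has succeeded
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // hard cap, so a broken test fails instead of hanging
            }
            long remainingMs = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                // Short enough not to waste time, long enough not to spin.
                Thread.sleep(Math.min(pollMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for "do x": an asynchronous action that finishes after ~200 ms.
        AtomicBoolean succeeded = new AtomicBoolean(false);
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            succeeded.set(true);
            done.countDown();
        });
        worker.start();

        // Bounded polling: generous cap (60 s), small poll interval (50 ms).
        boolean polled = waitFor(succeeded::get, 60_000, 50);
        System.out.println("polling observed success: " + polled);

        // Latch variant: no polling at all, still with a cap as a safety net.
        boolean latched = done.await(60, TimeUnit.SECONDS);
        System.out.println("latch observed success: " + latched);
    }
}
```

The point of the large cap is that a passing run returns as soon as the condition holds, while a failing run still terminates. The latch variant only works when the test can hook into the asynchronous action, which is not always possible.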

Where you run into issues is when you do

  x
  sleep(500)
  check x was success

Most of our use of sleep has migrated to

  do x
  for (1 to 120)   // or check elapsed time and cap at some large number
    sleep(500)     // make sufficiently small that you don't waste time
                   // waiting unnecessarily, but also not too short that you spin
    check x was success

unless we were able to make do without a time bound at all - sometimes we migrate to a latch or something.

Now, it's been a while since I reviewed the tests; new code might have added some bad checks again. It's a tough one to stamp out entirely.

Re tests taking too long, I can't seem to find the jira, but iirc Henry had created a jira around reducing the tick time for tests - that significantly reduced the setup time for quorum-based tests - a big part of overall overhead. We should probably categorize our tests and run a subset outside of a nightly "full test run".

Patrick

On Sun, May 3, 2015 at 9:49 PM, Raúl Gutiérrez Segalés <r...@itevenworks.net> wrote:

> Hi,
>
> On 3 May 2015 at 12:53, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>
> > (....)
> > 3. Tests are non-deterministic, such as by hard-coding a sleep time to
> > wait for an asynchronous action to complete. The solutions usually
> > involve providing hooks into lower-layer logic, such as to receive a
> > callback from the asynchronous action, so that the test can be
> > deterministic.
> >
>
> Indeed:
>
> ~/src/zookeeper-svn/src/java/test/org/apache/zookeeper (master) ✔ git grep -i 'sleep(' | wc -l
> 91
>
> Making runs shorter would be very helpful as well. Currently it just takes
> too long.
>
> Also, adding to what Patrick said, I'll take a closer look at the runs
> reported at:
>
> https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/
>
> to have a better grasp of what's going on. Thanks!
>
>
> -rgs
>