This looks cool! I ran this a few times:
This looks cool! I ran this a few times:

  ant test-core -Dtests.seed=0:0:0 -Dtests.cpus=20 -Dtests.directory=RAMDirectory -Dtests.codec=Lucene40

I fixed the seed & RAMDir to reduce variance...

  [junit4] Slave 16: 0.29 .. 24.65 = 24.36s
  [junit4] Slave 17: 0.36 .. 30.62 = 30.26s
  [junit4] Slave 18: 0.44 .. 30.84 = 30.41s
  [junit4] Slave 19: 0.50 .. 28.65 = 28.15s
  [junit4] Execution time total: 36.69s
  [junit4] Tests summary: 278 suites, 1550 tests, 3 ignored

  [junit4] Slave 16: 0.44 .. 29.61 = 29.17s
  [junit4] Slave 17: 0.55 .. 31.59 = 31.04s
  [junit4] Slave 18: 0.30 .. 25.85 = 25.54s
  [junit4] Slave 19: 0.31 .. 32.64 = 32.33s
  [junit4] Execution time total: 37.12s
  [junit4] Tests summary: 278 suites, 1550 tests, 3 ignored

  [junit4] Slave 16: 0.28 .. 25.70 = 25.42s
  [junit4] Slave 17: 0.23 .. 29.83 = 29.60s
  [junit4] Slave 18: 0.28 .. 27.50 = 27.22s
  [junit4] Slave 19: 0.37 .. 27.67 = 27.30s
  [junit4] Execution time total: 35.23s
  [junit4] Tests summary: 278 suites, 1550 tests, 1 failure, 3 ignored

  [junit4] Slave 16: 0.38 .. 28.99 = 28.61s
  [junit4] Slave 17: 0.41 .. 30.79 = 30.38s
  [junit4] Slave 18: 0.48 .. 30.05 = 29.57s
  [junit4] Slave 19: 0.35 .. 30.71 = 30.36s
  [junit4] Execution time total: 38.46s
  [junit4] Tests summary: 278 suites, 1550 tests, 3 ignored

  [junit4] Slave 16: 0.27 .. 29.56 = 29.29s
  [junit4] Slave 17: 0.44 .. 32.64 = 32.21s
  [junit4] Slave 18: 0.40 .. 31.99 = 31.60s
  [junit4] Slave 19: 0.27 .. 32.64 = 32.37s
  [junit4] Execution time total: 37.70s
  [junit4] Tests summary: 278 suites, 1550 tests, 3 ignored

Does the "Execution time total" include compilation, or is it just the actual test runtime?

Can this change run "across" the different groups of tests we have (core, modules/*, contrib/*, solr/*, etc.)? I found that to be a major bottleneck in the current "ant test"'s concurrency, i.e. we have a pinch point after each group of tests (all JVMs must finish before we can move on to the next group...), but I think fixing that in ant is going to be hard?
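(Aside for anyone following along: given a cache of per-suite execution times, "balancing the tests across JVMs" is presumably something like greedy longest-processing-time scheduling. A minimal illustrative sketch, not the actual junit4 runner code -- the class and method names here are made up:)

```java
import java.util.Arrays;

/**
 * Illustrative sketch (NOT the actual runner code): greedy
 * longest-processing-time (LPT) scheduling -- a standard way to balance
 * test suites across JVMs once per-suite times are known.
 */
public class BalanceSuites {

    /** Distributes suite durations over {@code jvms} workers; returns each worker's total load. */
    static double[] balance(double[] suiteSeconds, int jvms) {
        double[] load = new double[jvms];
        double[] sorted = suiteSeconds.clone();
        Arrays.sort(sorted); // ascending; we walk it backwards, longest suite first
        for (int i = sorted.length - 1; i >= 0; i--) {
            // Hand the next-longest suite to the currently least-loaded JVM.
            int least = 0;
            for (int w = 1; w < jvms; w++) {
                if (load[w] < load[least]) least = w;
            }
            load[least] += sorted[i];
        }
        return load;
    }

    public static void main(String[] args) {
        double[] load = balance(new double[] {8, 7, 6, 5, 4, 3, 2, 2}, 4);
        double max = 0, sum = 0;
        for (double t : load) { max = Math.max(max, t); sum += t; }
        // Wall-clock time is the busiest worker's load, not the sum --
        // which is why balancing narrows the spread between slaves.
        System.out.println("makespan=" + max + "s of " + sum + "s total work");
    }
}
```

The key point is that a per-group barrier ("pinch point") throws this balance away at every group boundary; feeding all groups' suites into one schedule avoids idle JVMs at the tail of each group.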
When I use the hacked-up Python test runner (runAllTests.py in luceneutil), running only core tests w/ RAMDir and the Lucene40 codec takes ~30 seconds; I think it's doing roughly the same thing as this change (balancing the tests across JVMs).

BUT: that's on current trunk, vs your git clone which is somewhat old by now... so it's an apples/pears comparison ;)

Mike McCandless

http://blog.mikemccandless.com

On Fri, Dec 30, 2011 at 3:49 PM, Robert Muir <[email protected]> wrote:
> On Fri, Dec 30, 2011 at 3:45 PM, Dawid Weiss
> <[email protected]> wrote:
>> Thanks Robert. Yes, the variation in certain suites is pretty large --
>> if you open the generated execution times cache you can see the
>> timings for each test suite. I've seen differences going into tens of
>> seconds depending on the seed (and the environment?). What are your
>> timings for ant-based splits? Roughly the same?
>>
>
> I think I got to the bottom of this. Depending upon your seed, 95% of
> the time a test gets "RAMDirectory" but 5% of the time it gets a
> file-system-backed implementation.
>
> Because of this, depending upon environment, test times swing wildly
> because of fsync(). For example, in the last nightly build we fsynced
> over 7,000 times in tests.
>
> This is really crazy and I want to prolong the life of my SSD: see my
> latest comment with a fix on LUCENE-3667. With that patch my times are
> no longer swinging wildly.
>
> (An easy way to see what I am talking about: just run tests with
> -Dtests.directory=MMapDirectory or something like that.)
>
> --
> lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> ---------------------------------------------------------------------
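(Editor's note: Robert's point about seed-dependent directory choice can be sketched as follows. This is a hypothetical illustration, not LuceneTestCase's actual code -- the method name and the 95/5 split mechanism are assumptions based on his description:)

```java
import java.util.Random;

/**
 * Illustrative sketch (NOT Lucene's test framework code): a seeded,
 * biased directory pick like the one described above -- ~95% of the
 * time a cheap in-memory RAMDirectory, ~5% of the time a
 * filesystem-backed directory whose fsync() calls can dominate a
 * suite's wall-clock time on some machines.
 */
public class DirectoryPick {

    static String pickDirectory(long seed) {
        Random r = new Random(seed);
        // Same seed -> same pick, so whether a run hits the slow
        // fsync-heavy path is decided entirely by the test seed.
        return r.nextInt(100) < 95 ? "RAMDirectory" : "NIOFSDirectory";
    }

    public static void main(String[] args) {
        int fsPicks = 0;
        for (long seed = 0; seed < 10_000; seed++) {
            if (!pickDirectory(seed).equals("RAMDirectory")) fsPicks++;
        }
        System.out.println(fsPicks + " / 10000 seeds hit the filesystem directory");
    }
}
```

This is why fixing the seed and forcing -Dtests.directory=RAMDirectory (as in the run above) removes most of the variance: it pins every suite to the fast path.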
