I'm glad you like the factory idea - it's the only thing I can think of that keeps the same test infrastructure (i.e., doesn't completely rewrite everything) but still addresses a sizable performance problem. I'm working on a prototype of it now; I'll attach it to a Jira.
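
To make the ref-counting part of the idea concrete, here is a minimal sketch. It is not the prototype itself, just an illustration under some assumptions: `SharedCluster`, `SharedClusterFactory`, `acquire()`, `release()` and `shutdownIfIdle()` are hypothetical names, and `SharedCluster` is a stand-in interface so the sketch runs without HBase on the classpath. A real adapter around HBaseTestingUtility would also have to deal with the checked exceptions that startMiniCluster() and shutdownMiniCluster() throw. The non-activity detection is left out; something else (a timer, or an end-of-run hook) would have to call `shutdownIfIdle()`.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the expensive mini-cluster handle (not an HBase type). */
interface SharedCluster {
    void start();     // expensive in practice, so it should happen once
    void shutdown();  // likewise expensive
}

/** Hypothetical ref-counting factory that hands the same started instance to every caller. */
final class SharedClusterFactory<T extends SharedCluster> {
    private final Supplier<T> creator;
    private T instance;    // null until the first acquire(), and again after shutdown
    private int refCount;  // number of callers currently holding the instance

    SharedClusterFactory(Supplier<T> creator) {
        this.creator = creator;
    }

    /** Returns the shared instance, starting it only if it is not already running. */
    synchronized T acquire() {
        if (instance == null) {
            T created = creator.get();
            created.start();
            instance = created;
        }
        refCount++;
        return instance;
    }

    /** Gives the instance back. Deliberately does NOT shut it down, so the next test can reuse it. */
    synchronized void release() {
        if (refCount == 0) {
            throw new IllegalStateException("release() called without a matching acquire()");
        }
        refCount--;
    }

    /** Shuts the instance down only if nobody holds it; returns true if a shutdown happened. */
    synchronized boolean shutdownIfIdle() {
        if (instance == null || refCount > 0) {
            return false;
        }
        T toStop = instance;
        instance = null;
        toStop.shutdown();
        return true;
    }
}
```

In a test, setUp() would call `acquire()` and tearDown() would call `release()`, so consecutive tests in the same JVM pay the startup cost once. The choice to keep the instance alive at a ref-count of zero is what makes that work for tests that run one after another.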

On 9/19/11 4:06 PM, "Jesse Yates" <jesse.k.ya...@gmail.com> wrote:

>On Mon, Sep 19, 2011 at 12:18 PM, Doug Meil
><doug.m...@explorysmedical.com> wrote:
>
>> I'm not too familiar with the Maven Failsafe plugin, but I've been
>> reviewing the timings of some 'client' unit tests and where the unit
>> test framework spends its time...
>>
>> Anything that uses HBaseTestingUtility and does something like...
>>
>>     protected void setUp() throws Exception {
>>       TEST_UTIL.startMiniCluster(1);
>>       TEST_UTIL.createTable(TABLENAME, HConstants.CATALOG_FAMILY);
>>     }
>>
>>     protected void tearDown() throws IOException {
>>       TEST_UTIL.shutdownMiniCluster();
>>     }
>>
>> ... will spend (at least on my laptop) about 10.7 seconds setting up the
>> cluster, and 7.3 seconds tearing down. Assuming that we aren't running
>> in a separate JVM for each test invocation, sharing the same instance of
>> HBaseTestingUtility would be an enormous benefit.
>>
>> In terms of startup costs... It's all right here...
>>
>>     this.hbaseCluster = new MiniHBaseCluster(c, numMasters, numSlaves);
>>
>>     // Don't leave here till we've done a successful scan of the .META.
>>     HTable t = new HTable(c, HConstants.META_TABLE_NAME);
>>     ResultScanner s = t.getScanner(new Scan());
>>     while (s.next() != null) {
>>       continue;
>>     }
>>
>> ... MiniHBaseCluster starts up via the init method in the constructor,
>> and the following code sits and spins until it's ready.
>
>Wow, I thought that might be happening. That is really rough.
>
>> As I see it the required actions are:
>>
>> 1) Re-use JVMs between test runs (if we are already doing this, then no
>> action)
>
>This is already built into maven when we run tests normally. This leads to
>some interesting situations for messing around with class loading/global
>variables; if you don't reset the environment, it can cause other
>(essentially non-broken) tests to fail, unless we are running things in
>forked mode.
>
>Forked mode could also cut down on the straight unit test time, but in my
>experience this has led to difficulty in figuring out what crashed. For
>some reason maven doesn't realize which tests failed, just that they did,
>in its final output (though it still stdouts "FAILURE").
>
>> 2) Re-use the same HBaseTestingUtility instance for all possible tests.
>>
>> I think #2 isn't "that bad" if we had a factory method to get/return
>> HBaseTestingUtility instance and internally this factory could do a
>> ref-count and auto-detect of non-activity, then it could shut down the
>> cluster behind the scenes.
>
>+1 on this idea.
>We should also look into adding more/less region servers whenever people
>request more. We could just leave it up for the entire span of the
>integration tests and then bring it down afterwards (to make it easier on
>ourselves).
>
>I'm pretty sure that we are running all the tests consecutively right now,
>rather than forking them out.
>
>-Jesse Yates