Also, any test that extends HBaseClusterTestCase needs to be refactored. This class is already deprecated, and it won't be able to take advantage of such a factory because it does the cluster setup differently.
E.g.,...

public class TestGetRowVersions extends HBaseClusterTestCase {


On 9/19/11 4:20 PM, "Jesse Yates" <jesse.k.ya...@gmail.com> wrote:

>Cool.
>
>I'm going to try to get a patch up of the failsafe and package splits
>tonight.
>
>On Mon, Sep 19, 2011 at 1:18 PM, Doug Meil
><doug.m...@explorysmedical.com> wrote:
>
>> I'm glad you like the factory idea - it's the only thing I can think of
>> to keep the same test infrastructure (i.e., not completely re-write
>> everything), but still address a sizable performance problem. I'm
>> working on a prototype of it now, I'll attach it to a Jira.
>>
>> On 9/19/11 4:06 PM, "Jesse Yates" <jesse.k.ya...@gmail.com> wrote:
>>
>> >On Mon, Sep 19, 2011 at 12:18 PM, Doug Meil
>> ><doug.m...@explorysmedical.com> wrote:
>> >
>> >> I'm not too familiar with the Maven Failsafe plugin, but I've been
>> >> reviewing the timings of some 'client' unit tests and where the unit
>> >> test framework spends its time...
>> >>
>> >> Anything that uses HBaseTestingUtility and does something like...
>> >>
>> >>   protected void setUp() throws Exception {
>> >>     TEST_UTIL.startMiniCluster(1);
>> >>     TEST_UTIL.createTable(TABLENAME, HConstants.CATALOG_FAMILY);
>> >>   }
>> >>
>> >>   protected void tearDown() throws IOException {
>> >>     TEST_UTIL.shutdownMiniCluster();
>> >>   }
>> >>
>> >> ... will spend (at least on my laptop) about 10.7 seconds setting up
>> >> the cluster, and 7.3 seconds tearing down. Assuming that we aren't
>> >> running in a separate JVM for each test invocation, sharing the same
>> >> instance of HBaseTestingUtility would be an enormous benefit.
>> >>
>> >> In terms of startup costs... It's all right here...
>> >>
>> >>   this.hbaseCluster = new MiniHBaseCluster(c, numMasters, numSlaves);
>> >>
>> >>   // Don't leave here till we've done a successful scan of the .META.
>> >>   HTable t = new HTable(c, HConstants.META_TABLE_NAME);
>> >>   ResultScanner s = t.getScanner(new Scan());
>> >>   while (s.next() != null) {
>> >>     continue;
>> >>   }
>> >>
>> >> ... MiniHBaseCluster starts up via the init method in the
>> >> constructor, and the following code sits and spins until it's ready.
>> >
>> >Wow, I thought that might be happening. That is really rough.
>> >
>> >> As I see it the required actions are:
>> >>
>> >> 1) Re-use JVMs between test runs (if we are already doing this, then
>> >> no action)
>> >
>> >This is already built into maven when we run tests normally. This leads
>> >to some interesting situations for messing around with class
>> >loading/global variables; if you don't reset the environment, it can
>> >cause other (essentially non-broken) tests to fail. Unless we are
>> >running things in forked mode.
>> >
>> >Forked mode could also cut down on the straight unit test time, but in
>> >my experience this has led to difficulty in figuring out what crashed.
>> >For some reason maven doesn't realize which tests failed, just that
>> >they did, in its final output (though it still stdouts "FAILURE").
>> >
>> >> 2) Re-use the same HBaseTestingUtility instance for all possible
>> >> tests.
>> >>
>> >> I think #2 isn't "that bad" if we had a factory method to get/return
>> >> an HBaseTestingUtility instance and internally this factory could do
>> >> a ref-count and auto-detect of non-activity, then it could shut down
>> >> the cluster behind the scenes.
>> >
>> >+1 on this idea.
>> >We should also look into adding more/fewer region servers whenever
>> >people request more. We could just leave it up for the entire span of
>> >the integration tests and then bring it down afterwards (to make it
>> >easier on ourselves).
>> >
>> >I'm pretty sure that we are running all the tests consecutively right
>> >now, rather than forking them out.
>> >
>> >-Jesse Yates
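For reference, here is a rough sketch of the get/return factory with a ref-count discussed in the thread above. It is not the prototype mentioned for the Jira and does not use any real HBase API: FakeMiniCluster is a made-up stand-in for HBaseTestingUtility, and SharedClusterFactory, acquire(), release() and shutdownIfIdle() are hypothetical names chosen only for illustration. It assumes all tests run in the same JVM.

```java
/** Hypothetical stand-in for HBaseTestingUtility; it only counts start/stop calls. */
class FakeMiniCluster {
    static int starts = 0;
    static int stops = 0;

    void start() { starts++; }      // would be the expensive mini-cluster startup
    void shutdown() { stops++; }    // would be the expensive mini-cluster teardown
}

/** Hypothetical ref-counting factory that hands out one shared cluster per JVM. */
final class SharedClusterFactory {
    private static FakeMiniCluster cluster;   // null while no cluster is running
    private static int refCount = 0;          // number of tests currently holding the cluster

    private SharedClusterFactory() {}

    /** Called from setUp(): starts the cluster only if it is not already running. */
    static synchronized FakeMiniCluster acquire() {
        if (cluster == null) {
            cluster = new FakeMiniCluster();
            cluster.start();
        }
        refCount++;
        return cluster;
    }

    /** Called from tearDown(): drops the reference but leaves the cluster up for the next test. */
    static synchronized void release() {
        if (refCount == 0) {
            throw new IllegalStateException("release() called without a matching acquire()");
        }
        refCount--;
    }

    /**
     * Shuts the cluster down if nobody holds it. In a real setup this could be
     * driven by an idle timer or a JVM shutdown hook (an assumption, not shown here).
     * Returns true if a shutdown actually happened.
     */
    static synchronized boolean shutdownIfIdle() {
        if (cluster != null && refCount == 0) {
            cluster.shutdown();
            cluster = null;
            return true;
        }
        return false;
    }
}
```

With something along these lines, a test's setUp() would call acquire() and its tearDown() would call release(), so consecutive tests reuse one running cluster and only the final shutdownIfIdle() pays the teardown cost. Tests would still need to clean up their own tables, since state is shared.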