Hi Cristofer,

HBase starts a mini cluster for many of its tests because we have a lot of
destructive tests; without a fresh cluster, the non-destructive tests would
be impacted by the destructive ones. When writing a client application, you
usually don't need to do that: you can rely on the same instance for all
your tests.

It's also useful to write the tests so that they can run against a real
cluster or a pseudo-distributed one. Sometimes, when a test fails, you
want to look at what the code wrote or found in HBase, and you can't do
that with a mini cluster because it is gone once the test ends. Reusing a
running cluster also saves the startup time.

I don't know if there is a blog entry on this, but it's not very difficult
to do (though, as usual, not that easy when you are starting out). I've
personally done it with three pieces:
- a singleton class;
- table names prefixed with a random key, so multiple tests can run in
  parallel on the same cluster without relying on cleanup;
- System.getProperty to decide between starting a mini cluster and
  connecting to an existing cluster.
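
To make that a bit more concrete, here is a minimal sketch of the idea, not
the exact code I used. The class name, the "hbase.test.mode" property and
its "mini" value are made up for the example, and the actual cluster
startup is left as comments so the sketch compiles without HBase on the
classpath.

```java
import java.util.UUID;

/** Shared test context: created lazily, one instance per JVM. */
final class HBaseTestContext {

    /** Hypothetical property name: "mini" starts a mini cluster, anything else connects. */
    static final String MODE_PROPERTY = "hbase.test.mode";

    private static HBaseTestContext instance;

    private final boolean useMiniCluster;
    private final String tablePrefix;

    private HBaseTestContext() {
        // Default to the mini cluster when the property is not set.
        String mode = System.getProperty(MODE_PROPERTY, "mini");
        this.useMiniCluster = "mini".equalsIgnoreCase(mode);

        // Random prefix so parallel runs on the same cluster never share tables.
        String random = UUID.randomUUID().toString().replace("-", "").substring(0, 8);
        this.tablePrefix = "t" + random + "_";

        if (useMiniCluster) {
            // Assumption: start the mini cluster once here, for example through
            // HBaseTestingUtility.startMiniCluster(), and keep its Configuration.
        } else {
            // Assumption: build a Configuration pointing at the existing cluster,
            // for example HBaseConfiguration.create() with the site files on the classpath.
        }
    }

    /** Returns the single shared context, creating it on first use. */
    static synchronized HBaseTestContext get() {
        if (instance == null) {
            instance = new HBaseTestContext();
        }
        return instance;
    }

    boolean usesMiniCluster() {
        return useMiniCluster;
    }

    /** Maps a logical table name to the prefixed name used for this run. */
    String tableName(String baseName) {
        return tablePrefix + baseName;
    }
}
```

Each test then asks for HBaseTestContext.get().tableName("users") instead
of hard-coding "users", and you run with -Dhbase.test.mode=remote (again, a
made-up value) when you want to hit a real cluster.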

HTH,

Nicolas


On Fri, Aug 31, 2012 at 12:28 PM, Cristofer Weber <
cristofer.we...@neogrid.com> wrote:

> Hi Sonal, Stack and Ulrich!
>
> Yes, I should provide more details :$
>
> I found the links you provided when I was searching for a way to start
> HBase with JUnit. The only default params I have changed are the
> ZooKeeper port and the number of nodes, which is 1 in my case. Based on
> the logs, I suspect that most of the time is spent in HDFS, and that's why
> I asked if there is a way to start a standalone instance of HBase. The
> amount of data written in each test case would probably fit in the
> memstore anyway, and table cleanup between test methods is handled by a
> loop of deletes.
>
> At least 15 seconds are spent on starting the mini cluster for each test
> case.
>
> I just remembered that I should turn off the WAL when running unit tests
> :-), but this will not affect startup time.
>
> Thanks!!
>
> Best regards,
> Cristofer
>
> ________________________________________
> From: Ulrich Staudinger [ustaudin...@gmail.com]
> Sent: Friday, August 31, 2012 2:21
> To: user@hbase.apache.org
> Subject: Re: HBase and unit tests
>
> As general advice, although you probably already take care of this,
> instantiate the mini cluster only once in your JUnit test constructor
> and not in every test method. At the end of each test, either clean up
> your HBase or use a different "area" per test.
>
> best regards,
> ulrich
>
>
> --
> connect on xing or linkedin. sent from my tablet.
>
> On 31.08.2012, at 06:46, Stack <st...@duboce.net> wrote:
>
> > On Thu, Aug 30, 2012 at 4:44 PM, Cristofer Weber
> > <cristofer.we...@neogrid.com> wrote:
> >> Hi there!
> >>
> >> After I started studying HBase, I've searched for open source projects
> backed by HBase and I found Titan distributed graph database (you probably
> heard about it). As soon as I read in their documentation that HBase
> adapter is experimental and suboptimal (disclaimer here:
> https://github.com/thinkaurelius/titan/wiki/Using-HBase) I volunteered to
> help improve this adapter and since then I have made a few changes to improve
> on running tests (reduced from hours to minutes) and also an improvement on
> search feature.
> >>
> >> Now I'm trying to break the dependency on a pre-installed HBase for
> unit tests and found miniCluster inside HBase tests, but minicluster
> takes too much time to start and I don't know if tweaking the configs will
> improve it significantly. Is there a way to start a 'lightweight' instance,
> like programmatically starting a standalone instance?
> >>
> >
> > How much is 'too much time' Cristofer?  Do you want a standalone cluster
> at all?
> > St.Ack
> > P.S. If digging in this area, you might find the blog post by the
> > sematextians of use:
> >
> http://blog.sematext.com/2010/08/30/hbase-case-study-using-hbasetestingutility-for-local-testing-development/
>
