Hi Steve,

I bumped into the same issue in wanting to test MapReduce jobs that read from and write to HBase tables more quickly. What I ended up doing is creating custom InputFormat and OutputFormat implementations that wrap TableInputFormat / TableOutputFormat and convert the HBase data to/from object representations, so the map / reduce classes work with the objects directly. I then have a test input format and output format that provide those objects straight from memory, and when testing the jobs I configure them to use these test formats instead. The tests run much faster that way. Of course, you need to be careful to test your input / output formats separately, as well as the object / HBase conversion code.
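To make that concrete, here's a rough sketch of what the read side can look like with the org.apache.hadoop.hbase.mapreduce API. The Person class and its fromResult() conversion are just stand-ins for whatever your own rows hold, not anything that ships with HBase:

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/** Wraps TableInputFormat so the mapper sees Person objects instead of raw Results. */
public class PersonInputFormat extends InputFormat<NullWritable, Person>
    implements Configurable {

  private final TableInputFormat delegate = new TableInputFormat();
  private Configuration conf;

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    delegate.setConf(conf); // lets the wrapped format pick up the table name, scan, etc.
  }

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public List<InputSplit> getSplits(JobContext context)
      throws IOException, InterruptedException {
    return delegate.getSplits(context);
  }

  @Override
  public RecordReader<NullWritable, Person> createRecordReader(
      InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException {
    final RecordReader<ImmutableBytesWritable, Result> inner =
        delegate.createRecordReader(split, context);
    return new RecordReader<NullWritable, Person>() {
      @Override
      public void initialize(InputSplit s, TaskAttemptContext c)
          throws IOException, InterruptedException {
        inner.initialize(s, c);
      }
      @Override
      public boolean nextKeyValue() throws IOException, InterruptedException {
        return inner.nextKeyValue();
      }
      @Override
      public NullWritable getCurrentKey() {
        return NullWritable.get();
      }
      @Override
      public Person getCurrentValue() throws IOException, InterruptedException {
        // The Result-to-object conversion lives in one place so it can be tested on its own.
        return Person.fromResult(inner.getCurrentValue());
      }
      @Override
      public float getProgress() throws IOException, InterruptedException {
        return inner.getProgress();
      }
      @Override
      public void close() throws IOException {
        inner.close();
      }
    };
  }
}

/** Hypothetical domain object; the real thing carries whatever your rows hold. */
class Person {
  final String name;

  Person(String name) {
    this.name = name;
  }

  static Person fromResult(Result result) {
    return new Person(Bytes.toString(
        result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));
  }
}

The job then just sets PersonInputFormat as its input format class and the table name via TableInputFormat.INPUT_TABLE, and the mapper signature becomes Mapper<NullWritable, Person, ...>. The output side can wrap TableOutputFormat the same way, with its RecordWriter converting each object back into a Put.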
I'm also interested in any techniques other people use to speed up tests that need to interact with HBase.

Dave

On Thu, Dec 3, 2009 at 3:10 PM, Steve Kuo <[email protected]> wrote:
> I have a class that does a regular map job and a TableReduce-based reduce
> job. This class works when called as the main class, either from Eclipse or
> on my pseudo cluster, as long as HBase is up and running. I'd like to write
> a unit test for it and would like advice on the best way to proceed.
>
> The best I came up with after googling for "hbase unit test" is a page that
> suggests looking at org.apache.hadoop.hbase.TestTableMapReduce. I was able
> to get this class to run after adding additional classes in:
>
> * org.apache.hadoop.hbase
> * org.apache.hadoop.hdfs
> * org.apache.hadoop.hdfs.server.datanode
> * org.apache.hadoop.mapred
> * org.apache.hadoop.net
> * jetty-6.1.14.jar
>
> After all this, the test worked, but it was very slow as it had to start up
> a mini-cluster for DFS, etc. It seemed excessive that Jetty was needed.
>
> Please advise on whether there is a simpler way to unit test.
>
> Thanks in advance.
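For reference, the in-memory side I mentioned can be as simple as the sketch below. It reuses the hypothetical Person class from the earlier sketch, and holds the fixtures in a static field, which only works because the test job runs in-process with the local job runner:

import java.io.DataInput;
import java.io.DataOutput;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/** Serves Person fixtures straight from memory, so no HBase or DFS is needed. */
public class InMemoryPersonInputFormat extends InputFormat<NullWritable, Person> {

  /** The test populates this before submitting the job (in-process runs only). */
  public static List<Person> records = Collections.emptyList();

  @Override
  public List<InputSplit> getSplits(JobContext context) {
    // One trivial split is enough for a local, in-process test run.
    return Collections.<InputSplit>singletonList(new EmptySplit());
  }

  @Override
  public RecordReader<NullWritable, Person> createRecordReader(
      InputSplit split, TaskAttemptContext context) {
    return new RecordReader<NullWritable, Person>() {
      private int next = 0;
      private Person current;

      @Override
      public void initialize(InputSplit s, TaskAttemptContext c) {
      }
      @Override
      public boolean nextKeyValue() {
        if (next >= records.size()) {
          return false;
        }
        current = records.get(next++);
        return true;
      }
      @Override
      public NullWritable getCurrentKey() {
        return NullWritable.get();
      }
      @Override
      public Person getCurrentValue() {
        return current;
      }
      @Override
      public float getProgress() {
        return records.isEmpty() ? 1.0f : (float) next / records.size();
      }
      @Override
      public void close() {
      }
    };
  }
}

/** A single do-nothing split; Writable so the framework can serialize it. */
class EmptySplit extends InputSplit implements Writable {
  @Override public long getLength() { return 0; }
  @Override public String[] getLocations() { return new String[0]; }
  @Override public void write(DataOutput out) { }
  @Override public void readFields(DataInput in) { }
}

A test populates InMemoryPersonInputFormat.records, points the job at this input format instead of the TableInputFormat wrapper, and uses a matching test output format that collects the reducer's objects into a list rather than writing Puts. The whole job then runs without starting HBase or HDFS at all.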
