Michael Stack wrote:
Andrzej Bialecki wrote:
If I'm not mistaken, there is no way right now to use HBase in a "local" mode similar to the Hadoop "local" mode, where we don't have to start any daemons and all necessary infrastructure runs inside a single JVM. What would it take to implement such a mode? Would it require big changes to the codebase?
Check out MiniHBaseCluster. It's used in the bulk of the hbase unit tests. It runs a master and a configurable number of region servers, each in its own thread. It's modeled on MiniDFSCluster.

That's excellent news. I just looked at the code, and I think it would require only minimal tweaks to use it together with other Hadoop services running in "local" mode. For example, it would be more convenient to have MiniHBaseCluster (or a modified version of it, let's call it LocalHBaseCluster) handle the startup / shutdown itself, so that user applications could assume that all necessary services are already running. I'm also going to check what the startup time of MiniHBaseCluster is.

If the default mode -- i.e. if hbase.master were set to 'local' in hbase-default.xml -- were to run a MiniHBaseCluster instance, would this suffice, Andrzej? Or do you need the master and regionservers talking to each other via direct in-process method invocations, as is done in "local" mapreduce, rather than over sockets?


A direct in-process pseudo-protocol would probably be more efficient and would reduce the number of sockets in use, but we could implement it as a future enhancement if needed. For now I'm happy with them using sockets.
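
For illustration only, the 'local' default proposed above might look roughly like this in hbase-default.xml. This is a sketch in the usual Hadoop configuration file layout; the 'local' value and the behaviour described are the proposal under discussion, not something the current code is known to support.

```xml
<configuration>
  <property>
    <name>hbase.master</name>
    <!-- Assumption: 'local' is the proposed default value, by analogy
         with Hadoop's "local" mode; it is not an existing setting. -->
    <value>local</value>
    <description>Proposed: when set to 'local', run a MiniHBaseCluster
    instance (master plus region servers, each in its own thread)
    inside the client JVM instead of contacting a remote master.
    </description>
  </property>
</configuration>
```

A site-specific configuration file would then override this value with the address of a real master when running against a deployed cluster.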

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
