Michael Stack wrote:
Andrzej Bialecki wrote:
If I'm not mistaken, there is no way right now to use HBase in a
"local" mode similar to the Hadoop "local" mode, where we don't have
to start any daemons and all necessary infrastructure runs inside a
single JVM. What would it take to implement such a mode? Would it
require big changes to the codebase?
Check out MiniHBaseCluster. It's used in the bulk of the hbase unit
tests. It runs a master and a configurable number of region servers,
each in its own thread. It's modeled on MiniDFSCluster.
That's excellent news - I just looked at the code, I think it would
require only minimal tweaks to use it together with other Hadoop
services running in "local" mode - e.g. it would be more convenient to
have the MiniHBaseCluster (or its modified version, let's call it
LocalHBaseCluster) handle the startup / shutdown itself, so that the
user applications could assume that all necessary services are already
running. I'm also going to check the startup time of
MiniHBaseCluster.
If the default
mode -- i.e. if the hbase.master was set to 'local' in hbase-default.xml
-- were to run a MiniHBaseCluster instance, would this suffice, Andrzej?
Or do you need master and regionservers talking to each other via direct
in-process method invocations rather than over sockets, as is done in
"local" mapreduce?
A direct in-process pseudo-protocol would probably be more efficient and
it would reduce the number of sockets in use, but we could implement it
as a future enhancement if needed. For now I'm happy with them using
sockets.
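
To make the idea concrete, here is a rough sketch of the shape such a
wrapper could take. It is not HBase code: LocalClusterSketch, isLocal,
startup, shutdown and liveThreads are hypothetical names made up for
illustration, and the threads only block on a latch where the real master
and region server loops would run. It only shows the two points discussed
above: picking the mode from the hbase.master value, and having one object
own the startup / shutdown of all in-JVM service threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch; none of these names exist in the HBase codebase.
public class LocalClusterSketch {
    private final List<Thread> threads = new ArrayList<>();
    private final CountDownLatch stop = new CountDownLatch(1);

    // One "master" thread plus the requested number of "region server" threads.
    public LocalClusterSketch(int regionServers) {
        threads.add(service("master"));
        for (int i = 0; i < regionServers; i++) {
            threads.add(service("regionserver-" + i));
        }
    }

    // Placeholder service: blocks until shutdown() releases the latch.
    private Thread service(String name) {
        Thread t = new Thread(() -> {
            try {
                stop.await(); // stand-in for the real service loop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        return t;
    }

    // Assumed convention from this thread: hbase.master set to "local".
    public static boolean isLocal(String hbaseMaster) {
        return "local".equals(hbaseMaster);
    }

    public void startup() {
        for (Thread t : threads) {
            t.start();
        }
    }

    // Releases all service threads and waits for them to exit.
    public void shutdown() {
        stop.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public int liveThreads() {
        int n = 0;
        for (Thread t : threads) {
            if (t.isAlive()) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        if (isLocal("local")) {
            LocalClusterSketch cluster = new LocalClusterSketch(2);
            cluster.startup();
            System.out.println("live threads: " + cluster.liveThreads());
            cluster.shutdown();
        }
    }
}
```

A user application would then only check the configured master value and,
if it is "local", create and start the wrapper before doing anything else.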
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com