Chiming in late here. You might be better served by a database that was designed to run on a single host. HBase will run in standalone mode for testing purposes, but it then uses a local wrapper for the filesystem, which does not (I think) support the sync API. Hence HBase has no way to guarantee that changes have reached disk, and a machine outage can corrupt your data (I faintly remember some discussion about this, so this might have changed). You could run a single-node HDFS and a single-node HBase on the same machine, but there are simpler databases to run.
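To illustrate what "sync" buys you in general, here is a minimal Python sketch of a durable append. It is not HBase code and says nothing about how HBase's write-ahead log is implemented; the function name `durable_append` and the file layout are made up for the example. The point is only that without an explicit sync call, a write can sit in OS caches and be lost on a power failure.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append a record and ask the OS to push it to stable storage.

    Hypothetical helper for illustration; not part of HBase or HDFS.
    """
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to flush its cache to the device


def read_all(path: str) -> bytes:
    """Read the whole file back, e.g. after a restart."""
    with open(path, "rb") as f:
        return f.read()
```

A store that cannot make the equivalent of the `os.fsync` call on its log has no way to promise that an acknowledged write survives a crash, which is the concern with the local filesystem wrapper described above.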
MySQL was mentioned here. PostgreSQL has good blob and even key-value support. You might want to try Redis as well, and there are many more single-node key-value stores. Many of the design choices for HBase were made with scalability in mind.

-- Lars

________________________________
From: Arun Allamsetty <[email protected]>
To: [email protected]
Sent: Tuesday, July 8, 2014 12:55 AM
Subject: Using HBase in standalone mode in production

Hi all,

So this question might be stupid, but it has been bugging me for a while and I cannot think of a better place to ask it.

I am really impressed with the way HBase works (as a key-value store). Since it stores everything as a byte array, I find it really convenient to store serialized objects. Also, I understand that HBase is supposed to be used when you have too much data to be handled by a single machine, so that the application can scale by running it in distributed mode.

But what if I want to use it because of its HashMap-like capabilities, with the added feature of tracking versions? Is it recommended for a small application (in standalone mode) with maybe 100K users and storage needs which probably won't exceed 100G? I know it is never recommended as a transactional database (I have read that in a million places), but I would like to know more about it.

Thanks,
Arun
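For readers wondering what "a HashMap with versions" amounts to, here is a rough in-memory Python sketch of that data model: byte keys, byte values, and a bounded list of timestamped versions per key. The class `VersionedStore` and its `max_versions` parameter are hypothetical names for this example. It has no persistence and is not how HBase stores data; it only shows that the model itself is small enough to layer on top of any of the single-node stores mentioned in the reply.

```python
import time
from collections import defaultdict


class VersionedStore:
    """Toy key-value map that keeps the newest N versions of each value."""

    def __init__(self, max_versions: int = 3):
        self.max_versions = max_versions
        # key -> list of (timestamp, value), newest first
        self._cells = defaultdict(list)

    def put(self, key: bytes, value: bytes, timestamp: int = None) -> None:
        """Store a new version; the timestamp defaults to the current time."""
        ts = time.time_ns() if timestamp is None else timestamp
        versions = self._cells[key]
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # drop versions beyond the limit

    def get(self, key: bytes, versions: int = 1) -> list:
        """Return up to `versions` values for the key, newest first."""
        return [value for _, value in self._cells.get(key, [])[:versions]]
```

Usage would look like `store.put(b"user:1", serialized_bytes)` followed by `store.get(b"user:1", versions=2)` to read the current and previous value.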
