On Tue, Mar 3, 2015 at 7:32 AM, Rose, Joseph <[email protected]> wrote:
> Folks,
>
> I’m new to HBase (but not new to these sorts of data stores.) I think
> HBase would be a good fit for a project I’m working on, except for one
> thing: the amount of data we’re talking about, here, is far smaller than
> what’s usually recommended for HBase. As I read the docs, though, it seems
> like the main argument against small datasets is replication: HDFS requires
> a bunch of nodes right from the start and that’s overkill for my use.
>

Why not use an RDBMS then?

> So, what’s the motivation behind labeling standalone HBase deployments
> “dev only”? If all I really need is a table full of keys and all of that
> will fit comfortably in a single node, and if I have my own backup solution
> (literally, backing up the VM on which it’ll run), why bother with HDFS and
> distributed HBase?
>
> (As an aside, I could go to something like Berkeley DB but then I don’t
> get all the nice coprocessors and filters and so on, not to mention
> cell-level security. Because I work with patient data the latter is
> definitely a huge win.)
>

What Nick said. Standalone and 'throwaway' are usually found in the same
sentence so little consideration (testing/verification) has been done to
ensure it works well. That said, it basically works and I know of at least
one instance where a standalone instance is hosting tsdb for a decent-sized
cluster.

St.Ack

> Thanks for your help.
>
>
> Joseph Rose
> Intelligent Health Laboratory
> Boston Children’s Hospital
