If your data can all fit on one machine, HBase is not the best choice. I think you'd be better off using a simpler solution for the small-data case and leaving HBase for use cases that require a proper cluster.
On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:

> We don't want to invest in another DB like Dynamo or Cassandra; we are
> already on the Hadoop stack, and managing another DB would be a pain. The
> reason for HBase over an RDBMS is that we call HBase via Spark Streaming
> to look up the keys.
>
> Manish
>
> On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > Hey Manish,
> >
> > Just to ask the naive question, why use HBase if the data fits into
> > such a small table?
> >
> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > We have a scenario where HBase is used like a key-value database to
> > > map keys to regions. We have over 5 million keys, but the table size
> > > is less than 7 GB. The read volume is pretty high - about 50x the
> > > put/delete volume. This causes hot spotting on the data node, and the
> > > region is not split. We cannot change the maxregionsize parameter, as
> > > that would impact other tables too.
> > >
> > > Our idea is to manually inspect the row key ranges, split the region
> > > manually, and assign the resulting regions to different region
> > > servers. We will then continue to monitor the rows in each region to
> > > see if it needs to be split.
> > >
> > > Does anyone have experience doing this on HBase? Is this a
> > > recommended approach?
> > >
> > > Thanks,
> > > Manish
> >
> > --
> > -Dima

--
-Dima
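
The manual split described in the thread can be issued through the HBase
Admin API. Below is a minimal Java sketch; the table name ("key_lookup")
and split key are hypothetical placeholders, and in practice the split
point would come from inspecting the actual row key distribution of the
hot region.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class ManualRegionSplit {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Hypothetical table and split point; the real split key would
            // be chosen by inspecting the row key ranges of the hot region.
            TableName table = TableName.valueOf("key_lookup");
            admin.split(table, Bytes.toBytes("key_2500000"));
            // Once the daughter regions are online, the balancer (or an
            // explicit Admin#move on an encoded region name) can place
            // them on different region servers.
        }
    }
}

The same one-off split can also be done interactively from the HBase shell
with split 'key_lookup', 'key_2500000', and the shell's move command can
relocate a region to a specific region server.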