If your data can all fit on one machine, HBase is not the best choice. I
think you'd be better off using a simpler solution for small data and
leaving HBase for use cases that require proper clusters.
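
That said, if you do go the manual-split route, the split and move can be
scripted against the Admin API rather than done by hand each time. Here is
a minimal sketch using the HBase 1.x Java client; the table name, split
key, and target server name are hypothetical placeholders, and a real
script should poll for the daughter regions instead of sleeping:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class ManualRegionSplit {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {

      TableName table = TableName.valueOf("key_lookup"); // hypothetical table

      // Turn the balancer off first, or it may move regions back.
      admin.setBalancerRunning(false, true);

      // Split at a key chosen by inspecting the row key distribution.
      admin.split(table, Bytes.toBytes("key-2500000")); // hypothetical key

      // split() is asynchronous; a real script should poll getTableRegions()
      // until the daughter regions appear instead of sleeping.
      Thread.sleep(30_000L);

      // Pin one daughter region to a specific region server;
      // "host,port,startcode" is the server's full name.
      List<HRegionInfo> regions = admin.getTableRegions(table);
      admin.move(regions.get(regions.size() - 1).getEncodedNameAsBytes(),
                 Bytes.toBytes("rs2.example.com,16020,1472364000000"));
    }
  }
}

The same operations are also available from the HBase shell (split, move,
balance_switch). And note that MAX_FILESIZE can be overridden per table via
alter, so a different split threshold for this one table wouldn't have to
touch the global setting.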

On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:

> We don't want to invest in another DB like Dynamo or Cassandra since we are
> already on the Hadoop stack, and managing another DB would be a pain. Why
> HBase over an RDBMS: because we call HBase via Spark Streaming to look up
> the keys.
>
> Manish
>
> On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > Hey Manish,
> >
> > Just to ask the naive question, why use HBase if the data fits into such
> > a small table?
> >
> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > We have a scenario where HBase is used like a key-value database to map
> > > keys to regions. We have over 5 million keys, but the table size is less
> > > than 7 GB. The read volume is pretty high, about 50x the put/delete
> > > volume. This causes hotspotting on the data node, and the region does
> > > not get split. We cannot change the maxregionsize parameter, as that
> > > would impact other tables too.
> > >
> > > Our idea is to manually inspect the row key ranges, split the region
> > > manually, and assign the daughter regions to different region servers.
> > > We will then continue to monitor the rows in each region to see if it
> > > needs to be split.
> > >
> > > Does anyone have experience doing this on HBase? Is this a recommended
> > > approach?
> > >
> > > Thanks,
> > > Manish
> > >
> >
> >
> > --
> > -Dima
> >
>


-- 
-Dima
