(Though if it is only 7 GB, why not just store it in memory?)
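For illustration, a minimal sketch of the in-memory route from Spark,
assuming the deserialized mapping fits in driver and executor memory
(the class name and loader are hypothetical, not an existing API):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.broadcast.Broadcast;

    public class InMemoryLookup {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext("local[*]", "lookup");

            // Hypothetical loader for the ~5M key -> region mappings;
            // at under 7 GB on disk this may fit comfortably in memory.
            Map<String, String> mapping = loadMapping();

            // Ship one read-only copy to every executor; lookups become
            // local hash-map gets with no HBase round trip.
            Broadcast<Map<String, String>> lookup = sc.broadcast(mapping);

            // Inside any transformation: lookup.value().get(key)
        }

        private static Map<String, String> loadMapping() {
            return new HashMap<>(); // stub for illustration
        }
    }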

On Sunday, August 28, 2016, Dima Spivak <dspi...@cloudera.com> wrote:

> If your data can all fit on one machine, HBase is not the best choice. I
> think you'd be better off using a simpler solution for small data and leave
> HBase for use cases that require proper clusters.
>
> On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:
>
>> We don't want to invest in another DB like Dynamo or Cassandra, and we
>> are already on the Hadoop stack. Managing another DB would be a pain. We
>> chose HBase over an RDBMS because we call HBase from Spark Streaming to
>> look up the keys.
>>
>> Manish
>>
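For context, a lookup of this shape would go through the standard HBase
client API; a minimal sketch against the HBase 1.x client, with
hypothetical table and column names:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseKeyLookup {
        public static Connection connect() throws IOException {
            Configuration conf = HBaseConfiguration.create();
            return ConnectionFactory.createConnection(conf);
        }

        // In Spark Streaming, create the Connection once per partition,
        // not per record, so the connection cost is amortized.
        public static byte[] lookup(Connection conn, String rowKey)
                throws IOException {
            try (Table table = conn.getTable(TableName.valueOf("key_map"))) {
                Result r = table.get(new Get(Bytes.toBytes(rowKey)));
                // Hypothetical column family "cf" and qualifier "region".
                return r.getValue(Bytes.toBytes("cf"),
                                  Bytes.toBytes("region"));
            }
        }
    }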
>> On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com>
>> wrote:
>>
>> > Hey Manish,
>> >
>> > Just to ask the naive question, why use HBase if the data fits into
>> > such a small table?
>> >
>> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com>
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > We have a scenario where HBase is used like a key-value database to
>> > > map keys to regions. We have over 5 million keys, but the table size
>> > > is less than 7 GB. The read volume is pretty high - about 50x the
>> > > put/delete volume. This causes hotspotting on the DataNode, and the
>> > > region is not split. We cannot change the max region size parameter
>> > > (hbase.hregion.max.filesize) as that will impact other tables too.
>> > >
>> > > Our idea is to manually inspect the row key ranges, split the region
>> > > at those points, and assign the resulting regions to different region
>> > > servers. We will then continue to monitor the row counts per region
>> > > to see if further splits are needed.
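A minimal sketch of such a manual split against the HBase 1.x Admin API
(the table name and split point are hypothetical; the shell equivalent
is split 'key_map', '<splitkey>'):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ManualRegionSplit {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                // Stop the balancer so it does not undo manual placement.
                admin.setBalancerRunning(false, true);

                // Split at a row key chosen by inspecting the key ranges
                // (hypothetical split point).
                admin.split(TableName.valueOf("key_map"),
                            Bytes.toBytes("key_2500000"));

                // Daughter regions can then be pinned to specific region
                // servers with admin.move(encodedRegionName, destServerName),
                // and the balancer re-enabled once placement is done.
            }
        }
    }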
>> > >
>> > > Does anyone have experience doing this on HBase? Is this a
>> > > recommended approach?
>> > >
>> > > Thanks,
>> > > Manish
>> > >
>> >
>> >
>> > --
>> > -Dima
>> >
>>
>
>
> --
> -Dima
>
>

-- 
-Dima
