Thanks, Ted, for the links. They will help me work out how regions split and what the region size should be, which is really helpful. But can you confirm whether my understanding, as I asked in the trailing mail, is correct?

I know what the salt will be based on the mobile number in my data. For example, assume the salt for mobile number 9999999999 is #, so my rowkey is #_9999999999. Since I know my exact rowkey in advance, I can distribute my data across the cluster to avoid hotspotting, and I want the data distributed equally across the cluster. So is it a mandatory condition to create the table according to my splits?
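To make the question concrete, here is a minimal Python sketch (plain Python, not HBase client code) of how salted rowkeys would fall into the regions defined by the split keys '#', '!' and '$'. It relies only on the fact that HBase orders rowkeys lexicographically by bytes and that each region covers a half-open key range. The salt rule `int(mobile) % 3` and the helper names `salted_rowkey` and `region_index` are assumptions made up for illustration; the real salt rule may differ.

```python
import bisect
from collections import Counter

# Split keys from the `create` command in the trailing mail, sorted in
# byte order because HBase compares rowkeys lexicographically by bytes.
SPLIT_KEYS = sorted([b"#", b"!", b"$"])  # [b"!", b"#", b"$"]

# Hypothetical salt rule (an assumption, not taken from the thread):
# choose the prefix from the mobile number modulo the number of salts.
SALTS = [b"#", b"!", b"$"]


def salted_rowkey(mobile: str) -> bytes:
    """Build a rowkey of the form <salt>_<mobile>."""
    salt = SALTS[int(mobile) % len(SALTS)]
    return salt + b"_" + mobile.encode("ascii")


def region_index(rowkey: bytes) -> int:
    """Return the index of the region whose key range holds rowkey.

    With N split keys there are N + 1 regions. Region 0 covers every key
    below the first split key; region i covers
    [SPLIT_KEYS[i - 1], SPLIT_KEYS[i]), and the last region is open-ended.
    """
    return bisect.bisect_right(SPLIT_KEYS, rowkey)


if __name__ == "__main__":
    print(salted_rowkey("9999999999"), region_index(salted_rowkey("9999999999")))
    counts = Counter(
        region_index(salted_rowkey(str(9000000000 + i))) for i in range(300)
    )
    print(sorted(counts.items()))
```

Under these assumptions, three split keys give four regions, and the first one (all keys below '!') never receives a salted row, so the three salts spread the data over three regions rather than four.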
Thanks
Manjeet

On Sat, Sep 10, 2016 at 6:26 AM, Ted Yu <[email protected]> wrote:

> Please take a look at:
>
> http://hbase.apache.org/book.html#table_schema_rules_of_thumb
> http://hbase.apache.org/book.html#arch.regions.size
> http://hbase.apache.org/book.html#ops.capacity.regions
> http://hbase.apache.org/book.html#ops.capacity.regions.total
>
> On Fri, Sep 9, 2016 at 5:35 PM, Manjeet Singh <[email protected]>
> wrote:
>
> > Yeah, it's on weekdays.
> > Yeah, the default is 10 GB, so what is the way/formula to know what the
> > size of the RS should be?
> > On 9 Sep 2016 19:03, "Ted Yu" <[email protected]> wrote:
> >
> > > Can you clarify whether the incoming data rate is for weekdays?
> > >
> > > At 6-7 Gb/hour, you need to set a larger region size.
> > > Default is 10GB.
> > >
> > > If you know roughly how the key space would be filled, presplit your
> > > table accordingly.
> > >
> > > On Thu, Sep 8, 2016 at 11:24 PM, Manjeet Singh <[email protected]>
> > > wrote:
> > >
> > > > Hi All
> > > >
> > > > I have some basic questions. Can anyone help me out?
> > > >
> > > > Q1. This is my understanding: to perform splitting I need to create
> > > > the table like below
> > > > create 'test_table', 'c1', SPLITS => ['#', '!', '$']
> > > >
> > > > and I have to design the row key in this way
> > > > #_123456789
> > > > !_123456789
> > > > $_123456789
> > > >
> > > > so that my data is distributed across the cluster.
> > > >
> > > > My requirement is very simple: I want the data distributed equally
> > > > across regions based on my rowkey only.
> > > >
> > > > So please correct me if I am missing anything.
> > > >
> > > >
> > > > Q2. If I have 5 regions on each region server and I set a limit of
> > > > 100 MB using the hbase.hregion.max.filesize property,
> > > >
> > > > what will happen when all my regions fill up with 100 MB of data?
> > > > Please note I have a cron job scheduled every weekend and my incoming
> > > > data rate is 6-7 Gb/hour, so my regions fill up very fast.
> > > >
> > > >
> > > > Thanks
> > > > Manjeet
> > > >
> > > > --
> > > > luv all
> > > >
> > >
> >

--
luv all
