Hi Pradheep,

A rowkey of customerid+type+orderid should be able to support range scans over multiple customers' rows at high scale. I don't think you need salting, unless I am missing something here. Salting is usually used to avoid hot-spotting when HBase reads/writes use incremental (non-random) rowkeys, for example time-series data with time as the leading part of the rowkey.

Another way to avoid hot-spotting with incremental rowkeys, without salting, is to reverse the leading number of your rowkey. Example: reverse(45668) = 86654.
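A rough sketch of the two options, in case it helps. It assumes customerid is a sequential integer. The names (reverse_leading_id, make_rowkey, salt_bucket), the "|" separator and the CRC32 hash are made up for illustration; this is not how Phoenix computes its salt byte, only the general idea of "hash of the key modulo the number of buckets".

```python
import zlib


def reverse_leading_id(customer_id: int) -> str:
    """Reverse the decimal digits of a sequential id, e.g. 45668 -> "86654".

    The result stays a string so that a trailing zero in the id survives
    as a leading zero in the key (45670 -> "07654").
    """
    return str(customer_id)[::-1]


def make_rowkey(customer_id: int, type_code: str, order_id: int) -> str:
    # Hypothetical layout: reversed customer id, then type, then order id.
    return f"{reverse_leading_id(customer_id)}|{type_code}|{order_id}"


def salt_bucket(rowkey: str, buckets: int) -> int:
    # Illustrative salt: a stable hash of the key modulo the bucket count.
    # This is NOT Phoenix's actual hash function, only the general idea.
    return zlib.crc32(rowkey.encode("utf-8")) % buckets


if __name__ == "__main__":
    for cid in (45668, 45669, 45670):
        key = make_rowkey(cid, "A", 1001)
        print(cid, "->", key, "salt bucket:", salt_bucket(key, 10))
```

With reversal, the consecutive ids 45668, 45669 and 45670 produce keys starting with 8, 9 and 0, so consecutive writes land in different parts of the keyspace, while all rows of one customer still share a prefix and can be read with a single range scan. With salting, the bucket depends on the whole key, so one customer's rows are spread over the buckets and a range scan has to visit each of them.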
HTH,
Anil Gupta

On Fri, Sep 8, 2017 at 10:23 AM, Pradheep Shanmugam <pradheep.shanmu...@infor.com> wrote:

> HI James,
>
> We have a table where multiple customer could have rows.
> Some of them may be large and some very small in terms for number of rows.
> we have a row key based on customerid+type+orderid..if not salted all the
> rows of large customer will end up in some regions leading to hot
> spotting(being large customer and more frequently used)
>
> Thanks,
> Pradheep
> ------------------------------
> *From:* James Taylor <jamestay...@apache.org>
> *Sent:* Friday, September 8, 2017 12:56:31 PM
> *To:* user
> *Subject:* Re: Salt Number
>
> Hi Pradheep,
> Would you be able to describe your use case and why you're salting? We
> really only recommend salting if you have write hotspotting. Otherwise, it
> increases the overall load on your cluster.
> Thanks,
> James
>
> On Fri, Sep 8, 2017 at 9:13 AM, Pradheep Shanmugam <
> pradheep.shanmu...@infor.com> wrote:
>
>> Hi,
>>
>> As the salt number cannot be changed later, what is is best number we can
>> give in different cases for cluster with 10 region servers with say 6 cores
>> in each.
>>
>> Should we consider cores while deciding the number..
>>
>> In some places i see number can be in the range 1-256 and in some place i
>> see that it is equal to the number of region servers..can the number in the
>> multiples of region server(say 20, 30 etc)
>>
>> read heavy large(several 100 millions) table with range scans
>> write heavy large table with less frequent range scans
>> large table with hybrid load with range scans
>>
>> Thanks,
>> Pradheep
>>

--
Thanks & Regards,
Anil Gupta