Dear Nick,

Thanks a lot for the nice explanation. I'm just left with one small point of
confusion:
As you said, "Your logical partitioning remains the same whether it's being
served by N, 2N, or 3.5N+2 RegionServers." When a region is split into two,
I guess the logical partitioning changes, right? Could you kindly clarify a
bit more? My question is: without making any change to my initial
row-key-generation function, how will new writes go to the new regions? I
assume it is hard to predict the number of RS up front, as well as to create
pre-split regions, in very large-scale production systems. I am not worried
about the default load-balancing behaviour of HBase; St.Ack and you have
clearly explained that already.

For example, in the excerpt below from Lars George's book, he uses <# of RS>
while computing the prefix; I guess <# of Regions> could also be used (if
regions are pre-split):

----------------------------------------------------------------------------
*Salting*
You can use a salting prefix to the key that guarantees a spread of all rows
across all region servers. For example:

byte prefix = (byte) (Long.hashCode(timestamp) % <number of region servers>);
byte[] rowkey = Bytes.add(Bytes.toBytes(prefix), Bytes.toBytes(timestamp));

This formula will generate enough prefix numbers to ensure that rows are
sent to all region servers. Of course, the formula assumes a *specific*
number of servers, and if you are planning to *grow your cluster* you should
set this number to a *multiple* instead.
----------------------------------------------------------------------------
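To make the question concrete, here is a rough sketch of what I mean by
computing the salt against a fixed bucket count instead of <# of RS> or
<# of Regions>. This is my own illustration, not from the book; SALT_BUCKETS,
the class name, and the key layout are all just assumptions:

import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKey {

  // Fixed, never-changing number of salt buckets (my own constant; 16 is
  // only an example). The bucket count is independent of how many
  // RegionServers or regions currently exist.
  private static final int SALT_BUCKETS = 16;

  // Build a salted row key: [1-byte bucket prefix][8-byte timestamp].
  public static byte[] saltedRowKey(long timestamp) {
    // Mask keeps the hash non-negative before taking the modulus.
    int bucket = (Long.valueOf(timestamp).hashCode() & 0x7fffffff) % SALT_BUCKETS;
    return Bytes.add(new byte[] { (byte) bucket }, Bytes.toBytes(timestamp));
  }
}

If I read your reply correctly, as long as SALT_BUCKETS never changes, the
logical key space stays the same no matter how many regions or RegionServers
end up serving it, so neither a region split nor a new RS requires touching
this function. (A corresponding sketch of the N-bucket read is at the very
end of this mail, after the quoted thread.)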
Thanks again ...

Regards,
Joarder Kamal


On 3 July 2013 03:37, Nick Dimiduk <[email protected]> wrote:

> Hi Joarder,
>
> I think you're slightly confused about the impact of using a hashed (or
> sometimes called "salted") prefix for your rowkeys. This strategy for
> rowkey design has an impact on the logical ordering of your data, not
> necessarily the physical distribution of your data. In HBase, these are
> orthogonal concerns. It means that to execute a bucket-agnostic query, the
> client must initiate N scans. However, there's no guarantee that all
> regions starting with the same hash land on the same RegionServer. Region
> assignment is a complex beast; as I understand, it's based on a randomish,
> load-based assignment.
>
> Take a look at your existing table distributed on a size-N cluster. Do all
> regions that fall within the first bucket sit on the same RegionServer?
> Likely not. However, look at the number of regions assigned to each
> RegionServer. This should be close to even. Adding a new RegionServer to
> the cluster will result in some of those regions migrating from the other
> servers to the new one. The impact will be a decrease in the average
> number of regions served per RegionServer.
>
> Your logical partitioning remains the same whether it's being served by N,
> 2N, or 3.5N+2 RegionServers. Your client always needs to execute that
> bucket-agnostic query as N scans, touching each of the N buckets.
> Precisely which RegionServers are touched by any given scan depends
> entirely on how the balancer has distributed load on your cluster.
>
> Thanks,
> Nick
>
> On Thu, Jun 27, 2013 at 5:02 PM, Joarder KAMAL <[email protected]> wrote:
>
> > Thanks St.Ack for mentioning the load-balancer.
> >
> > But my question was two-fold:
> > Case-1. If a new RS is added, then the load-balancer will do its job,
> > assuming no new region has been created in the meantime. // As you've
> > already answered.
> >
> > Case-2. Whether a new RS is added or not, if an existing region is split
> > into two, how will the new writes go to the new region? Because, let's
> > say initially the hash function was calculated with *N* regions and now
> > there are *N+1* regions in the cluster.
> >
> > In that case, do I need to change the hash function and reshuffle all
> > the existing data within the cluster? Or does HBase have some mechanism
> > to handle this?
> >
> > Many thanks again for helping me out...
> >
> > Regards,
> > Joarder Kamal
> >
> > On 28 June 2013 02:26, Stack <[email protected]> wrote:
> >
> > > On Wed, Jun 26, 2013 at 4:24 PM, Joarder KAMAL <[email protected]>
> > > wrote:
> > >
> > > > Maybe a simple question to answer for experienced HBase users and
> > > > developers:
> > > >
> > > > If I use hash partitioning to evenly distribute write workloads into
> > > > my region servers and later add a new region server to scale, or
> > > > split an existing region, then do I need to change my hash function
> > > > and re-shuffle all the existing data between all the region servers
> > > > (old and new)? Or is there any better solution for this? Any
> > > > guidance would be very much helpful.
> > >
> > > You do not need to change your hash function.
> > >
> > > When you add a new regionserver, the balancer will move some of the
> > > existing regions to the new host.
> > >
> > > St.Ack
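P.S. For my own understanding of the "N scans" Nick mentions above, a
bucket-agnostic read over a time range might look roughly like the sketch
below: one Scan per salt prefix, merged on the client. It reuses the assumed
SALT_BUCKETS constant from the sketch earlier in this mail; HTableInterface
and the two-argument Scan constructor are the client API of the 0.94-era
releases, so the exact classes may differ in other HBase versions.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BucketAgnosticRead {

  private static final int SALT_BUCKETS = 16; // must match the write side

  // Fetch every row whose timestamp falls in [fromTs, toTs) by issuing one
  // scan per salt bucket and concatenating the results. Ordering across
  // buckets is not preserved; a client-side merge would be needed for that.
  public static List<Result> scanAllBuckets(HTableInterface table,
                                            long fromTs, long toTs)
      throws IOException {
    List<Result> results = new ArrayList<Result>();
    for (int bucket = 0; bucket < SALT_BUCKETS; bucket++) {
      byte[] prefix = new byte[] { (byte) bucket };
      byte[] start = Bytes.add(prefix, Bytes.toBytes(fromTs));
      byte[] stop = Bytes.add(prefix, Bytes.toBytes(toTs));
      Scan scan = new Scan(start, stop); // start inclusive, stop exclusive
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result r : scanner) {
          results.add(r);
        }
      } finally {
        scanner.close();
      }
    }
    return results;
  }
}

Which RegionServers those N scans actually touch is then entirely up to the
balancer, exactly as Nick describes.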
