On Tue, Jun 14, 2011 at 11:49 PM, Stack <[email protected]> wrote:

> On Tue, Jun 14, 2011 at 1:41 AM, Shuja Rehman <[email protected]> wrote:
> > Well...There are couple of reasons
> >
> > 1- The data is coming from different regions of country and i want to
> > distribute the data w.r.t regions. e.g
> > RegionServer1-RegionServer4 contain east region data only.
> > RegionServer2-RegionServer6 contain west region data only.
> >
>
> Can you do this with a table per region? Otherwise, prefix the key w/
> region. This won't be perfect in that the boundary won't be clean but
> perhaps sufficient?
>

Hmm... I think a table per region will not work. In the future there will be
data coming from different countries, and with this strategy I would need to
create a lot of tables, which does not seem suitable to me. I also thought
about prefixing the key with the region, but I have many other things in the
key as well (e.g. timestamp, tags), and I am not sure how HBase distributes
the data to region servers when these are also present in the key.
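
To make the prefix idea concrete, here is a minimal sketch in plain Java (no
HBase dependency). It relies only on the fact that HBase sorts rows by the
unsigned byte order of the row key and that each HBase region holds one
contiguous key range. The key layout (geographic region, then tag, then
zero-padded timestamp), the "|" separator, the split point "west", and all
class and method names are assumptions for illustration, not anything
prescribed by HBase. Note that the prefix only decides which key range a row
falls into; which region server hosts that range is still up to the master
and balancer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class RegionPrefixedKeys {

    // Hypothetical layout: <geoRegion>|<tag>|<zero-padded timestamp>.
    // The geographic region comes first so it dominates the sort order.
    static byte[] rowKey(String geoRegion, String tag, long timestampMillis) {
        String key = String.format(Locale.ROOT, "%s|%s|%019d",
                geoRegion, tag, timestampMillis);
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Index of the key range containing the key, given sorted split points.
    // Range i covers [split(i-1), split(i)): start inclusive, end exclusive.
    static int rangeIndex(byte[] key, List<byte[]> sortedSplitPoints) {
        int idx = 0;
        for (byte[] split : sortedSplitPoints) {
            if (compareKeys(key, split) >= 0) {
                idx++;
            }
        }
        return idx;
    }

    public static void main(String[] args) {
        // One assumed split point: keys below "west" land in range 0.
        List<byte[]> splits =
                Arrays.asList("west".getBytes(StandardCharsets.UTF_8));

        byte[][] keys = {
            rowKey("east", "cpu", 1308000000000L),
            rowKey("west", "cpu", 1308000000000L),
            rowKey("east", "disk", 1308000005000L),
        };
        for (byte[] k : keys) {
            System.out.println(new String(k, StandardCharsets.UTF_8)
                    + " -> key range " + rangeIndex(k, splits));
        }
    }
}
```

Because the timestamp and tag come after the prefix, they only order rows
within one geographic region's key range. If that range later splits, the
boundary can fall anywhere inside it, which is the "boundary won't be clean"
caveat mentioned above.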

> > 2- The cluster is combination of different machines w.r.t hardware (RAM,
> > Processor Speed,Number of Cores). Some tables are access frequently and some
> > access for fewer time so i want to place the most accessed tables on the
> > machines with highest RAM and processing speeds. e.g create table1, colFam1
> > @10.10.10.2,10.10.10.3,10.10.10.10.4 (list of region servers)
> >
>
> In general, a heterogeneous cluster is probably going to cause you
> headache; rare has hbase run on a cluster that was not homogeneous so
> my guess is that you'll run into 'interesting' issues.
>
> Currently the levers are not exposed for manually balancing the
> cluster. Our balancer *should* do this for you factoring in the
> machine resources but currently it does not.
>
> One thing you could do is turn the balancer off and do the balancing
> yourself externally. You can move regions either via the shell or
> script.
>

OK, I will look at the Java API to figure out how to move a region.

> > 4- I need to implement different priority scanning so the highest priority
> > query should be serve through good machines and this can be done if i able
> > to place the priority data on good machines. e.g if time= busy hours then
> > place data at good region servers.else if time=night then place data at
> > normal servers.
> >
>
> HBase will never let you do this. It won't scale.
>
> St.Ack
>

--
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>
