I see. Thanks Ryan.
-- Weiwei

On Mon, Mar 14, 2011 at 3:28 PM, Ryan Rawson <[email protected]> wrote:
> by default runs 1x/day. you can do it manually in the hbase shell by
> typing:
>
> hbase(main):001:0> major_compact "table_name"
>
> -ryan
>
>
> On Mon, Mar 14, 2011 at 3:25 PM, Weiwei Xiong <[email protected]> wrote:
> > Thanks for your info Ryan.
> > Does HBase do major compaction regularly or do I need to manually do
> > this?
> > If it's automatic, how frequently is it performed?
> > I am running with a replication factor of 1.
> > Thanks,
> > -- Weiwei
> >
> > On Mon, Mar 14, 2011 at 3:18 PM, Ryan Rawson <[email protected]> wrote:
> >>
> >> HDFS does the data rebalancing. Over time, as major compactions run
> >> and new data comes in, files are written first to the local node and
> >> then to remote nodes.
> >>
> >> What's the replication factor you are running? HDFS on 2 nodes is
> >> tricky, since you can either choose r=1 (no data protection) or r=2
> >> (all writes go to both nodes).
> >>
> >> The sweet spot is above 6 nodes alas.
> >>
> >> -ryan
> >>
> >> On Mon, Mar 14, 2011 at 3:12 PM, Weiwei Xiong <[email protected]> wrote:
> >> > Sorry I forgot to mention. I am using HBase 0.90.1 over HDFS
> >> > 0.20.append
> >> > Thanks,
> >> > -- Weiwei
> >> >
> >> > On Mon, Mar 14, 2011 at 3:10 PM, Weiwei Xiong <[email protected]> wrote:
> >> >>
> >> >> Thanks very much for your replies.
> >> >> Something was unclear in my previous emails. I had one node started
> >> >> first and another was added later. There were already some regions
> >> >> created on the first node. Then I started to import more data into
> >> >> the same table and found that it's always the first node that keeps
> >> >> serving the data writes.
> >> >> Actually I was expecting that the region data would be re-balanced
> >> >> to another data node. And I did see in the master log that the HBase
> >> >> master is trying to unassign some regions from the overloaded node
> >> >> and re-assign them to the less-loaded node. But the real data was
> >> >> never migrated.
> >> >> I think I observed the region index and cache rebalancing from the
> >> >> master log (correct me if I am wrong). Does anyone know how
> >> >> frequently this happens?
> >> >> Another question is, does HBase support data and I/O rebalancing? Or
> >> >> should I rely on HDFS to do data rebalancing? I guess HBase should
> >> >> also support data rebalancing, otherwise every time I restart HBase
> >> >> the regions will have to be rebalanced again. Will someone tell me
> >> >> how to configure or program HBase to do data rebalancing?
> >> >> Thanks,
> >> >> -- Weiwei
> >> >> On Mon, Mar 14, 2011 at 2:43 PM, Ryan Rawson <[email protected]> wrote:
> >> >>>
> >> >>> What version of HBase are you testing?
> >> >>>
> >> >>> Is it literally 0 vs N assignments?
> >> >>>
> >> >>> On Mon, Mar 14, 2011 at 1:18 PM, Weiwei Xiong <[email protected]> wrote:
> >> >>> > Thanks!
> >> >>> >
> >> >>> > I checked the master log and found some info like this:
> >> >>> > " timestamp ***, INFO org.apache.hadoop.hbase.master.HMaster:
> >> >>> > balance hri=***, src=***, dst=*** "
> >> >>> >
> >> >>> > So I assume the balancer is running. There's no failure info
> >> >>> > there, but I didn't see that the regions were actually balanced
> >> >>> > as the log states.
> >> >>> >
> >> >>> > Is it possible that the balancing won't work because I keep
> >> >>> > dumping data into the table?
> >> >>> >
> >> >>> > Thanks,
> >> >>> > -- Weiwei
> >> >>> >
> >> >>> > On Mon, Mar 14, 2011 at 12:15 PM, Stack <[email protected]> wrote:
> >> >>> >
> >> >>> >> Check the master log. See if the load balancer is running or
> >> >>> >> not. It usually runs every 5 minutes by default. It may not run
> >> >>> >> if regions are transitioning. It'll log regardless.
> >> >>> >>
> >> >>> >> St.Ack
> >> >>> >>
> >> >>> >> On Mon, Mar 14, 2011 at 10:50 AM, Weiwei Xiong <[email protected]> wrote:
> >> >>> >> > Hi,
> >> >>> >> >
> >> >>> >> > I recently set up a 2-node Hadoop and HBase cluster and am
> >> >>> >> > trying to load data into my HBase table using the HBase
> >> >>> >> > client.
> >> >>> >> >
> >> >>> >> > The issue that bothers me is that the data are always written
> >> >>> >> > into one node of the cluster, i.e., all the regions of the
> >> >>> >> > hbase table are on one node.
> >> >>> >> >
> >> >>> >> > Is there any configuration I need to change to make the load
> >> >>> >> > balanced?
> >> >>> >> >
> >> >>> >> > Thanks,
> >> >>> >> > -- w
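
For reference, the two intervals mentioned in this thread (major compaction about once a day, load balancer about every 5 minutes) appear to correspond to two settings in hbase-site.xml. The fragment below is a minimal sketch, assuming HBase 0.90.x: the property names are hbase.hregion.majorcompaction and hbase.balancer.period, and the values shown are what are believed to be the defaults, in milliseconds. Later releases may use different defaults, so check the hbase-default.xml shipped with your version before relying on them.

```xml
<!-- hbase-site.xml fragment (sketch; values assumed to be the 0.90.x defaults) -->
<configuration>
  <!-- Interval between automatic major compactions, in ms (86400000 = 1 day) -->
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>86400000</value>
  </property>
  <!-- Interval at which the master runs the region load balancer, in ms (300000 = 5 minutes) -->
  <property>
    <name>hbase.balancer.period</name>
    <value>300000</value>
  </property>
</configuration>
```

Note that the balancer moves region assignments between region servers. The underlying HDFS blocks only become local to the new server as compactions rewrite the files, which matches what Ryan describes above.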
