I see.  Thanks Ryan.

-- Weiwei

On Mon, Mar 14, 2011 at 3:28 PM, Ryan Rawson <[email protected]> wrote:

> Major compaction runs once a day by default. You can also trigger it
> manually in the hbase shell by typing:
>
> hbase(main):001:0> major_compact "table_name"
>
> -ryan
>
>
> On Mon, Mar 14, 2011 at 3:25 PM, Weiwei Xiong <[email protected]> wrote:
> > Thanks for your info Ryan.
> > Does HBase do major compaction regularly or do I need to do this manually?
> > If it's automatic, how frequently is it performed?
> > I am running with a replication factor of 1.
> > Thanks,
> > -- Weiwei
> >
> > On Mon, Mar 14, 2011 at 3:18 PM, Ryan Rawson <[email protected]> wrote:
> >>
> >> HDFS does the data rebalancing. Over time, as major compactions run
> >> and new data comes in, files are written first to the local node and
> >> then to remote nodes.
> >>
> >> What's the replication factor you are running?  HDFS on 2 nodes is
> >> tricky, since you can either choose r=1 (no data protection) or r=2
> >> (all writes go to both nodes).
> >>
> >> The sweet spot is above 6 nodes, alas.
> >>
> >> -ryan
> >>
> >> On Mon, Mar 14, 2011 at 3:12 PM, Weiwei Xiong <[email protected]> wrote:
> >> > Sorry, I forgot to mention: I am using HBase 0.90.1 over HDFS 0.20.append
> >> > Thanks,
> >> > -- Weiwei
> >> >
> >> > On Mon, Mar 14, 2011 at 3:10 PM, Weiwei Xiong <[email protected]> wrote:
> >> >>
> >> >> Thanks very much for your replies.
> >> >> Something was unclear in my previous emails. I started one node first
> >> >> and added another later, and some regions had already been created on
> >> >> the first node. Then I started importing more data into the same table
> >> >> and found that it is always the first node that keeps serving the data
> >> >> writes.
> >> >> Actually, I was expecting that the region data would be rebalanced to
> >> >> the other data node. I did see in the master log that the HBase master
> >> >> is trying to unassign some regions from the overloaded node and
> >> >> reassign them to the less-loaded node, but the real data was never
> >> >> migrated.
> >> >> I think I observed the region index and cache rebalancing in the
> >> >> master log (correct me if I am wrong). Does anyone know how frequently
> >> >> this happens?
> >> >> Another question: does HBase support data and I/O rebalancing, or
> >> >> should I rely on HDFS to do data rebalancing? I guess HBase should
> >> >> also support data rebalancing; otherwise, every time I restart HBase
> >> >> the regions will have to be rebalanced again. Can someone tell me how
> >> >> to configure or program HBase to do data rebalancing?
> >> >> Thanks,
> >> >> -- Weiwei
> >> >> On Mon, Mar 14, 2011 at 2:43 PM, Ryan Rawson <[email protected]> wrote:
> >> >>>
> >> >>> What version of HBase are you testing?
> >> >>>
> >> >>> Is it literally 0 vs N assignments?
> >> >>>
> >> >>> On Mon, Mar 14, 2011 at 1:18 PM, Weiwei Xiong <[email protected]> wrote:
> >> >>> > Thanks!
> >> >>> >
> >> >>> > I checked the master log and found some info like this:
> >> >>> > " timestamp ***, INFO org.apache.hadoop.hbase.master.HMaster: balance hri=***, src=***, dst=*** "
> >> >>> >
> >> >>> > So I assume the balancer is running. There are no failure messages
> >> >>> > there, but I didn't see the regions actually being balanced as the
> >> >>> > log states.
> >> >>> >
> >> >>> > Is it possible that the balancing won't work because I keep
> >> >>> > dumping data into the table?
> >> >>> >
> >> >>> > Thanks,
> >> >>> > -- Weiwei
> >> >>> >
> >> >>> > On Mon, Mar 14, 2011 at 12:15 PM, Stack <[email protected]> wrote:
> >> >>> >
> >> >>> >> Check the master log.  See if the load balancer is running or not.
> >> >>> >> It usually runs every 5 minutes by default.  It may not run if
> >> >>> >> regions are transitioning.  It'll log regardless.
> >> >>> >>
> >> >>> >> St.Ack
> >> >>> >>
> >> >>> >> On Mon, Mar 14, 2011 at 10:50 AM, Weiwei Xiong <[email protected]> wrote:
> >> >>> >> > Hi,
> >> >>> >> >
> >> >>> >> > I recently set up a 2-node Hadoop and HBase cluster and am
> >> >>> >> > trying to load data into my HBase table using the HBase client.
> >> >>> >> >
> >> >>> >> > The issue that bothers me is that the data are always written to
> >> >>> >> > one node of the cluster, i.e., all the regions of the HBase table
> >> >>> >> > are on one node.
> >> >>> >> >
> >> >>> >> > Is there any configuration I need to change to make the load
> >> >>> >> > balanced?
> >> >>> >> >
> >> >>> >> > Thanks,
> >> >>> >> > -- w
> >> >>> >> >
> >> >>> >>
> >> >>> >
> >> >>
> >> >
> >> >
> >
> >
>
