Re: Data is always written to one node

2011-03-15 Thread Bill Graham
On Mon, Mar 14, 2011 at 8:54 PM, Stack st...@duboce.net wrote: On Mon, Mar 14, 2011 at 4:09 PM, Bill Graham billgra...@gmail.com wrote: Anyway, it's been about a week and all regions for the table are still on 1 node. I see messages like this in the logs every 5 minutes: 2011-03-14

Re: Data is always written to one node

2011-03-14 Thread Stack
Check the master log. See if the load balancer is running or not. It usually runs every 5 minutes by default. It may not run if regions are transitioning. It'll log regardless. St.Ack On Mon, Mar 14, 2011 at 10:50 AM, Weiwei Xiong xion...@gmail.com wrote: Hi, I recently set up a 2-node

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
Thanks! I checked the master log and found some info like this: timestamp ***, INFO org.apache.hadoop.hbase.master.HMaster: balance hri=***, src=***, dst=*** So I assume the balancer is running. There's no failing info there, but I didn't see the regions were actually balanced as the log

Re: Data is always written to one node

2011-03-14 Thread Ryan Rawson
What version of HBase are you testing? Is it literally 0 vs N assignments? On Mon, Mar 14, 2011 at 1:18 PM, Weiwei Xiong xion...@gmail.com wrote: Thanks! I checked the master log and found some info like this: timestamp ***, INFO org.apache.hadoop.hbase.master.HMaster: balance hri=***,

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
Thanks very much for your replies. Something was unclear in my previous emails. I started one node first and added another later. There were already some regions created on the first node. Then I started to import more data into the same table and found that it's always the

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
Sorry I forgot to mention. I am using HBase 0.90.1 over HDFS 0.20.append Thanks, -- Weiwei On Mon, Mar 14, 2011 at 3:10 PM, Weiwei Xiong xion...@gmail.com wrote: Thanks very much for your replies. Something was unclear in my previous emails. I had one node started first and another was

Re: Data is always written to one node

2011-03-14 Thread Ryan Rawson
HDFS does the data rebalancing over time: as major compactions run and new data comes in, files are written first to the local node, then to remote nodes. What's the replication factor you are running? HDFS on 2 nodes is tricky, since you can either choose r=1 (no data protection) or r=2 (all writes

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
Thanks for your info, Ryan. Does HBase do major compaction regularly, or do I need to do this manually? If it's automatic, how frequently is it performed? I am running with a replication factor of 1. Thanks, -- Weiwei On Mon, Mar 14, 2011 at 3:18 PM, Ryan Rawson ryano...@gmail.com wrote: HDFS does the data

Re: Data is always written to one node

2011-03-14 Thread Ryan Rawson
By default it runs once a day. You can do it manually in the hbase shell by typing: hbase(main):001:0> major_compact 'table_name' -ryan On Mon, Mar 14, 2011 at 3:25 PM, Weiwei Xiong xion...@gmail.com wrote: Thanks for your info Ryan. Does HBase do major compaction regularly or do I need to manually do

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
I see. Thanks Ryan. -- Weiwei On Mon, Mar 14, 2011 at 3:28 PM, Ryan Rawson ryano...@gmail.com wrote: by default runs 1x/day. you can do it manually in the hbase shell by typing: hbase(main):001:0> major_compact 'table_name' -ryan On Mon, Mar 14, 2011 at 3:25 PM, Weiwei Xiong

Re: Data is always written to one node

2011-03-14 Thread Bill Graham
I hope I'm not hijacking the thread but I'm seeing what I think is a similar issue. About a week ago I loaded a bunch of data into a newly created table. It took about an hour and resulted in 12 regions being created on a single node. (Afterwards I remembered a conversation with JD where he

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
On Mon, Mar 14, 2011 at 4:09 PM, Bill Graham billgra...@gmail.com wrote: I hope I'm not hijacking the thread but I'm seeing what I think is a similar issue. About a week ago I loaded a bunch of data into a newly created table. It took about an hour and resulted in 12 regions being created on

Re: Data is always written to one node

2011-03-14 Thread Stack
Data balancing on HDFS is different from region balancing across your nodes. Maybe there is a bug in our balancer if there are only two nodes involved? If there is nothing to balance, because it's already balanced, it'll output this: 2011-03-09 00:40:35,537 INFO
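
To make the idea of region balancing concrete (as opposed to HDFS block placement), here is a minimal Python sketch of what "balanced" could mean for region counts. It is a toy illustration only and is not HBase's actual LoadBalancer: the function name plan_moves, the server names, and the region names are all hypothetical, and the rule used (move one region at a time from the most loaded server to the least loaded one until the counts differ by at most one) is an assumption chosen for simplicity.

```python
def plan_moves(assignments):
    """Return a list of (region, src, dst) moves that evens out region counts.

    assignments maps a server name to the list of regions it hosts.
    Toy rule (an assumption, not HBase's algorithm): keep moving one region
    from the most loaded server to the least loaded server until the two
    counts differ by at most one.
    """
    if not assignments:
        return []

    # Work on copies so the caller's data is left untouched.
    regions = {server: list(hosted) for server, hosted in assignments.items()}
    moves = []

    while True:
        src = max(regions, key=lambda s: len(regions[s]))
        dst = min(regions, key=lambda s: len(regions[s]))
        if len(regions[src]) - len(regions[dst]) <= 1:
            break  # nothing (more) to balance
        region = regions[src].pop()
        regions[dst].append(region)
        moves.append((region, src, dst))

    return moves


if __name__ == "__main__":
    # Hypothetical cluster: 12 regions on one server, none on the other.
    cluster = {
        "node-a": [f"region-{i}" for i in range(12)],
        "node-b": [],
    }
    for region, src, dst in plan_moves(cluster):
        print(f"balance hri={region}, src={src}, dst={dst}")
```

With 12 regions on one server and none on the other, this sketch plans 6 moves; an already even cluster yields an empty plan, which corresponds to the "nothing to balance" case described above.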

Re: Data is always written to one node

2011-03-14 Thread Stack
On Mon, Mar 14, 2011 at 4:09 PM, Bill Graham billgra...@gmail.com wrote: Anyway, it's been about a week and all regions for the table are still on 1 node. I see messages like this in the logs every 5 minutes: 2011-03-14 15:59:03,148 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping

Re: Data is always written to one node

2011-03-14 Thread Weiwei Xiong
On Mon, Mar 14, 2011 at 8:50 PM, Stack st...@duboce.net wrote: Data balancing on hdfs is different to region balancing across your nodes. Maybe there is a bug in our balancer if there are only two nodes involved? If there is nothing to balance, because it's already balanced, it'll output