(please don't leave unrelated discussions at the tail of your emails)

I thought I had never hit this issue, but to make sure I grepped my
logs and found that I had. I then grepped for the name of one of the
affected regions and looked at what was happening at that time (which
you should do in the future). I see something like this:

2011-04-05 15:12:19,037 DEBUG
org.apache.hadoop.hbase.zookeeper.ZKUtil:
master:60000-0x42ec2cece810b68 Retrieved 115 byte(s) of data from
znode /prodjobs/unassigned/0db7d1f58e4fced0a371aded0ddec281 and set
watcher; region=tsdb,\x00\x03\xCBM<}\x08\x00\x00\x01\x00\x00\x8A\x00\x00\x1D\x00\x01\xD1,1297818092053.0db7d1f58e4fced0a371aded0ddec281.,
server=sv4borg36,60020,1300313562191, state=RS_ZK_REGION_OPENED
2011-04-05 15:12:19,037 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENED, server=sv4borg36,60020,1300313562191,
region=0db7d1f58e4fced0a371aded0ddec281
...
2011-04-05 15:12:19,585 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Regions in
transition timed out:
tsdb,\x00\x03\xCBM<}\x08\x00\x00\x01\x00\x00\x8A\x00\x00\x1D\x00\x01\xD1,1297818092053.0db7d1f58e4fced0a371aded0ddec281.
state=OPEN, ts=1302041472920
2011-04-05 15:12:19,585 ERROR
org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN
for too long, we don't know where region was opened so can't do
anything
...
2011-04-05 15:12:22,504 DEBUG
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling
OPENED event for 0db7d1f58e4fced0a371aded0ddec281; deleting unassigned
node
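
The grep step described above can also be sketched in Python. This is
only an illustration: the function name lines_for_region and the sample
list are made up here, and it assumes one log entry per line as in the
raw master log (the excerpt above is wrapped by the mail client).

```python
def lines_for_region(log_lines, encoded_name):
    """Return the log lines that mention the given encoded region name."""
    return [line for line in log_lines if encoded_name in line]


# Hypothetical sample: three entries condensed from the excerpt above,
# joined back onto one line each.
sample = [
    "2011-04-05 15:12:19,037 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: "
    "Handling transition=RS_ZK_REGION_OPENED, server=sv4borg36,60020,1300313562191, "
    "region=0db7d1f58e4fced0a371aded0ddec281",
    "2011-04-05 15:12:19,585 ERROR org.apache.hadoop.hbase.master.AssignmentManager: "
    "Region has been OPEN for too long, we don't know where region was opened so "
    "can't do anything",
    "2011-04-05 15:12:22,504 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: "
    "Handling OPENED event for 0db7d1f58e4fced0a371aded0ddec281; deleting unassigned node",
]

matches = lines_for_region(sample, "0db7d1f58e4fced0a371aded0ddec281")
for line in matches:
    print(line)
```

Note that the ERROR line itself does not name the region, so it will not
match; the region name comes from the "Regions in transition timed out"
line logged just before it.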


So if I understand this correctly, the master had already received the
message via ZooKeeper, but it sat in a queue just long enough for the
region in transition (RIT) to time out before the OpenedRegionHandler
was finally able to process it. So in the end nothing looks broken; it
just means that the master is processing a LOT of regions being opened,
while it also took the region server a long time to get the region
opened.
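
To put numbers on that, here is a small Python sketch that computes the
gaps between the three timestamps in the excerpt above. The helper name
seconds_between and the variable names are made up for illustration; the
timestamps are copied from the log lines.

```python
from datetime import datetime

# Timestamp layout used by the log lines above (comma before the milliseconds).
FMT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start, end):
    """Return the number of seconds from start to end (both log timestamps)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


opened_seen = "2011-04-05 15:12:19,037"  # master reads RS_ZK_REGION_OPENED from ZK
timed_out = "2011-04-05 15:12:19,585"    # "Regions in transition timed out"
handled = "2011-04-05 15:12:22,504"      # OpenedRegionHandler handles the event

print(seconds_between(opened_seen, timed_out))
print(seconds_between(opened_seen, handled))
```

For this region the timeout fired about half a second after the master
saw the OPENED transition, and the handler ran about 3.5 seconds after
that transition was seen.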

There are currently a few states that don't get refreshed in ZK, for
example when a region is sitting in the region server's queue of
regions to be opened. Very often, when there are a lot of regions to
open (and worse if the RS has to replay recovered edits), some regions
in that state will time out. This needs more thinking.

J-D

2011/4/13 Gaojinchao:
> In HBase version 0.90.1.
>
> Has anyone experienced this?
>
> HMaster logs:
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,384 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
> 2011-04-08 16:33:09,385 ERROR 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for 
> too long, we don't know where region was opened so can't do anything
>
