Thanks Stack and Ted,

Yes, it looks like this is just the case of HBASE-3872.

Regarding *'is multiply assigned to region servers'*: I found these
messages after running add_table.rb and then assigning the regions.
Maybe we should disable the table before executing add_table.rb? Or
use 'unassign'?

Regarding *the recovery script I attached*: after I ran the script, I
can now insert values in that region. But hbck reports

====
Chain of regions in table STable contains less elements than are listed in META; visited=64035, edges=64044
ERROR: Found inconsistency in table STable
====

I did check hbck before running the script, in order to set the most
recent correct startkey and endkey for the missing meta record, but it
looks like the run introduced some short-cut path in the meta. I guess
it might cause loss of data in those 9 regions.

Are there any tools to check the hfiles on the fs and validate the
data, if we can find those 9 regions (we'll go through .META.)?

Thanks and regards,

Mao Xu-Feng

On Thu, Jul 7, 2011 at 3:21 AM, Stack <[email protected]> wrote:

> On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao <[email protected]> wrote:
> > It looks like we've lost a region, including the directory on hdfs
> > and its meta record as well. We need some more time to dig into the
> > log sea, to figure out the root cause.
> >
>
> You think it was https://issues.apache.org/jira/browse/HBASE-3872?
>
> > But first of all, we need to recover the meta, so that we can put
> > keys in that region. My understanding is that check_meta.rb and
> > add_table.rb could fix some meta issues in case the directory on
> > hdfs and its .regioninfo still exist.
> >
>
> Yes. add_table.rb will go out on fs and find regions for the table
> and rewrite that portion of .META. In 0.90 it will not assign them,
> though; you will likely need to disable then reenable the table to
> get the regions out on the cluster.
>
> Check_meta is likely the same. It looks for the hole and, if you pass
> the -fix, will create a new region to plug the hole. This is probably
> what you need (you may need to assign the region post running the
> script).
>
> > I modified check_meta.rb to achieve the insertion. I've tried it in
> > our environment and it seems to work; at least hbase hbck tells me
> > okay. I attached it with this message. Any comments are greatly
> > appreciated.
> >
>
> Good.
>
> > I have one more question. I create the new region record with both
> > startkey and endkey set. It seems possible that, if we're unlucky,
> > some split happens during the insertion, and then we might end up
> > with overlapping regions. I wonder how hbase handles this sort of
> > problem generally.
> >
>
> Well, you can't do cross-row transactions, which is sort of what you
> would need here in this case, so, yes, it's possible that there could
> be overlap, though, didn't you say the region was missing? (If so, how
> could it split?)
>
> > When I was playing with the test environment, I saw messages like
> > some region 'is multiply assigned to region servers'. It is also an
> > inconsistent scenario; how can I recover from this problem?
> >
>
> Can you figure out how this double-assign happened?
>
> To 'recover' you'd close it on one of the regionservers. Send a
> close_region 'REGION_NAME', 'SERVER_NAME' in the shell (read the shell
> close_region help to be sure, for my memory is not reliable).
>
> St.Ack
>
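
A note on the holes and overlaps discussed in this thread: both are breaks in the chain that the regions' start and end keys are supposed to form, where each region's end key equals the next region's start key. The sketch below only illustrates that idea. It is not hbck's or check_meta.rb's actual implementation. The function name `find_chain_problems`, the representation of a region as a `(start_key, end_key)` pair, and the use of plain Python string comparison in place of HBase's byte-wise key ordering are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: (start_key, end_key).
# An empty string stands for the table's first start key or last end key.
Region = Tuple[str, str]


def find_chain_problems(regions: List[Region]) -> List[str]:
    """Report holes and overlaps between regions ordered by start key."""
    if not regions:
        return ["no regions"]

    problems: List[str] = []
    ordered = sorted(regions, key=lambda r: r[0])

    # The first region should start at the beginning of the table.
    if ordered[0][0] != "":
        problems.append(f"hole: table start to {ordered[0][0]!r}")

    # Each region's end key should equal the next region's start key.
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if e1 == "" or e1 > s2:
            # The earlier region extends past the start of the later one.
            problems.append(f"overlap: ({s1!r}, {e1!r}) and ({s2!r}, {e2!r})")
        elif e1 < s2:
            # No region covers the keys between e1 and s2.
            problems.append(f"hole: {e1!r} to {s2!r}")

    # Some region should run to the end of the table.
    if all(end != "" for _, end in ordered):
        problems.append("hole: no region reaches the table end")

    return problems
```

Fed with the start and end keys read from the table's .META. rows, a check along these lines would list the key ranges worth inspecting by hand before and after plugging a hole.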
