The attachment didn't go through. Can you put the file on pastebin? Or you can open a JIRA and attach it there.
Thanks

On Jul 6, 2011, at 5:37 AM, Xu-Feng Mao <[email protected]> wrote:

> Hi,
>
> It looks like we've lost a region, including its directory on HDFS and its
> meta record. We need some more time to dig through the sea of logs to figure
> out the root cause.
>
> But first of all, we need to recover the meta, so that we can put keys in
> that region. My understanding is that check_meta.rb and add_table.rb could
> fix some meta issues in case the directory on HDFS and its .regioninfo still
> exist.
>
> In our situation, however, since we can no longer find the region directory,
> it seems that all we can do is insert a record into the meta and then
> assign it.
>
> I modified check_meta.rb to perform the insertion. I've tried it in our
> environment and it seems to work; at least hbase hbck tells me it's okay.
> I attached it to this message. Any comments are greatly appreciated.
>
> I have one more question. I create the new region record with both startkey
> and endkey set. It seems possible that, if we're unlucky, a split happens
> during the insertion and we end up with overlapping regions. I wonder
> how HBase generally handles this sort of problem.
>
> When I was playing with the test environment, I saw a message saying some
> region 'is multiply assigned to region servers'. That is also an inconsistent
> scenario; how can I recover from this problem?
>
> Thanks and regards,
>
> Mao Xu-Feng
>
> ---------- Forwarded message ----------
> From: Xu-Feng Mao <[email protected]>
> Date: Wed, Jul 6, 2011 at 7:20 AM
> Subject: Re: WrongRegionException and inconsistent table found
> To: Xu-Feng Mao <[email protected]>
> Cc: "[email protected]" <[email protected]>,
> "[email protected]" <[email protected]>
>
>
> I forgot the version, we are using cdh3u0.
>
> Mao Xu-Feng
>
> On 2011-7-6, at 0:59, Xu-Feng Mao <[email protected]> wrote:
>
>> We also checked the master log; nothing interesting found.
>>
>> On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao <[email protected]> wrote:
>> Hi,
>>
>> We're running an HBase cluster with 37 regionservers. Today, we found
>> lots of WrongRegionException errors when putting objects into it.
>>
>> hbase hbck -details
>> reports that
>> ====
>> Chain of regions in table STable is broken; edges does not contain
>> ztxrGmCwn-6BE32s3cX1TNeHU_I=
>> ERROR: Found inconsistency in table STable
>> ====
>>
>> echo "scan '.META.'"| hbase shell &> meta.txt
>> grep -A1 "STARTKEY => 'EStore_everbox_z" meta.txt
>> reports that
>> ====
>> Ck=,1308802977279.71ffb1 1ffb10b8b95fd47b3eff468d00ab4e9.',
>> STARTKEY => 'ztn0ukLW
>> 0b8b95fd47b3eff468d00ab4 d1NSU3fuXKkkWq5ZVCk=', ENDKEY =>
>> 'ztqdVD8fCMP-dDbXUAydan
>> e9. kboD4=', ENCODED =>
>> 71ffb10b8b95fd47b3eff468d00ab4e9, TABLE => {{NAME =
>> --
>> D4=,1305619724446.c45191 45191821053d03537596f4a2e759718.',
>> STARTKEY => ztqdVD8f
>> 821053d03537596f4a2e7597 CMP-dDbXUAydankboD4=', ENDKEY =>
>> 'ztxrGmCwn-6BE32s3cX1TN
>> 18. eHU_I=', ENCODED =>
>> c45191821053d03537596f4a2e759718, TABLE => {{NAME =
>> --
>> pA=,1309455605341.c5c5f5 5c5f578722ea3f8d1b099313bec8298.',
>> STARTKEY => 'zu3zVaLc
>> 78722ea3f8d1b099313bec82 GDnnpjKCbnboXgAFspA=', ENDKEY =>
>> 'zu7qkr5fH6MMJ3GxbCv_0d
>> 98. 6g8yI=', ENCODED =>
>> c5c5f578722ea3f8d1b099313bec8298, TABLE => {{NAME =
>> ====
>>
>> It looks like the meta indeed has a hole. (We ran scan '.META.' several
>> times to confirm it's not a transient state.)
>> We've tried hbase hbck -fix; it does not help.
>>
>> We found a thread 'wrong region exception' from about two months ago. Stack
>> suggested a 'little surgery' like
>> ====
>> So, make sure you actually have a hole. Dump out your meta table:
>>
>> echo "scan '.META.'"| ./bin/hbase shell &> /tmp/meta.txt
>>
>> Then look ensure that there is a hole between the above regions
>> (compare start and end keys... the end key of one region needs to
>> match the start key of the next).
>>
>> If indeed a hole, you need to do a little surgery inserting a new
>> missing region (hbck should fix this but it doesn't have the smarts
>> just yet).
>>
>> Basically, you create a new region with start and end keys to fill the
>> hole then you insert it into .META. and then assign it. There are
>> some scripts in our bin directory that do various parts of this. I'm
>> pretty sure its beyond any but a few figuring this mess out so if you
>> do the above foot work and provide a few more details, I'll hack up
>> something for you (and hopefully something generalized to be use by
>> others later, and later to be integrated into hbck).
>> ====
>>
>> Can anyone give a detailed example? Step-by-step instructions would be
>> greatly appreciated.
>> My understanding is that we should
>> 1. Since we already know which region is lost, we have its start and end
>> keys.
>> 2. Generate the row that represents the missing region. But how can I
>> generate the encoded name?
>> It looks like I need
>> column=info:server, column=info:serverstartcode and column=info:regioninfo
>> for the missing region.
>> And column=info:regioninfo includes so much information. How do I generate
>> it piece by piece?
>> As for the row name, it consists of the table name, start key, encoded
>> name, and one more long number;
>> how do I get this number?
>> 3. Use the assign command in the hbase shell.
>>
>> We also tried check_meta.rb --fix, it reports
>> ====
>> 11/07/06 00:09:08 WARN check_meta: hole after REGION => {NAME =>
>> 'STable,ztqdVD8fCMP-dDbXUAydankboD4=,1305619724446.c45191821053d03537596f4a2e759718.',
>> STARTKEY => 'ztqdVD8fCMP-dDbXUAydankboD4=', ENDKEY =>
>> 'ztxrGmCwn-6BE32s3cX1TNeHU_I=', ENCODED => c45191821053d03537596f4a2e759718,
>> TABLE => {{NAME => 'STable', FAMILIES => [{NAME => 'file', BLOOMFILTER =>
>> 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3',
>> TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE
>> => 'true'}, {NAME => 'filelength', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
>> => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647',
>> BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
>> 'userbucket', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION
>> => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'userpass',
>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE',
>> VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY =>
>> 'false', BLOCKCACHE => 'true'}]}}
>> 11/07/06 00:28:40 WARN check_meta: Missing .regioninfo:
>> hdfs://hd0013.c.gj.com:9000/hbase/STable/3e6faca40a7ccad7ed8c0b5848c0f945/.regioninfo
>> ====
>>
>> The problem is still there. BTW, what about the blue warning? Is this a
>> serious issue?
>> The situation is quite hard for us. It looks like even if we can fill the
>> hole in the meta, we would lose all the data in the hole region, right?
>>
>> Thanks and regards,
>>
>> Mao Xu-Feng
>>
>
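For reference, the hole check described in the quoted thread (the end key of one region needs to match the start key of the next) could be sketched roughly as below. This is only a minimal Python illustration, not anything that ships with HBase: the function name `find_holes` is made up, it assumes the (start key, end key) pairs have already been pulled out of the `.META.` dump by hand, and it treats the three regions from the grep output above as if they were the whole table, which they are not. Keys are compared as plain strings, which matches byte order only for ASCII keys like these.

```python
def find_holes(regions):
    """Return (hole_start, hole_end) pairs for gaps in a chain of regions.

    `regions` is a list of (start_key, end_key) string pairs for one table.
    Hypothetical helper for illustration; overlaps are not reported.
    """
    # Order regions by start key, the same way rows are ordered in .META.
    ordered = sorted(regions, key=lambda region: region[0])
    holes = []
    # Walk adjacent pairs: each end key should equal the next start key.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if prev_end < next_start:
            holes.append((prev_end, next_start))
    return holes


# Start/end keys reassembled from the wrapped scan output quoted above.
regions = [
    ("ztn0ukLWd1NSU3fuXKkkWq5ZVCk=", "ztqdVD8fCMP-dDbXUAydankboD4="),
    ("ztqdVD8fCMP-dDbXUAydankboD4=", "ztxrGmCwn-6BE32s3cX1TNeHU_I="),
    ("zu3zVaLcGDnnpjKCbnboXgAFspA=", "zu7qkr5fH6MMJ3GxbCv_0d6g8yI="),
]

for hole_start, hole_end in find_holes(regions):
    print("hole between", hole_start, "and", hole_end)
```

With those three regions the only gap reported starts at ztxrGmCwn-6BE32s3cX1TNeHU_I=, the same key hbck complains about. The real end key of the missing region should be confirmed against the full `.META.` scan before anything is inserted.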
