Hi Dimitry,

That is interesting. I have seen this before. This is going to be caused by a bad split. Can you please send the output of hadoop fs -lsr /hbase/documents? Once I have it, I will let you know which files you need to delete to recover safely from this error.
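For readers following along, the kind of check that listing makes possible can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this thread. It assumes the usual layout /hbase/<table>/<encoded region>/<family>/<file>, and that a split leaves reference files named <hfile>.<encoded parent region> in the daughter region, each pointing at <hfile> in the parent's directory. The function name find_dangling_references and the SAMPLE listing are made up for the example; SAMPLE reuses the IDs quoted below but is not real output from this cluster.

```python
import re

# Assumption: reference files live at
#   /hbase/<table>/<region>/<family>/<hfile>.<encoded parent region>
# and point at
#   /hbase/<table>/<encoded parent region>/<family>/<hfile>
REF_RE = re.compile(
    r"^(?P<table_dir>/hbase/[^/]+)/(?P<region>[0-9a-f]{32})/(?P<family>[^/]+)/"
    r"(?P<hfile>[0-9a-f]+)\.(?P<parent>[0-9a-f]{32})$"
)

# Hypothetical listing in the shape of `hadoop fs -lsr` output (not real data).
SAMPLE = """\
drwxr-xr-x   - hbase hbase    0 2013-05-02 10:00 /hbase/documents/79c619508659018ff3ef0887611eb8f7
drwxr-xr-x   - hbase hbase    0 2013-05-02 10:00 /hbase/documents/79c619508659018ff3ef0887611eb8f7/d
-rw-r--r--   3 hbase hbase   70 2013-05-02 10:00 /hbase/documents/79c619508659018ff3ef0887611eb8f7/d/0707b1ec4c6b41cf9174e0d2a1785fe9.5b9c16898a371de58f31f0bdf86b1f8b
"""


def find_dangling_references(lsr_output):
    """Return (reference_path, missing_target_path) pairs from a listing."""
    # The path is the last whitespace-separated field of each listing line.
    paths = [line.split()[-1] for line in lsr_output.splitlines() if line.strip()]
    existing = set(paths)
    dangling = []
    for path in paths:
        match = REF_RE.match(path)
        if match is None:
            continue  # not a reference file under the assumed naming scheme
        target = "{table_dir}/{parent}/{family}/{hfile}".format(**match.groupdict())
        if target not in existing:
            dangling.append((path, target))
    return dangling


if __name__ == "__main__":
    for ref, target in find_dangling_references(SAMPLE):
        print("reference:", ref)
        print("  missing target:", target)
```

Anything this reports is only a candidate for closer inspection; which files are actually safe to delete depends on the full listing.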
On Thu, May 2, 2013 at 10:17 AM, Dimitri Goldin <[email protected]> wrote:

> Hi,
>
> I have a strange RIT problem with a single region of our biggest table.
> After an hbck (wondering why it only discovered it at that time) it
> started trying to assign a region which has been bouncing between
> OFFLINE/PENDING_OPEN/OPENING for two days.
>
> I already tried close_region/unassign with force and even the good old
> delete of the /hbase node in ZooKeeper, but we still experience the same issue.
>
> Interestingly, the full region ID is
> 'documents,7128586022887322720,1363696791400.79c619508659018ff3ef0887611eb8f7.'
> but in the exception the filename it tries to open is:
> '/hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b/d/0707b1ec4c6b41cf9174e0d2a1785fe9'.
>
> Rough sequence from the logs seems to be the following:
>
> ===
> * Received request to open region:
>   documents,7128586022887322720,1363696791400.79c619508659018ff3ef0887611eb8f7.
>
> * Setting up tabledescriptor config now ...
>
> * Opening of region {NAME =>
>   'documents,7128586022887322720,1363696791400.79c619508659018ff3ef0887611eb8f7.',
>   STARTKEY => '7128586022887322720',
>   ENDKEY => '7130716361635801616',
>   ENCODED => 79c619508659018ff3ef0887611eb8f7,} failed, marking as
>   FAILED_OPEN in ZK
>
> * File does not exist:
>   /hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b/d/0707b1ec4c6b41cf9174e0d2a1785fe9 [...]
> ===
>
> As the exception implies, '/hbase/documents/5b9c16898a371de58f31f0bdf86b1f8b'
> does not exist, while the '/hbase/documents/79c619508659018ff3ef0887611eb8f7'
> folder exists and contains all necessary files.
>
> I've checked .META., thinking that the region's ENCODED field might
> be broken, which is not the case judging by the third log message.
> Otherwise, I'm out of ideas as to how the encoded-region part might get
> switched with another value.
>
> Any ideas what might cause such behaviour and how to fix it?
>
> HBase version: 0.92.1-cdh4.1.2
>
> Complete log message, including the stack trace of the FileNotFound
> exception:
> http://fpaste.org/10005/04104136/ (sorry for the format)
>
> Thanks in advance,
> Dimitry
>
> --
> Dimitry Goldin
> Software Developer
>
> Neofonie GmbH
> Robert-Koch-Platz 4
> 10115 Berlin
>
> T: +49 30 246 27 413
>
> [email protected]
> http://www.neofonie.de
>
> Handelsregister
> Berlin-Charlottenburg: HRB 67460
>
> Geschäftsführung:
> Thomas Kitlitschko

--
Kevin O'Dell
Systems Engineer, Cloudera
