Attached is my original email to the list, which contains code for a tool to
repair your "hole" in .META.



-----Original Message-----
From: Stuart Smith [mailto:[email protected]] 
Sent: Saturday, October 29, 2011 1:39 PM
To: [email protected]
Subject: Re: PENDING_CLOSE for too long

Hello Geoff,

  I usually don't show up here, since I use CDH, and good form means I should 
stay on CDH-users,
But!
  I've been seeing the same issues for months:

 - PENDING_CLOSE too long, master tries to reassign - I see a continuous
stream of these.
 - WrongRegionExceptions due to overlapping regions & holes in the regions.

I just spent all day yesterday cribbing off of St.Ack's check_meta.rb script to
write a Java program to fix up overlaps & holes in an offline fashion (HBase
down, directly on HDFS), and will start testing next week (cross my fingers!).

It seems like the pending close messages can be ignored?
And once I test my tool, and confirm I know a little bit about what I'm doing, 
maybe we could share notes?

Take care,
  -stu



________________________________
From: Geoff Hendrey <[email protected]>
To: [email protected]
Cc: [email protected]
Sent: Saturday, September 3, 2011 12:11 AM
Subject: RE: PENDING_CLOSE for too long

"Are you having trouble getting to any of your data out in tables?"

depends what you mean. We see corruptions from time to time that prevent
us from getting data, one way or another. Today's corruption was regions
with duplicate start and end rows. We fixed that by deleting the
offending regions from HDFS, and running add_table.rb to restore the
meta. The other common corruption is the holes in ".META." that we
repair with a little tool we wrote. We'd love to learn why we see these
corruptions with such regularity (seemingly much higher than others on
the list).
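
One way to spot the "duplicate start and end rows" corruption described above is to group regions by their row range and report any range that appears more than once. The following is only a sketch under stated assumptions: the class, its `Region` holder, and the sample names are hypothetical, keys are modelled as plain strings rather than HBase byte arrays, and the region list is assumed to have been read out of .META. by some other means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateRegionFinder {

    /** Name plus start and end row of one region, as plain strings for readability. */
    static final class Region {
        final String name;
        final String startRow;
        final String endRow;

        Region(String name, String startRow, String endRow) {
            this.name = name;
            this.startRow = startRow;
            this.endRow = endRow;
        }
    }

    /**
     * Maps each [startRow, endRow] pair that occurs more than once to the
     * names of the regions that share it. Ranges seen only once are dropped.
     */
    static Map<List<String>, List<String>> findDuplicates(List<Region> regions) {
        Map<List<String>, List<String>> byRange = new LinkedHashMap<>();
        for (Region r : regions) {
            byRange.computeIfAbsent(Arrays.asList(r.startRow, r.endRow),
                    k -> new ArrayList<>()).add(r.name);
        }
        byRange.values().removeIf(names -> names.size() < 2);
        return byRange;
    }

    public static void main(String[] args) {
        // Made-up sample: r1 and r2 cover the same range.
        List<Region> regions = Arrays.asList(
                new Region("r1", "a", "m"),
                new Region("r2", "a", "m"),
                new Region("r3", "m", ""));
        findDuplicates(regions).forEach((range, names) ->
                System.out.println(range + " is claimed by " + names));
    }
}
```

This only reports candidates. Deciding which of the duplicate regions to delete from HDFS is still a manual call.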

We will implement the timeout you suggest, and see how it goes.

Thanks,
Geoff

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of
Stack
Sent: Friday, September 02, 2011 10:51 PM
To: [email protected]
Cc: [email protected]
Subject: Re: PENDING_CLOSE for too long

Are you having trouble getting to any of your data out in tables?

To get rid of them, try restarting your master.

Before you restart your master, do "HBASE-4126  Make timeoutmonitor
timeout after 30 minutes instead of 3"; i.e. set
"hbase.master.assignment.timeoutmonitor.timeout" to 1800000 in
hbase-site.xml.
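
As a sanity check on the value above, 30 minutes is 1800000 ms and 3 minutes is 180000 ms. The small sketch below does the conversion and prints a property element in the usual name/value layout of hbase-site.xml. The class and method names are made up for illustration; only the property name and the 30-minute figure come from the message.

```java
import java.util.concurrent.TimeUnit;

final class TimeoutMonitorSetting {

    // Property name as quoted in the message above.
    static final String KEY = "hbase.master.assignment.timeoutmonitor.timeout";

    /** Converts a timeout in minutes to the millisecond value the property takes. */
    static long minutesToMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    /** Renders one property element in the name/value layout used by site files. */
    static String toPropertyXml(String name, long value) {
        return "<property>\n"
                + "  <name>" + name + "</name>\n"
                + "  <value>" + value + "</value>\n"
                + "</property>";
    }

    public static void main(String[] args) {
        // Prints the fragment to paste inside <configuration> in hbase-site.xml.
        System.out.println(toPropertyXml(KEY, minutesToMillis(30)));
    }
}
```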

St.Ack

On Fri, Sep 2, 2011 at 1:40 PM, Geoff Hendrey <[email protected]>
wrote:
> In the master logs, I am seeing "regions in transition timed out" and
> "region has been PENDING_CLOSE for too long, running forced unassign".
> Both of these log messages occur at INFO level, so I assume they are
> innocuous. Should I be concerned?
>
>
>
> -geoff
>
>
--- Begin Message ---
Here is some code to fix a .META. hole. After running it, you can copy the data
from the old region in HDFS into the newly created region. This is just hack
code. It needs to be turned into a utility with decent parameters that can be
passed in to automate it. Right now you manually paste your info into this code
and compile it... yuck. But it's the hack I've got, and it works for me.

======================

 

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Writables;

public class FixMetaTable {

    // Replace with YOUR regionName that needs fixin'
    public static String regionNameKey = "YOUR REGION **NAME** here";

    public static void main(String[] args) throws InterruptedException {
        try {
            System.out.println("Entering the Program to Edit .META. table");
            Configuration hConfig = HBaseConfiguration.create();
            hConfig.set("hbase.zookeeper.quorum", "your-quorum-here");

            HBaseAdmin admin = new HBaseAdmin(hConfig);

            HTable hTable = new HTable(hConfig, Bytes.toBytes(".META."));
            Get get = new Get(Bytes.toBytes(regionNameKey));
            Result result = hTable.get(get);
            byte[] bytes = result.getValue(HConstants.CATALOG_FAMILY,
                    HConstants.REGIONINFO_QUALIFIER);

            HRegionInfo closedRegion = Writables.getHRegionInfo(bytes);
            // Close the existing region if open.
            admin.closeRegion(closedRegion.getRegionName(), null);
            System.out.println("Closed the Region "
                    + closedRegion.getRegionNameAsString());

            // Read the existing hregioninfo.
            HTable readTable = new HTable(hConfig, Bytes.toBytes(".META."));
            Get readGet = new Get(Bytes.toBytes(regionNameKey));
            Result readResult = readTable.get(readGet);
            byte[] readBytes = readResult.getValue(HConstants.CATALOG_FAMILY,
                    HConstants.REGIONINFO_QUALIFIER);
            HRegionInfo existingRegion = Writables.getHRegionInfo(readBytes);
            System.out.println("Read the existing region info after closing "
                    + existingRegion.getRegionNameAsString());

            // Use existing hregioninfo htabledescriptor and this construction.
            HTableDescriptor descriptor =
                    new HTableDescriptor(existingRegion.getTableDesc());

            // Just changing the end key, nothing else.
            // FIXME should be the corrected endkey; it should be the next
            // startkey in .META.
            HRegionInfo newRegion = new HRegionInfo(descriptor,
                    Bytes.toBytes("startkey-of-row-whose-endkey-needs-fixing"),
                    Bytes.toBytes("FIXME"));

            byte[] value = Writables.getBytes(newRegion);

            // Insert the new entry in .META. using the new hregioninfo name as
            // row key, and add an info:regioninfo whose contents is the
            // serialized new hregioninfo.
            Put put = new Put(newRegion.getRegionName());
            put.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER,
                    value);
            HTable metaTable = new HTable(hConfig, ".META.");
            metaTable.put(put);
            System.out.println("Put a new Region "
                    + newRegion.getRegionNameAsString() + " End key is "
                    + Bytes.toString(newRegion.getEndKey()));

            // Delete the original row from .META.
            Delete del = new Delete(closedRegion.getRegionName());
            metaTable.delete(del);
            System.out.println("Deleted the closed region "
                    + closedRegion.getRegionNameAsString());

            // Assign the new region.
            admin.assign(newRegion.getRegionName(), true);
            System.out.println("Assigned the new region "
                    + newRegion.getRegionNameAsString());

        } catch (IOException ex) {
            Logger.getLogger(FixMetaTable.class.getName()).log(Level.SEVERE,
                    null, ex);
        }
    }
}
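
The FIXME end key in the code above has to be looked up by hand: it is the next start key in .META. A rough sketch of how that lookup might be automated follows. It is not part of the original tool. The class and its `Region` and `Fix` holders are hypothetical, keys are modelled as plain strings rather than byte arrays, and the list of regions is assumed to have already been read from .META. in start-key order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MetaHoleFinder {

    /** Start and end key of one region, as plain strings for readability. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** A region with a dangling end key, plus the end key that would close the hole. */
    static final class Fix {
        final Region region;
        final String correctedEndKey;

        Fix(Region region, String correctedEndKey) {
            this.region = region;
            this.correctedEndKey = correctedEndKey;
        }
    }

    /**
     * Expects regions sorted by start key. Reports each region whose end key
     * is not the start key of any region, paired with the start key of the
     * region that follows it. The last region in the list is not checked.
     */
    static List<Fix> findHoles(List<Region> sortedRegions) {
        Set<String> startKeys = new HashSet<>();
        for (Region r : sortedRegions) {
            startKeys.add(r.startKey);
        }
        List<Fix> fixes = new ArrayList<>();
        for (int i = 0; i + 1 < sortedRegions.size(); i++) {
            Region current = sortedRegions.get(i);
            if (!startKeys.contains(current.endKey)) {
                fixes.add(new Fix(current, sortedRegions.get(i + 1).startKey));
            }
        }
        return fixes;
    }

    public static void main(String[] args) {
        // Made-up keys: the region starting at "b" ends at "c",
        // but no region starts at "c".
        List<Region> regions = Arrays.asList(
                new Region("", "b"),
                new Region("b", "c"),
                new Region("d", ""));
        for (Fix fix : findHoles(regions)) {
            System.out.println("region starting at '" + fix.region.startKey
                    + "': end key '" + fix.region.endKey
                    + "' should be '" + fix.correctedEndKey + "'");
        }
    }
}
```

Each reported pair gives the two values that are currently pasted into FixMetaTable by hand: the start key of the broken region and the corrected end key.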

 

From: Geoff Hendrey 
Sent: Wednesday, September 07, 2011 12:11 PM
To: [email protected]
Cc: [email protected]; [email protected]
Subject: Re: What's .META.'s hole?

 

Not in front of my laptop now, but I'll email our hole fixing code to the list 
this PM.

Sent from my iPhone


On Sep 7, 2011, at 11:57 AM, "Jonathan Hsieh" <[email protected]> wrote:

Hey Geoff,

I've been working on some code that should help show (HBASE-4321, HBASE-4322)
where holes are (and overlaps and other kinds of meta problems).

Can you point me to the jiras/code with fixup routines?

Thanks,
Jon.


On Sun, Sep 4, 2011 at 8:15 PM, Geoff Hendrey <[email protected]> wrote:

> a "hole" means when you run "hbase hbck" you see "chain of regions in
> table <TABLENAME> is broken; edges does not contain <STARTROW>"
>
> What happens is the endrow of an entry in .META. points to a nonexistent
> start row. Based on StAck's steps for fixing the problem, we wrote a tool
> to repair it. Essentially the tool fixes the end row so that it points
> to the next startrow in .META.
>
> I believe Rohit posted the code that will fix the hole...
>
> -geoff
>
> -----Original Message-----
> From: Mingjian Deng [mailto:[email protected]]
> Sent: Sunday, September 04, 2011 6:10 PM
> To: [email protected]
> Subject: What's .META.'s hole?
>
> Hi All:
>    I don't know what a hole in .META. means.
>    There is a region in my cluster that has "serverinfo" and "startcode"
> in .META. without "regioninfo". Is that a hole? How did it happen, and
> how do I fix it?
>



--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [email protected]


--- End Message ---
