When a regionserver starts up, it chooses a random number, its STARTCODE. Every time it reports in to the master, it volunteers its start code as part of its message.

The server each region is assigned to (its SERVER in the log snippet below) and that server's STARTCODE are kept in distinct columns of the .META. table; i.e. each SERVER entry has an associated STARTCODE.

The master scans the .META. table periodically to ensure all regions are allocated. Part of its check ensures that the STARTCODEs the regionservers have volunteered match what it has stamped into .META. If there is a discrepancy, something has happened: a restart of the cluster or, at a minimum, a crash or restart of that regionserver. The region is marked 'not valid' and the master goes about the business of cleanup and reassignment of the region.
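To make the check concrete, here is a rough sketch of the comparison described above. It is not the actual master code: `MetaRow`, `assignment_is_valid` and `scan_meta` are made-up names for illustration, and the map of reported start codes is assumed to be whatever the master has collected from regionserver reports.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MetaRow:
    """Hypothetical view of one .META. row: a region plus its stamped assignment."""
    region: str
    server: Optional[str]     # the SERVER column
    startcode: Optional[int]  # the STARTCODE column


def assignment_is_valid(row: MetaRow, reported: Dict[str, int]) -> bool:
    """True only if the stamped server has reported in with the same start code."""
    if row.server is None or row.startcode is None:
        return False
    # A missing server yields None, which never equals a stamped start code.
    return reported.get(row.server) == row.startcode


def scan_meta(rows: List[MetaRow], reported: Dict[str, int]) -> List[str]:
    """Return the regions whose current assignment is not valid."""
    return [r.region for r in rows if not assignment_is_valid(r, reported)]


# Values taken from the log snippet below: .META. holds the old start code,
# while the restarted regionserver reports a new one.
rows = [MetaRow("__DDBC_META_TABLE__,,1215473430109", "127.0.0.1:60012", 1215473423349)]
stale = scan_meta(rows, {"127.0.0.1:60012": 1215475600737})
```

With the numbers from the log, the stamped and reported start codes differ, so the region ends up in `stale` and would be queued for reassignment.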

Looking at the code, we have the concept of an 'initial' scan. I wonder if things would run faster for you if on the initial scan we just cleared all SERVER and STARTCODE entries in .META. rather than waiting on regionserver reports?
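Roughly what I have in mind, as a sketch only. This is the idea being floated, not existing behaviour; `initial_scan` and the dict-based rows are invented for illustration and stand in for whatever the scanner really iterates over.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def initial_scan(meta_rows: List[Row]) -> List[str]:
    """On the first scan after startup, drop every stamped assignment.

    Clears the (hypothetical) 'server' and 'startcode' fields of each row
    and returns the region names, all of which are now unassigned.
    """
    unassigned = []
    for row in meta_rows:
        row["server"] = None
        row["startcode"] = None
        unassigned.append(str(row["region"]))
    return unassigned


meta = [
    {"region": "__DDBC_META_TABLE__,,1215473430109",
     "server": "127.0.0.1:60012",
     "startcode": 1215473423349},
]
to_assign = initial_scan(meta)
```

The trade-off is that every region gets reassigned on startup, including any whose stamped assignment would have turned out to be fine, so this would only help in cases like yours where nothing stamped in .META. can still be valid.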

St.Ack



Clint Morgan wrote:
Hi all,

I'm having a little problem with our tests that use hbase.

First, I run a test which generates all of the hbase tables, and exits.

Then for each test, I copy over the hbase directory, and then start up hbase.

So far, so good, hbase quickly starts up and finds all my tables.
However, I then get NotServingRegion exceptions for the next minute or
so. Afterwards the regions get assigned, and everything is fine.

Looking at the logs, just before the regions start to come online, I see:

[07/07/08 17:07:40] 60780  [ger.metaScanner] INFO
adoop.hbase.master.BaseScanner  - RegionManager.metaScanner scanning
meta region {regionname: .META.,,1, startKey: <>, server:
127.0.0.1:60012}
[07/07/08 17:07:40] 60746  [dler 0 on 60001] DEBUG
oop.hbase.master.ServerManager  - Total Load: 2, Num Servers: 1, Avg
Load: 2.0
[07/07/08 17:07:40] 60812  [ger.metaScanner] DEBUG
adoop.hbase.master.BaseScanner  - RegionManager.metaScannerREGION =>
{NAME => '__DDBC_META_TABLE__,,1215473430109', STARTKEY => '', ENDKEY
=> '', ENCODED => 928348903, TABLE => {NAME => '__DDBC_META_TABLE__',
FAMILIES => [{NAME => 'Meta', VERSIONS => 3, COMPRESSION => 'NONE',
IN_MEMORY => false, BLOCKCACHE => false, LENGTH => 2147483647, TTL =>
FOREVER, BLOOMFILTER => NONE}]}}, SERVER => '127.0.0.1:60012',
STARTCODE => 1215473423349
[07/07/08 17:07:40] 60812  [ger.metaScanner] DEBUG
adoop.hbase.master.BaseScanner  - Current assignment of
__DDBC_META_TABLE__,,1215473430109 is not valid: serverInfo: address:
127.0.0.1:60012, startcode: 1215475600737, load: (requests: 2 regions:
2), passed startCode: 1215473423349, storedInfo.startCode:
1215475600737, unassignedRegions: false, pendingRegions: false
...

My question: what is going on here, and how can I speed it up?

cheers,
-clint
