[ https://issues.apache.org/jira/browse/HBASE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack reopened HBASE-3381:
--------------------------
What I committed was crap. More often than not, we'd get woken up because the
worker thread was done, but we'd then interrupt it anyway even though it was
already on its way out. That made for confusing logging.
I have a new patch that I've been testing. Will put it up in a sec.
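
To illustrate the race described above, here is a minimal sketch of waiting on a worker thread and interrupting it only if it is still running. It is not the patch referred to in this comment; the class and method names are hypothetical and it uses only plain `java.lang.Thread` calls.

```java
public class InterruptOnlyIfAlive {
  /**
   * Waits up to timeoutMs (must be > 0, since join(0) waits forever) for the
   * worker, then interrupts it only if it has not finished yet.
   * Returns true if an interrupt was actually sent.
   */
  static boolean waitThenInterruptIfStillRunning(Thread worker, long timeoutMs)
      throws InterruptedException {
    worker.join(timeoutMs);
    if (!worker.isAlive()) {
      // Worker already exited: nothing to interrupt, nothing confusing to log.
      return false;
    }
    worker.interrupt();
    // Wait for the worker to notice the interrupt and exit.
    worker.join();
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread quick = new Thread(() -> { }, "quick-worker");
    quick.start();
    System.out.println("quick interrupted: "
        + waitThenInterruptIfStillRunning(quick, 1000));

    Thread slow = new Thread(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        // Interrupted by the waiter: just exit.
      }
    }, "slow-worker");
    slow.start();
    System.out.println("slow interrupted: "
        + waitThenInterruptIfStillRunning(slow, 50));
  }
}
```

A worker can still finish between the `isAlive()` check and the `interrupt()` call; interrupting a thread that has already terminated has no effect, so the sketch tolerates that window.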
> Interrupt of a region open comes across as a successful open
> ------------------------------------------------------------
>
> Key: HBASE-3381
> URL: https://issues.apache.org/jira/browse/HBASE-3381
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Fix For: 0.90.0
>
> Attachments: 3381.txt
>
>
> Meta was offline when below happened:
> {code}
> 2010-12-21 19:45:23,023 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12d0a53c540000e Attempting to transition node 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
> 2010-12-21 19:45:23,046 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12d0a53c540000e Successfully transitioned node 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
> 2010-12-21 19:45:26,379 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Interrupting thread Thread[PostOpenDeployTasks:337038b50e467fbd6b031f278bbd9c22,5,main]
> 2010-12-21 19:45:26,379 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12d0a53c540000e Attempting to transition node 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
> 2010-12-21 19:45:26,381 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception running postOpenDeployTasks; region=337038b50e467fbd6b031f278bbd9c22
> org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Interrupted
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.updateRegionLocation(MetaEditor.java:146)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1331)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:195)
> ...
> {code}
> So, we timed out trying to open the region, but rather than closing the
> region because the .META. edit failed, we missed seeing the
> InterruptedException and the open came across as successful.
> Here is the suggested fix:
> {code}
> diff --git a/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java b/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> index 7bf680d..2b0078c 100644
> --- a/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> +++ b/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> @@ -339,7 +339,7 @@ public class MetaReader {
>      get.addFamily(HConstants.CATALOG_FAMILY);
>      byte [] meta = getCatalogRegionNameForRegion(regionName);
>      Result r = catalogTracker.waitForMetaServerConnectionDefault().get(meta, get);
> -    if(r == null || r.isEmpty()) {
> +    if (r == null || r.isEmpty()) {
>        return null;
>      }
>      return metaRowToRegionPair(r);
> {code}
> Let me try it.
> Without this, hbck shows the region as being on server X while .META. still
> shows it as being on Y (its pre-balance server).
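
The failure mode in the description (an interrupted post-open task being treated as a successful open) can be sketched as follows. This is only an illustration under stated assumptions, not HBase code: `OpenOutcomeSketch`, `PostOpenTask`, and `runPostOpen` are hypothetical names, and the task stands in for work such as the catalog edit. The point is that the caller reports success only when the worker explicitly recorded it, so an interrupt or any other exception counts as a failed open.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class OpenOutcomeSketch {
  /** Hypothetical stand-in for the post-open work (e.g. a catalog edit). */
  interface PostOpenTask {
    void run() throws Exception;
  }

  /**
   * Runs the task on a worker thread, waiting up to timeoutMs (must be > 0).
   * Returns true only if the task ran to completion without throwing.
   */
  static boolean runPostOpen(PostOpenTask task, long timeoutMs)
      throws InterruptedException {
    AtomicBoolean succeeded = new AtomicBoolean(false);
    Thread worker = new Thread(() -> {
      try {
        task.run();
        succeeded.set(true);
      } catch (Exception e) {
        // Includes InterruptedException: leave the flag false so the
        // caller treats this as a failed open.
      }
    }, "post-open-sketch");
    worker.start();
    worker.join(timeoutMs);
    if (worker.isAlive()) {
      // Timed out: interrupt the worker and wait for it to exit.
      worker.interrupt();
      worker.join();
    }
    return succeeded.get();
  }

  public static void main(String[] args) throws InterruptedException {
    boolean ok = runPostOpen(() -> { }, 1000);
    boolean timedOut = runPostOpen(() -> Thread.sleep(60_000), 50);
    System.out.println("fast task succeeded: " + ok);
    System.out.println("slow task succeeded: " + timedOut);
  }
}
```

In this sketch a caller would move the region to the opened state only when `runPostOpen` returns true, and would close the region otherwise. A task that swallows the interrupt and then completes normally would still be reported as successful, so the task itself has to propagate interruption.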