[
https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698037#comment-14698037
]
Hadoop QA commented on HBASE-14207:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12750536/HBASE-14207-0.98-V2.patch
against 0.98 branch at commit 45aafb25b7911b5917ab47133e3e4268806e4c91.
ATTACHMENT ID: 12750536
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include
any new or modified tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the
total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated
23 warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the
total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn post-site goal
to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/15106//testReport/
Release Findbugs (version 2.0.3) warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/15106//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors:
https://builds.apache.org/job/PreCommit-HBASE-Build/15106//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/15106//artifact/patchprocess/patchJavadocWarnings.txt
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/15106//console
This message is automatically generated.
> Region was hijacked and remained in transition when RS failed to open a
> region and later regionplan changed to new RS on retry
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-14207
> URL: https://issues.apache.org/jira/browse/HBASE-14207
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.98.6
> Reporter: Pankaj Kumar
> Assignee: Pankaj Kumar
> Priority: Critical
> Fix For: 0.98.15
>
> Attachments: HBASE-14207-0.98-V2.patch, HBASE-14207-0.98.patch
>
>
> On production environment, following events happened
> 1. Master is trying to assign a region to RS, but due to
> KeeperException$SessionExpiredException RS failed to open the region.
> In RS log, saw multiple WARN log related to
> KeeperException$SessionExpiredException
> > KeeperErrorCode = Session expired for
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> > Unable to get data of znode
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> 2. Master retried to assign the region to same RS, but RS again failed.
> 3. On second retry new plan formed and this time plan destination (RS) is
> different, so master send the request to new RS to open the region. But new
> RS failed to open the region as there was server mismatch in ZNODE than the
> expected current server name.
> Logs Snippet:
> {noformat}
> HM
> 2015-07-14 03:50:29,759 | INFO | master:T101PC03VM13:21300 | Processing
> 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE |
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
> 2015-07-14 03:50:29,759 | INFO | master:T101PC03VM13:21300 | Transitioned
> {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679,
> server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN,
> ts=1436817029759, server=T101PC03VM13,21302,1436816690692} |
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:29,760 | INFO | master:T101PC03VM13:21300 | Processed
> region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on
> server: T101PC03VM13,21302,1436816690692 |
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
> 2015-07-14 03:50:29,800 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to
> T101PC03VM13,21302,1436816690692 |
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:29,801 | WARN |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1
> of 10 |
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:29,802 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to
> the same failed server. |
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
> 2015-07-14 03:50:31,804 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to
> T101PC03VM13,21302,1436816690692 |
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,806 | WARN |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2
> of 10 |
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:31,807 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned
> {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804,
> server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b
> state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} |
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:31,807 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to
> T101PC03VM14,21302,1436816997967 |
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,807 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned
> {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817031807,
> server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b
> state=PENDING_OPEN, ts=1436817031807,
> server=T101PC03VM14,21302,1436816997967} |
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:51:09,501 | INFO |
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-4 | Skip assigning region in
> transition on other server{08f1935d652e5dbdac09b423b8f9401b
> state=PENDING_OPEN, ts=1436817031807,
> server=T101PC03VM14,21302,1436816997967} |
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:250)
> {noformat}
> {noformat}
> RS - T101PC03VM14
> 2015-07-14 03:50:31,809 | INFO |
> PriorityRpcServer.handler=2,queue=0,port=21302 | Open
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. |
> org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:3671)
> 2015-07-14 03:50:31,830 | WARN | RS_OPEN_REGION-T101PC03VM14:21302-2 |
> regionserver:21302-0xe4e88f6f1b70002,
> quorum=t101pc03vm12:24002,t101pc03vm13:24002,t101pc03vm14:24002,
> baseZNode=/hbase Attempt to transition the unassigned node for
> 08f1935d652e5dbdac09b423b8f9401b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING failed, the server that tried to transition was
> T101PC03VM14,21302,1436816997967 not the expected
> T101PC03VM13,21302,1436816690692 |
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:875)
> 2015-07-14 03:50:31,830 | WARN | RS_OPEN_REGION-T101PC03VM14:21302-2 |
> Failed transition from OFFLINE to OPENING for
> region=08f1935d652e5dbdac09b423b8f9401b |
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.transitionZookeeperOfflineToOpening(OpenRegionHandler.java:539)
> 2015-07-14 03:50:31,831 | WARN | RS_OPEN_REGION-T101PC03VM14:21302-2 |
> Region was hijacked? Opening cancelled for
> encodedName=08f1935d652e5dbdac09b423b8f9401b |
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:132)
> 2015-07-14 03:50:31,831 | INFO | RS_OPEN_REGION-T101PC03VM14:21302-2 |
> Opening of region {ENCODED => 08f1935d652e5dbdac09b423b8f9401b, NAME =>
> 'INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b.',
> STARTKEY => '', ENDKEY => '200'} failed, transitioning from OFFLINE to
> FAILED_OPEN in ZK, expecting version -1 |
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.tryTransitionFromOfflineToFailedOpen(OpenRegionHandler.java:436)
> 2015-07-14 03:50:31,834 | WARN | RS_OPEN_REGION-T101PC03VM14:21302-2 |
> regionserver:21302-0xe4e88f6f1b70002,
> quorum=t101pc03vm12:24002,t101pc03vm13:24002,t101pc03vm14:24002,
> baseZNode=/hbase Attempt to transition the unassigned node for
> 08f1935d652e5dbdac09b423b8f9401b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_FAILED_OPEN failed, the server that tried to transition was
> T101PC03VM14,21302,1436816997967 not the expected
> T101PC03VM13,21302,1436816690692 |
> org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:875)
> 2015-07-14 03:50:31,834 | WARN | RS_OPEN_REGION-T101PC03VM14:21302-2 |
> Unable to mark region {ENCODED => 08f1935d652e5dbdac09b423b8f9401b, NAME =>
> 'INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b.',
> STARTKEY => '', ENDKEY => '200'} as FAILED_OPEN. It's likely that the master
> already timed out this open attempt, and thus another RS already has the
> region. |
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.tryTransitionFromOfflineToFailedOpen(OpenRegionHandler.java:444)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)