[
https://issues.apache.org/jira/browse/HBASE-19726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351197#comment-16351197
]
Hadoop QA commented on HBASE-19726:
-----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 59s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 4s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 16m 18s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}103m 34s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 43s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19726 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909044/19726.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux f54c041d6b59 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 8143d5afa4 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11367/testReport/ |
| Max. process+thread count | 5094 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11367/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
> Failed to start HMaster due to infinite retrying on meta assign
> ---------------------------------------------------------------
>
> Key: HBASE-19726
> URL: https://issues.apache.org/jira/browse/HBASE-19726
> Project: HBase
> Issue Type: Bug
> Reporter: Duo Zhang
> Assignee: stack
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: 19726.patch
>
>
> This is what I got at first: an exception when trying to write to meta
> before meta had been brought online.
> {noformat}
> 2018-01-07,21:03:14,389 INFO org.apache.hadoop.hbase.master.HMaster: Running RecoverMetaProcedure to ensure proper hbase:meta deploy.
> 2018-01-07,21:03:14,637 INFO org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure: Start pid=1, state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure failedMetaServer=null, splitWal=true
> 2018-01-07,21:03:14,645 INFO org.apache.hadoop.hbase.master.MasterWalManager: Log folder hdfs://c402tst-community/hbase/c402tst-community/WALs/c4-hadoop-tst-st27.bj,38900,1515330173896 belongs to an existing region server
> 2018-01-07,21:03:14,646 INFO org.apache.hadoop.hbase.master.MasterWalManager: Log folder hdfs://c402tst-community/hbase/c402tst-community/WALs/c4-hadoop-tst-st29.bj,38900,1515330177232 belongs to an existing region server
> 2018-01-07,21:03:14,648 INFO org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure: pid=1, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure failedMetaServer=null, splitWal=true; Retaining meta assignment to server=null
> 2018-01-07,21:03:14,653 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740}]
> 2018-01-07,21:03:14,660 INFO org.apache.hadoop.hbase.master.procedure.MasterProcedureScheduler: pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740 hbase:meta hbase:meta,,1.1588230740
> 2018-01-07,21:03:14,663 INFO org.apache.hadoop.hbase.master.assignment.AssignProcedure: Start pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740; rit=OFFLINE, location=null; forceNewPlan=false, retain=false
> 2018-01-07,21:03:14,831 INFO org.apache.hadoop.hbase.zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in ZooKeeper as c4-hadoop-tst-st27.bj,38900,1515330173896
> 2018-01-07,21:03:14,841 INFO org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Dispatch pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, region=1588230740; rit=OPENING, location=c4-hadoop-tst-st27.bj,38900,1515330173896
> 2018-01-07,21:03:14,992 INFO org.apache.hadoop.hbase.master.procedure.RSProcedureDispatcher: Using procedure batch rpc execution for serverName=c4-hadoop-tst-st27.bj,38900,1515330173896 version=3145728
> 2018-01-07,21:03:15,593 ERROR org.apache.hadoop.hbase.client.AsyncRequestFutureImpl: Cannot get replica 0 location for {"totalColumns":1,"row":"hbase:meta","families":{"table":[{"qualifier":"state","vlen":2,"tag":[],"timestamp":1515330195514}]},"ts":1515330195514}
> 2018-01-07,21:03:15,594 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_FINISH; AssignProcedure table=hbase:meta, region=1588230740; rit=OPEN, location=c4-hadoop-tst-st27.bj,38900,1515330173896
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IOException: 1 time, servers with issues: null
>     at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54)
>     at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1250)
>     at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:457)
>     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:570)
>     at org.apache.hadoop.hbase.MetaTableAccessor.put(MetaTableAccessor.java:1450)
>     at org.apache.hadoop.hbase.MetaTableAccessor.putToMetaTable(MetaTableAccessor.java:1439)
>     at org.apache.hadoop.hbase.MetaTableAccessor.updateTableState(MetaTableAccessor.java:1785)
>     at org.apache.hadoop.hbase.MetaTableAccessor.updateTableState(MetaTableAccessor.java:1151)
>     at org.apache.hadoop.hbase.master.TableStateManager.udpateMetaState(TableStateManager.java:183)
>     at org.apache.hadoop.hbase.master.TableStateManager.setTableState(TableStateManager.java:69)
>     at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsOpened(AssignmentManager.java:1515)
>     at org.apache.hadoop.hbase.master.assignment.AssignProcedure.finishTransition(AssignProcedure.java:271)
>     at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:320)
>     at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:86)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735)
> {noformat}
> After that, the following exception repeated indefinitely:
> {noformat}
> 2018-01-07,21:03:15,596 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=2, ppid=1, state=RUNNABLE:REGION_TRANSITION_FINISH; AssignProcedure table=hbase:meta, region=1588230740; rit=OPEN, location=c4-hadoop-tst-st27.bj,38900,1515330173896
> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [OFFLINE, CLOSED, SPLITTING, SPLIT, OPENING, FAILED_OPEN] so could move to OPEN but current state=OPEN
>     at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:155)
>     at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsOpened(AssignmentManager.java:1513)
>     at org.apache.hadoop.hbase.master.assignment.AssignProcedure.finishTransition(AssignProcedure.java:271)
>     at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:320)
>     at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:86)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1456)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1225)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1735)
> {noformat}
> This is a bit strange. Since we are assigning meta, why do we need to write
> the state to the meta table?
> I checked the code a bit.
> In AssignProcedure.finishTransition, we do this:
> {code}
> env.getAssignmentManager().markRegionAsOpened(regionNode);
> {code}
> And in AssignmentManager.markRegionAsOpened, we do this:
> {code}
> if (isMetaRegion(hri)) {
>   master.getTableStateManager().setTableState(TableName.META_TABLE_NAME,
>     TableState.State.ENABLED);
>   setMetaInitialized(hri, true);
> }
> {code}
> And in TableStateManager.setTableState, we call udpateMetaState (a typo in
> the method name...) to write something to meta.
> I think this will lead to a deadlock? I do not think we need to put the
> state of the meta table into the meta table, since it is always enabled...
> But I do not know why it worked when I tried to restart the cluster... Maybe
> we do not enter this code path for a non-fresh cluster?
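
The failure cycle described in the quoted report can be sketched as a small stand-alone toy model. This is only an illustration of the reporter's hypothesis, not HBase code: every class, field, and method name below (MetaAssignSketch, transitionToOpen, writeTableStateToMeta, skipMetaStateWrite) is hypothetical, and the exception types are plain JDK stand-ins for RetriesExhaustedWithDetailsException and UnexpectedStateException. It assumes the write to meta can only succeed once meta has been marked initialized, which is the step that comes after the write.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy model of the retry cycle; all names are hypothetical, not HBase API. */
class MetaAssignSketch {
    enum State { OFFLINE, OPENING, OPEN }

    // In-memory region state of hbase:meta as tracked by the master.
    private State metaState = State.OPENING;
    // Assumption: writes to meta only work once this flag is set.
    private boolean metaInitialized = false;
    // When true, the table-state write is skipped for meta (the reporter's suggestion).
    private final boolean skipMetaStateWrite;

    MetaAssignSketch(boolean skipMetaStateWrite) {
        this.skipMetaStateWrite = skipMetaStateWrite;
    }

    // Stand-in for the region state transition: OPEN -> OPEN is rejected.
    private void transitionToOpen() {
        if (metaState == State.OPEN) {
            throw new IllegalStateException("could move to OPEN but current state=OPEN");
        }
        metaState = State.OPEN;
    }

    // Stand-in for persisting the table state into hbase:meta.
    private void writeTableStateToMeta() throws IOException {
        if (!metaInitialized) {
            throw new IOException("Cannot get replica 0 location");
        }
    }

    // Stand-in for the markRegionAsOpened step quoted above.
    void markMetaAsOpened() throws IOException {
        transitionToOpen();              // step 1: succeeds once, not idempotent
        if (!skipMetaStateWrite) {
            writeTableStateToMeta();     // step 2: needs the flag set in step 3
        }
        metaInitialized = true;          // step 3: never reached if step 2 throws
    }

    // Retries the finish step up to maxAttempts times and records each outcome.
    static List<String> run(boolean skipMetaStateWrite, int maxAttempts) {
        MetaAssignSketch sketch = new MetaAssignSketch(skipMetaStateWrite);
        List<String> outcomes = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                sketch.markMetaAsOpened();
                outcomes.add("ok");
                break;
            } catch (IOException | IllegalStateException e) {
                outcomes.add(e.getClass().getSimpleName());
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(run(false, 4));
        System.out.println(run(true, 4));
    }
}
```

With the write enabled, the first attempt fails in the write step and every later attempt fails in the state transition, which matches the order of the two exceptions in the logs; the attempt cap exists only so the sketch terminates. With the write skipped, the first attempt succeeds. Whether skipping the write is the right fix for HBase itself is a separate question that this sketch does not answer.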