[ 
https://issues.apache.org/jira/browse/HBASE-18167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057039#comment-16057039
 ] 

Hadoop QA commented on HBASE-18167:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 33m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 25s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
20s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s 
{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
31s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
56s {color} | {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 32s 
{color} | {color:red} hbase-client in branch-1 has 5 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 31s 
{color} | {color:red} hbase-server in branch-1 has 18 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s 
{color} | {color:green} branch-1 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
31m 16s {color} | {color:green} The patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 32s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 148m 13s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
43s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 250m 30s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.regionserver.TestCompactionInDeadRegionServer |
|   | hadoop.hbase.regionserver.TestRSKilledWhenInitializing |
|   | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
|   | hadoop.hbase.master.TestMasterBalanceThrottling |
|   | hadoop.hbase.TestRegionRebalancing |
|   | hadoop.hbase.TestPartialResultsFromClientSide |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:395d9a0 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12873777/HBASE-18167-branch-1-V2.patch
 |
| JIRA Issue | HBASE-18167 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux bdc3c7cf1fe6 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1 / 532e0dd |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7266/artifact/patchprocess/branch-findbugs-hbase-client-warnings.html
 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7266/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7266/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/7266/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7266/testReport/ |
| modules | C: hbase-client hbase-server U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7266/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> OfflineMetaRepair tool may cause HMaster abort always
> -----------------------------------------------------
>
>                 Key: HBASE-18167
>                 URL: https://issues.apache.org/jira/browse/HBASE-18167
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.4.0, 1.3.1, 1.3.2
>            Reporter: Pankaj Kumar
>            Assignee: Pankaj Kumar
>            Priority: Critical
>             Fix For: 1.4.0, 1.3.2
>
>         Attachments: HBASE-18167.branch-1.3.V2.patch, 
> HBASE-18167-branch-1.patch, HBASE-18167-branch-1-V2.patch
>
>
> In the production environment, we met a weird scenario where some Meta table 
> HFile blocks were missing due to some reason.
> To recover the environment we tried to rebuild the meta using 
> OfflineMetaRepair tool and restart the cluster, but HMaster couldn't finish 
> it's initialization. It always timed out as namespace table region was never 
> assigned.
> Steps to reproduce
> ==================
> 1. Assign meta table region to HMaster (it can be on any RS, just to 
> reproduce the  scenario)
> {noformat}
>       <property>
>             <name>hbase.balancer.tablesOnMaster</name>
>             <value>hbase:meta</value>
>         </property>
> {noformat}
> 2. Start HMaster and RegionServer
> 2. Create two namespace, say "ns1" & "ns2"
> 3. Create two tables "ns1:t1' & "ns2:t1'
> 4. flush 'hbase:meta"
> 5. Stop HMaster (graceful shutdown)
> 6. Kill -9 RegionServer (Abnormal shutdown)
> 7. Run OfflineMetaRepair as follows,
> {noformat}
>       hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair -fix
> {noformat}
> 8. Restart HMaster and RegionServer
> 9. HMaster will never be able to finish its initialization and abort always 
> with below message,
> {code}
> 2017-06-06 15:11:07,582 FATAL [Hostname:16000.activeMasterManager] 
> master.HMaster: Unhandled exception. Starting shutdown.
> java.io.IOException: Timedout 120000ms waiting for namespace table to be 
> assigned
>         at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
>         at 
> org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1054)
>         at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
>         at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:199)
>         at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1871)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> Root cause
> ==========
> 1. During HM start up AM assumes that it's a failover scenario based on the 
> existing old WAL files, so SSH/SCP will split WAL files and assign the 
> holding regions. 
> 2. During SSH/SCP it retrieves the server holding regions from meta/AM's 
> in-memory-state, but meta only had "regioninfo" entry (as already rebuild by 
> OfflineMetaRepair). So empty region will be returned and it wont trigger any 
> assignment.
> 3. HMaster which is waiting for namespace table to be assigned will timeout 
> and abort always.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to