[ https://issues.apache.org/jira/browse/HBASE-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen Yuan Jiang updated HBASE-18458:
---------------------------------------
Description:
TestRegionServerHostname passes in branch-1; however, it always fails
locally. Each test always passes when run individually. The region server (RS)
failing to start only in certain combinations of tests suggests a resource leak.
{code}
Running org.apache.hadoop.hbase.regionserver.TestRegionServerHostname
Tests run: 4, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 46.042 sec <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionServerHostname
testRegionServerHostnameReportedToMaster(org.apache.hadoop.hbase.regionserver.TestRegionServerHostname) Time elapsed: 30.095 sec <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30000 milliseconds
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:221)
    at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:445)
    at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:225)
    at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:94)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1072)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1028)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:900)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:894)
    at org.apache.hadoop.hbase.regionserver.TestRegionServerHostname.testRegionServerHostnameReportedToMaster(TestRegionServerHostname.java:158)
{code}
When {{testRegionServerHostnameReportedToMaster}} is run alone or with another
newly added test, it passes without a problem.
When {{testRegionServerHostnameReportedToMaster}} is run together with
{{testInvalidRegionServerHostnameAbortsServer}} in the same test suite
{{TestRegionServerHostname}}, the region server fails to start:
{noformat}
2017-07-25 15:34:24,132 FATAL [RS:0;192.168.1.7:64317] regionserver.HRegionServer(2182): ABORTING region server 192.168.1.7,64317,1501022063917: Unhandled: Failed suppression of fs shutdown hook: org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@668e0f60
java.lang.RuntimeException: Failed suppression of fs shutdown hook: org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@668e0f60
    at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:204)
    at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:84)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:940)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:307)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
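As an illustration only, the following self-contained sketch shows one way a "failed suppression" of a shutdown hook can occur when two tests share a JVM: a JVM-wide hook that has already been de-registered cannot be de-registered again. It uses only the JDK's {{Runtime.addShutdownHook}} and {{Runtime.removeShutdownHook}}. The class {{ShutdownHookDemo}}, the {{SHARED_HOOK}} field and the {{suppress}} helper are hypothetical names, not HBase or Hadoop code, and nothing here is confirmed to be the actual cause of the abort above.

```java
public class ShutdownHookDemo {
    /** Hypothetical stand-in for a JVM-wide hook (for example, a file system client finalizer). */
    static final Thread SHARED_HOOK = new Thread(() -> { });

    /**
     * Assumed "suppress or abort" step: de-register the hook and fail loudly
     * if the JVM reports that it was not registered.
     */
    static void suppress(Thread hook) {
        // removeShutdownHook returns false when the hook is not currently registered.
        if (!Runtime.getRuntime().removeShutdownHook(hook)) {
            throw new RuntimeException("Failed suppression of shutdown hook: " + hook);
        }
    }

    /** Returns {firstSucceeded, secondSucceeded} for two suppressions in the same JVM. */
    static boolean[] suppressTwice() {
        Runtime.getRuntime().addShutdownHook(SHARED_HOOK);
        boolean[] results = new boolean[2];
        for (int i = 0; i < 2; i++) {
            try {
                suppress(SHARED_HOOK);
                results[i] = true;
            } catch (RuntimeException e) {
                // The second attempt lands here: the hook is already gone.
                results[i] = false;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        boolean[] r = suppressTwice();
        System.out.println("first=" + r[0] + " second=" + r[1]);
    }
}
```

In this sketch the first suppression succeeds and the second throws, which is the kind of order-dependent behaviour that only shows up when several servers or tests run in one process.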
HBASE-17922 addressed a similar issue in Hadoop 3. I think that change is more
robust than what is in branch-1 right now. Porting it to branch-1
(with small modifications due to code differences between branch-1 and branch-2)
is a good idea.
was:
The
{code}
Running org.apache.hadoop.hbase.regionserver.TestRegionServerHostname
Tests run: 4, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 46.042 sec <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionServerHostname
testRegionServerHostnameReportedToMaster(org.apache.hadoop.hbase.regionserver.TestRegionServerHostname) Time elapsed: 30.095 sec <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 30000 milliseconds
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:221)
    at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:445)
    at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:225)
    at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:94)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1072)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1028)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:900)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:894)
    at org.apache.hadoop.hbase.regionserver.TestRegionServerHostname.testRegionServerHostnameReportedToMaster(TestRegionServerHostname.java:158)
{code}
> Refactor TestRegionServerHostname to make it robust (Port HBASE-17922)
> ----------------------------------------------------------------------
>
> Key: HBASE-18458
> URL: https://issues.apache.org/jira/browse/HBASE-18458
> Project: HBase
> Issue Type: Sub-task
> Components: hadoop3
> Affects Versions: 2.0.0
> Reporter: Stephen Yuan Jiang
> Assignee: Stephen Yuan Jiang
> Fix For: 3.0.0, 2.0.0-alpha-2
>