[
https://issues.apache.org/jira/browse/HBASE-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376560#comment-16376560
]
Ted Yu commented on HBASE-20081:
--------------------------------
The test failure happened in build #218:
https://builds.apache.org/job/HBase-2.0-hadoop3-tests/org.apache.hbase$hbase-server/218/testReport/junit/org.apache.hadoop.hbase.master.procedure/TestDisableTableProcedure/org_apache_hadoop_hbase_master_procedure_TestDisableTableProcedure/
However, the archive of the test output was truncated:
{code}
2018-02-25 18:12:02,398 DEBUG [RS_OPEN_REGION-regionserver/asf912:0-1]
regionserver.FlushLargeStoresPolicy(61): No
hbase.hregion.percolumnfamilyflush.size.lower.bound s
...[truncated 1772313 bytes]...
...[truncated 10750 chars]...
{code}
The second truncation occurred right above the point where RS-EventLoopGroup-3-9
was shown, so a lot of relevant information was not recorded.
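To gauge how much of an archived test output is missing, the truncation markers can be located and summed. The following is only a minimal sketch: it assumes the markers always have the form seen in the excerpt above ("...[truncated N bytes]..." or "...[truncated N chars]..."), and the function name find_truncations is made up for illustration.

```python
import re

# Assumption: markers look exactly like the two seen in the archived output.
TRUNCATION_MARKER = re.compile(r"\.\.\.\[truncated (\d+) (bytes|chars)\]\.\.\.")


def find_truncations(log_text):
    """Return (line_number, size, unit) for each truncation marker found."""
    found = []
    for line_number, line in enumerate(log_text.splitlines(), start=1):
        for match in TRUNCATION_MARKER.finditer(line):
            found.append((line_number, int(match.group(1)), match.group(2)))
    return found


# Sample taken from the excerpt quoted in this comment.
sample = (
    "hbase.hregion.percolumnfamilyflush.size.lower.bound s\n"
    "...[truncated 1772313 bytes]...\n"
    "...[truncated 10750 chars]...\n"
)

for line_number, size, unit in find_truncations(sample):
    print(f"line {line_number}: {size} {unit} missing")
```

The line numbers show where in the saved output each gap sits, which helps when checking whether a given thread's messages fall inside a gap.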
{code}
java.io.IOException: connection is closed
at
org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:263)
at
org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:761)
{code}
The above was among the truncated logs for the test output.
Let me search for an option to avoid truncating the test output.
The test does not fail often.
I looped the test 20 times locally against hadoop 3 and every run passed. I am
doing another round of local test runs.
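For reference, a local loop of this kind can be scripted roughly as follows. This is a sketch rather than the exact procedure used here: the helper name run_repeatedly and the timeout handling are assumptions, and the Maven command in the trailing comment is only an assumed invocation. A timeout is included because the failure mode in this issue is a hang rather than an assertion failure.

```python
import subprocess
import sys


def run_repeatedly(command, iterations, timeout_seconds=None):
    """Run command `iterations` times; return 1-based indices of failed runs.

    A run counts as failed if it exits non-zero or exceeds timeout_seconds
    (a hung test would otherwise block the loop forever).
    """
    failed = []
    for i in range(1, iterations + 1):
        try:
            result = subprocess.run(
                command,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_seconds,
            )
            if result.returncode != 0:
                failed.append(i)
        except subprocess.TimeoutExpired:
            failed.append(i)
    return failed


# Assumed usage (command line is hypothetical, adjust to the local setup):
#   run_repeatedly(
#       ["mvn", "test", "-pl", "hbase-server",
#        "-Dtest=TestDisableTableProcedure", "-Dhadoop.profile=3.0"],
#       iterations=20, timeout_seconds=1800)
```

An empty result means every iteration passed; otherwise the returned indices identify which runs to inspect.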
> TestDisableTableProcedure sometimes hung in MiniHBaseCluster#waitUntilShutDown
> ------------------------------------------------------------------------------
>
> Key: HBASE-20081
> URL: https://issues.apache.org/jira/browse/HBASE-20081
> Project: HBase
> Issue Type: Test
> Reporter: Ted Yu
> Priority: Major
>
> https://builds.apache.org/job/HBase-2.0-hadoop3-tests/lastCompletedBuild/org.apache.hbase$hbase-server/testReport/org.apache.hadoop.hbase.master.procedure/TestDisableTableProcedure/org_apache_hadoop_hbase_master_procedure_TestDisableTableProcedure/
> was one recent occurrence.
> I noticed two things in test output:
> {code}
> 2018-02-25 18:12:45,053 WARN [Time-limited test-EventThread]
> master.RegionServerTracker(136): asf912.gq1.ygridcore.net,45649,1519582305777
> is not online or isn't known to the master.The latter could be caused by a
> DNS misconfiguration.
> {code}
> Since DNS misconfiguration was very unlikely on Apache Jenkins nodes, the
> above should not have been logged.
> {code}
> 2018-02-25 18:16:51,531 WARN [master/asf912:0.Chore.1]
> master.CatalogJanitor(127): Failed scan of catalog table
> java.io.IOException: connection is closed
> at
> org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:263)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:761)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:680)
> at
> org.apache.hadoop.hbase.MetaTableAccessor.scanMetaForTableRegions(MetaTableAccessor.java:675)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:188)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.getMergedRegionsAndSplitParents(CatalogJanitor.java:140)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:246)
> at
> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:119)
> at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
> {code}
> The above was possibly related to the lost region server.
> I searched the test output of a successful run, where neither of the above
> two can be seen.