[
https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363464#comment-16363464
]
stack commented on HBASE-19965:
-------------------------------
Here is for the test that timed out, build 314:
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.client.TestAsyncTableAdminApi
-------------------------------------------------------------------------------
Tests run: 30, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 574.271 s <<<
FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncTableAdminApi
org.apache.hadoop.hbase.client.TestAsyncTableAdminApi Time elapsed: 8.443 s
<<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
org.apache.hadoop.hbase.client.TestAsyncTableAdminApi Time elapsed: 8.473 s
<<< ERROR!
java.lang.Exception: Appears to be stuck in thread DataXceiver for client
DFSClient_NONMAPREDUCE_1381247601_23 at /127.0.0.1:40966 [Receiving block
BP-1735548202-172.17.0.2-1518565636532:blk_1073741829_1005]
Parameterized there are about 29 tests. None takes a particularly long time:
{code}
1 2018-02-13 23:47:40,557 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableWithEmptyRowInTheSplitKeys[0]
Thread=302, OpenFileDescriptor=1612, MaxFileDescriptor=1048576,
SystemLoadAverage=2007, ProcessCount=17, AvailableMemoryM B=21888
2 2018-02-13 23:47:40,659 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testDeleteTable[0] Thread=305,
OpenFileDescriptor=1614, MaxFileDescriptor=1048576, SystemLoadAverage=1982,
ProcessCount=17, AvailableMemoryMB=21890
3 2018-02-13 23:47:51,503 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testDisableAndEnableTables[0] Thread=320,
OpenFileDescriptor=1616, MaxFileDescriptor=1048576, SystemLoadAverage=1827,
ProcessCount=17, AvailableMemoryMB=21940
4 2018-02-13 23:48:32,308 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTable[0] Thread=344,
OpenFileDescriptor=1589, MaxFileDescriptor=1048576, SystemLoadAverage=1798,
ProcessCount=17, AvailableMemoryMB=22132
5 2018-02-13 23:48:41,898 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableWithRegions[0] Thread=344,
OpenFileDescriptor=1586, MaxFileDescriptor=1048576, SystemLoadAverage=1867,
ProcessCount=17, AvailableMemoryMB=21816
6 2018-02-13 23:49:23,348 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testIsTableEnabledAndDisabled[0] Thread=435,
OpenFileDescriptor=1571, MaxFileDescriptor=1048576, SystemLoadAverage=1732,
ProcessCount=17, AvailableMemoryMB=21124
7 2018-02-13 23:49:36,012 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testListTables[0] Thread=441,
OpenFileDescriptor=1577, MaxFileDescriptor=1048576, SystemLoadAverage=1809,
ProcessCount=17, AvailableMemoryMB=20713
8 2018-02-13 23:50:08,285 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testTruncateTablePreservingSplits[0] Thread=375,
OpenFileDescriptor=1567, MaxFileDescriptor=1048576, SystemLoadAverage=1902,
ProcessCount=17, AvailableMemoryMB=20604
9 2018-02-13 23:50:26,969 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableNumberOfRegions[0] Thread=373,
OpenFileDescriptor=1590, MaxFileDescriptor=1048576, SystemLoadAverage=1889,
ProcessCount=17, AvailableMemoryMB=20979
10 2018-02-13 23:51:18,514 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testDisableAndEnableTable[0] Thread=439,
OpenFileDescriptor=1569, MaxFileDescriptor=1048576, SystemLoadAverage=1876,
ProcessCount=17, AvailableMemoryMB=20712
11 2018-02-13 23:51:40,787 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testEnableTableRetainAssignment[0] Thread=430,
OpenFileDescriptor=1575, MaxFileDescriptor=1048576, SystemLoadAverage=1754,
ProcessCount=17, AvailableMemoryMB=20766
12 2018-02-13 23:51:59,198 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testTruncateTable[0] Thread=473,
OpenFileDescriptor=1578, MaxFileDescriptor=1048576, SystemLoadAverage=1826,
ProcessCount=17, AvailableMemoryMB=21183
13 2018-02-13 23:52:17,821 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testGetTableDescriptor[0] Thread=446,
OpenFileDescriptor=1576, MaxFileDescriptor=1048576, SystemLoadAverage=1827,
ProcessCount=17, AvailableMemoryMB=21732
14 2018-02-13 23:52:27,280 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testTableExist[0] Thread=446,
OpenFileDescriptor=1576, MaxFileDescriptor=1048576, SystemLoadAverage=1855,
ProcessCount=17, AvailableMemoryMB=21342
15 2018-02-13 23:52:38,715 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableWithOnlyEmptyStartRow[0]
Thread=459, OpenFileDescriptor=1581, MaxFileDescriptor=1048576,
SystemLoadAverage=1814, ProcessCount=17, AvailableMemoryMB=205 98
16 2018-02-13 23:52:38,764 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableWithEmptyRowInTheSplitKeys[1]
Thread=459, OpenFileDescriptor=1581, MaxFileDescriptor=1048576,
SystemLoadAverage=1814, ProcessCount=17, AvailableMemoryM B=20595
17 2018-02-13 23:52:38,813 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testDeleteTable[1] Thread=459,
OpenFileDescriptor=1581, MaxFileDescriptor=1048576, SystemLoadAverage=1814,
ProcessCount=17, AvailableMemoryMB=20594
18 2018-02-13 23:52:50,184 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testDisableAndEnableTables[1] Thread=396,
OpenFileDescriptor=1580, MaxFileDescriptor=1048576, SystemLoadAverage=1760,
ProcessCount=19, AvailableMemoryMB=20454
19 2018-02-13 23:53:25,670 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTable[1] Thread=370,
OpenFileDescriptor=1574, MaxFileDescriptor=1048576, SystemLoadAverage=1660,
ProcessCount=17, AvailableMemoryMB=20662
20 2018-02-13 23:53:37,265 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableWithRegions[1] Thread=366,
OpenFileDescriptor=1576, MaxFileDescriptor=1048576, SystemLoadAverage=1664,
ProcessCount=17, AvailableMemoryMB=22533
21 2018-02-13 23:54:10,781 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testIsTableEnabledAndDisabled[1] Thread=456,
OpenFileDescriptor=1573, MaxFileDescriptor=1048576, SystemLoadAverage=1792,
ProcessCount=17, AvailableMemoryMB=19696
22 2018-02-13 23:54:22,740 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testListTables[1] Thread=455,
OpenFileDescriptor=1577, MaxFileDescriptor=1048576, SystemLoadAverage=1648,
ProcessCount=17, AvailableMemoryMB=19641
23 2018-02-13 23:54:56,100 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testTruncateTablePreservingSplits[1] Thread=387,
OpenFileDescriptor=1567, MaxFileDescriptor=1048576, SystemLoadAverage=1639,
ProcessCount=17, AvailableMemoryMB=20829
24 2018-02-13 23:55:15,611 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testCreateTableNumberOfRegions[1] Thread=383,
OpenFileDescriptor=1596, MaxFileDescriptor=1048576, SystemLoadAverage=1636,
ProcessCount=20, AvailableMemoryMB=20761
25 2018-02-13 23:55:54,008 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testDisableAndEnableTable[1] Thread=475,
OpenFileDescriptor=1569, MaxFileDescriptor=1048576, SystemLoadAverage=1807,
ProcessCount=18, AvailableMemoryMB=22096
26 2018-02-13 23:56:12,884 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testEnableTableRetainAssignment[1] Thread=454,
OpenFileDescriptor=1573, MaxFileDescriptor=1048576, SystemLoadAverage=1775,
ProcessCount=17, AvailableMemoryMB=20974
27 2018-02-13 23:56:35,923 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testTruncateTable[1] Thread=485,
OpenFileDescriptor=1575, MaxFileDescriptor=1048576, SystemLoadAverage=1652,
ProcessCount=17, AvailableMemoryMB=20504
28 2018-02-13 23:56:52,508 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testGetTableDescriptor[1] Thread=462,
OpenFileDescriptor=1586, MaxFileDescriptor=1048576, SystemLoadAverage=1705,
ProcessCount=17, AvailableMemoryMB=20830
29 2018-02-13 23:57:06,321 INFO [Time-limited test]
hbase.ResourceChecker(148): before:
client.TestAsyncTableAdminApi#testTableExist[1] Thread=456,
OpenFileDescriptor=1569, MaxFileDescriptor=1048576, SystemLoadAverage=1984,
ProcessCount=17, AvailableMemoryMB=20508
~
{code}
Could work on startup times. Would help. Let me split out a third class.
> Fix flaky TestAsyncRegionAdminApi
> ---------------------------------
>
> Key: HBASE-19965
> URL: https://issues.apache.org/jira/browse/HBASE-19965
> Project: HBase
> Issue Type: Sub-task
> Reporter: Guanghao Zhang
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19965.branch-2.001.patch
>
>
> See
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/284/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncRegionAdminApi/testMergeRegions_0_/]
>
> java.lang.AssertionError: expected:<2> but was:<3> at
> org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:359)
>
> Merge regions not work. The table still have 3 regions after the
> MergeRegionsProcedure finished.
> The master start balance region 9e2773ba1efba79a2defa276e9a26ed4. But because
> the MergeRegionsProcedure pid=138 start work first, so the balance need wait
> for the lock. But after merge regions finished, the MoveRegionProcedure
> pid=139 start work and assign 9e2773ba1efba79a2defa276e9a26ed4 to a new
> region server. This is not right. The MoveRegionProcedure should skip to
> assign a region which was marked as offline. Or we should clear the merged
> regions' procedure when MergeRegionsProcedure finished.
>
> Logs:
> 2018-02-08 16:24:44,608 INFO [master/cd4730e3eae2:0.Chore.1]
> master.HMaster(1454): balance
> hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4.,
> source=cd4730e3eae2,39077,1518106776411,
> destination=cd4730e3eae2,40578,1518106776318
> 2018-02-08 16:24:44,608 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=37885]
> procedure2.ProcedureExecutor(868): Stored pid=138,
> state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE; MergeTableRegionsProcedure
> table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4,
> 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false
> ......
> 2018-02-08 16:24:50,111 INFO [PEWorker-13]
> procedure2.ProcedureExecutor(1249): Finished pid=138, state=SUCCESS;
> MergeTableRegionsProcedure table=testMergeRegions,
> regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479],
> forcibly=false in 5.5710sec
> 2018-02-08 16:24:50,113 INFO [PEWorker-13]
> procedure.MasterProcedureScheduler(813): pid=139,
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4.,
> source=cd4730e3eae2,39077,1518106776411,
> destination=cd4730e3eae2,40578,1518106776318 testMergeRegions
> testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)