[ https://issues.apache.org/jira/browse/HBASE-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13712027#comment-13712027 ]
stack commented on HBASE-8939:
------------------------------
I added a post-build task to the Apache builds that runs our zombie tracker from
./dev-support/test-patch.sh. It caught one just now:
https://builds.apache.org/job/HBase-TRUNK/4265/console
TestLogRollAbort won't shut down. It is a bit of a strange test in that it
kills HDFS out from under us and tries to ensure we don't lose edits. We are
stuck on a thread join. It looks like it has a timer of two minutes, but oddly
the test claims to have 'passed' early enough in the game:
Running org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 147.206 sec
Here is where we are bound up.
{code}
"pool-1-thread-1" prio=10 tid=0x7614b400 nid=0x62da in Object.wait() [0x7774f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x7fd2d268> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
at java.lang.Thread.join(Thread.java:1186)
- locked <0x7fd2d268> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
at java.lang.Thread.join(Thread.java:1239)
at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:242)
at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:427)
at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:495)
at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:742)
at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:711)
at org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort.tearDown(TestLogRollAbort.java:114)
{code}
But we are also stuck here in setup:
{code}
"LeaseChecker@DFSClient[clientName=DFSClient_1663452662, ugi=jenkins]:
java.lang.Throwable: for testing
at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.toString(DFSClient.java:1393)
at org.apache.hadoop.util.Daemon.<init>(Daemon.java:38)
at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.put(DFSClient.java:1306)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:716)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:476)
at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:361)
at org.apache.hadoop.hbase.HBaseTestingUtility.createRootDir(HBaseTestingUtility.java:773)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:645)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:627)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:575)
at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:562)
at org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort.setUp(TestLogRollAbort.java:102)
{code}
Both setup and shutdown were in progress when the thread dump was taken.
I'm going to disable this test for now so we get clean builds.
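For reference, the first dump shows tearDown parked in an untimed Thread.join (join() with no argument waits forever), so one stuck RegionServerThread is enough to hang the run. A minimal sketch of a bounded join is below. BoundedJoin and joinOrInterrupt are hypothetical names for illustration, not HBase API, and this is not what JVMClusterUtil.shutdown does in the trace above.

```java
/** Hypothetical helper, not part of HBase: join a thread with an upper bound. */
final class BoundedJoin {
  private BoundedJoin() {}

  /**
   * Waits up to timeoutMs for t to finish. If it is still alive, interrupts it
   * and waits once more for the same period.
   *
   * @return true if the thread terminated, false if it is still running
   */
  static boolean joinOrInterrupt(Thread t, long timeoutMs) {
    if (timeoutMs <= 0) {
      // Thread.join(0) means "wait forever", which is the hang we want to avoid.
      throw new IllegalArgumentException("timeoutMs must be positive");
    }
    try {
      t.join(timeoutMs);        // returns after the timeout even if t is still alive
      if (t.isAlive()) {
        t.interrupt();          // nudge the thread, then give it one more window
        t.join(timeoutMs);
      }
    } catch (InterruptedException e) {
      // Preserve the caller's interrupt status instead of swallowing it.
      Thread.currentThread().interrupt();
    }
    return !t.isAlive();
  }
}
```

A teardown could call joinOrInterrupt(rsThread, 30000) and fail the test, or log a thread dump, when it returns false. The 30-second figure is only an example.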
> Hanging unit tests
> ------------------
>
> Key: HBASE-8939
> URL: https://issues.apache.org/jira/browse/HBASE-8939
> Project: HBase
> Issue Type: Bug
> Components: test
> Reporter: stack
> Fix For: 0.95.2
>
> Attachments: 8939.txt
>
>
> We have hanging tests. Here are a few from this morning's review:
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh https://builds.apache.org/job/hbase-0.95-on-hadoop2/176/consoleText
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 3300k 0 3300k 0 0 508k 0 --:--:-- 0:00:06 --:--:-- 621k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
> {code}
> And...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/396/consoleText
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 779k 0 779k 0 0 538k 0 --:--:-- 0:00:01 --:--:-- 559k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide3
> {code}
> And...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-0.95/607/consoleText
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 445k 0 445k 0 0 490k 0 --:--:-- --:--:-- --:--:-- 522k
> Hanging test: Running org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer
> Hanging test: Running org.apache.hadoop.hbase.master.TestAssignmentManager
> Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck
> Hanging test: Running org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
> Hanging test: Running org.apache.hadoop.hbase.IntegrationTestDataIngestSlowDeterministic
> {code}
> and...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-0.95-Hadoop-2/607/consoleText
> % Total % Received % Xferd Average Speed Time Time Time Current
> Dload Upload Total Spent Left Speed
> 100 781k 0 781k 0 0 240k 0 --:--:-- 0:00:03 --:--:-- 244k
> Hanging test: Running org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
> Hanging test: Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
> {code}
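The "Hanging test" lines above suggest a simple heuristic: a test whose "Running <class>" line is never followed by its own "Tests run: ... Time elapsed" result line did not finish. Here is a rough Java sketch of that idea. It is an assumption about the approach, not a port of findHangingTest.sh; HangingTestFinder and findHanging are hypothetical names, and it assumes tests log sequentially in the surefire format shown in the comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: flag tests that started but never reported a result. */
final class HangingTestFinder {
  private static final String RUNNING = "Running ";
  private static final String RESULT = "Tests run:";

  private HangingTestFinder() {}

  /**
   * Scans console lines in order. A "Running X" line opens test X; a per-test
   * "Tests run: ... Time elapsed" line closes the most recently opened test.
   * Any test still open when the next one starts, or at end of log, is
   * reported as hanging.
   */
  static List<String> findHanging(List<String> consoleLines) {
    List<String> hanging = new ArrayList<>();
    String pending = null;
    for (String raw : consoleLines) {
      String line = raw.trim();
      if (line.startsWith(RUNNING)) {
        if (pending != null) {
          hanging.add(pending);   // previous test never printed a result
        }
        pending = line.substring(RUNNING.length()).trim();
      } else if (line.startsWith(RESULT) && line.contains("Time elapsed")) {
        // Only per-test result lines carry "Time elapsed"; the final build
        // summary does not, so it cannot close a pending test by accident.
        pending = null;
      }
    }
    if (pending != null) {
      hanging.add(pending);
    }
    return hanging;
  }
}
```

This would not flag a case like TestLogRollAbort above, which reported a result and still left a stuck JVM; catching that needs a check for leftover processes after the build. With forked parallel test runs the Running and result lines can interleave, so the pairing here would misattribute results.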