[ 
https://issues.apache.org/jira/browse/HBASE-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13712027#comment-13712027
 ] 

stack commented on HBASE-8939:
------------------------------

I added a post-build task to the Apache builds that runs our zombie tracker 
from ./dev-tools/test-patch.sh.  It caught one just now:

https://builds.apache.org/job/HBase-TRUNK/4265/console

TestLogRollAbort won't shut down.  It is a bit of a strange test in that it 
kills HDFS out from under us and tries to ensure we don't lose edits.  We are 
stuck on a thread join.  It looks like it has a timer of two minutes, but oddly 
the test claims to have 'passed' early enough in the game:

Running org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 147.206 sec
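
The 'Running' / 'Tests run:' pair above is the kind of signal a console scan can key off to spot a test that started but never reported back. Below is a minimal Java sketch of that idea. It assumes the forks do not interleave their output, so each 'Tests run:' line closes the most recent 'Running' line. The class and method names (HangingTestScan, findUnfinished) are hypothetical, and this is not claimed to be what our dev-support scripts actually do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HangingTestScan {
    private static final String RUNNING = "Running ";
    private static final String FINISHED = "Tests run:";

    /**
     * Returns test classes that printed "Running X" but were never followed
     * by a "Tests run:" line. Assumption: output is not interleaved, so a
     * "Tests run:" line always belongs to the most recent "Running" line.
     */
    public static List<String> findUnfinished(List<String> consoleLines) {
        List<String> hanging = new ArrayList<>();
        String pending = null; // test class started but not yet reported
        for (String raw : consoleLines) {
            String line = raw.trim();
            if (line.startsWith(RUNNING)) {
                if (pending != null) {
                    // A new test started before the previous one reported.
                    hanging.add(pending);
                }
                pending = line.substring(RUNNING.length()).trim();
            } else if (line.startsWith(FINISHED)) {
                pending = null; // the most recent test reported its result
            }
        }
        if (pending != null) {
            hanging.add(pending); // log ended with a test still open
        }
        return hanging;
    }

    public static void main(String[] args) {
        // Made-up console lines, for illustration only.
        List<String> console = Arrays.asList(
            "Running org.example.TestA",
            "Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.0 sec",
            "Running org.example.TestB");
        for (String name : findUnfinished(console)) {
            System.out.println("Hanging test: Running " + name);
        }
    }
}
```

Note that a scan like this would not flag TestLogRollAbort here, since it did print its 'Tests run:' line; the lingering JVM only shows up if you also look at live processes.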

Here is where we are bound up.

{code}
"pool-1-thread-1" prio=10 tid=0x7614b400 nid=0x62da in Object.wait() [0x7774f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x7fd2d268> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
        at java.lang.Thread.join(Thread.java:1186)
        - locked <0x7fd2d268> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
        at java.lang.Thread.join(Thread.java:1239)
        at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:242)
        at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:427)
        at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:495)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:742)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:711)
        at org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort.tearDown(TestLogRollAbort.java:114)
{code}
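
The join in the dump above appears to be an untimed Thread.join(), which waits for as long as the region server thread stays alive. For comparison, here is a minimal Java sketch of a bounded join that gives up after a timeout and interrupts the thread once before reporting back. BoundedJoin and joinOrInterrupt are hypothetical names for illustration, not anything in JVMClusterUtil, and whether an interrupt would actually unstick a region server in this state is an open question.

```java
public class BoundedJoin {
    /**
     * Waits up to timeoutMs for the thread to die. If it is still alive,
     * interrupts it once and waits up to timeoutMs again.
     * Returns true if the thread has terminated by the time we return.
     */
    public static boolean joinOrInterrupt(Thread t, long timeoutMs) {
        if (timeoutMs <= 0) {
            // join(0) means "wait forever", which is what we want to avoid.
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        try {
            t.join(timeoutMs);
            if (t.isAlive()) {
                t.interrupt();
                t.join(timeoutMs);
            }
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and fall through.
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        // Stand-in for a thread that will not exit on its own.
        Thread stuck = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Exit when interrupted.
            }
        }, "hypothetical-regionserver");
        stuck.setDaemon(true);
        stuck.start();
        System.out.println("terminated=" + joinOrInterrupt(stuck, 200));
    }
}
```

A teardown that uses something like this can at least log which thread refused to die and move on, rather than leaving the whole JVM behind as a zombie.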

But we are also stuck here in setup:

{code}
"LeaseChecker@DFSClient[clientName=DFSClient_1663452662, ugi=jenkins]: java.lang.Throwable: for testing
        at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.toString(DFSClient.java:1393)
        at org.apache.hadoop.util.Daemon.<init>(Daemon.java:38)
        at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.put(DFSClient.java:1306)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:716)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:536)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:443)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:435)
        at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:476)
        at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:361)
        at org.apache.hadoop.hbase.HBaseTestingUtility.createRootDir(HBaseTestingUtility.java:773)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:645)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:627)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:575)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:562)
        at org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort.setUp(TestLogRollAbort.java:102)
{code}

Setup and shutdown were both in progress when the thread dump was taken.

I'm going to disable this test for now so we get clean builds.
                
> Hanging unit tests
> ------------------
>
>                 Key: HBASE-8939
>                 URL: https://issues.apache.org/jira/browse/HBASE-8939
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: stack
>             Fix For: 0.95.2
>
>         Attachments: 8939.txt
>
>
> We have hanging tests.  Here are a few from this morning's review:
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh https://builds.apache.org/job/hbase-0.95-on-hadoop2/176/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100 3300k    0 3300k    0     0   508k      0 --:--:--  0:00:06 --:--:--  621k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
> {code}
> And...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/396/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  779k    0  779k    0     0   538k      0 --:--:--  0:00:01 --:--:--  559k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide3
> {code}
> and....
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-0.95/607/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  445k    0  445k    0     0   490k      0 --:--:-- --:--:-- --:--:--  522k
> Hanging test: Running org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer
> Hanging test: Running org.apache.hadoop.hbase.master.TestAssignmentManager
> Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck
> Hanging test: Running org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
> Hanging test: Running org.apache.hadoop.hbase.IntegrationTestDataIngestSlowDeterministic
> {code}
> and...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh http://54.241.6.143/job/HBase-0.95-Hadoop-2/607/consoleText
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  781k    0  781k    0     0   240k      0 --:--:--  0:00:03 --:--:--  244k
> Hanging test: Running org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
> Hanging test: Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
> {code}

