[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-10-13 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573368#comment-15573368
 ] 

Mikhail Antonov commented on HBASE-16349:
-

Yeah, that would be good. It creates noise on Jenkins builds. We used to have 
1.3 mostly blue some time ago, but the flaky tests creeped in at some point.

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-10-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573354#comment-15573354
 ] 

Ted Yu commented on HBASE-16349:


Do you want it in 1.3 ?

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-10-13 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573346#comment-15573346
 ] 

Mikhail Antonov commented on HBASE-16349:
-

This test seems to be failing on 1.3 recently - see for example 
https://builds.apache.org/job/HBase-1.3-JDK8/org.apache.hbase$hbase-server/43/testReport/junit/org.apache.hadoop.hbase.regionserver/TestClusterId/testClusterId/

cc [~te...@apache.org] seems like this patch was only committed to 1.4 and 
master

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-10-04 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15547426#comment-15547426
 ] 

Mikhail Antonov commented on HBASE-16349:
-

Does it only affect branch-1 but not branch-1.3?

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496909#comment-15496909
 ] 

Hudson commented on HBASE-16349:


FAILURE: Integrated in Jenkins build HBase-1.4 #418 (See 
[https://builds.apache.org/job/HBase-1.4/418/])
HBASE-16349 TestClusterId may hang during cluster shutdown (tedyu: rev 
591cc4cfb8c908556eb6e64811aa7b46fb8bbb7a)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestClusterId.java


> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-09-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496619#comment-15496619
 ] 

Hudson commented on HBASE-16349:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1613 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1613/])
HBASE-16349 TestClusterId may hang during cluster shutdown (tedyu: rev 
2597217ae5aa057e1931c772139ce8cc7a2b3efb)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestClusterId.java


> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495534#comment-15495534
 ] 

Hadoop QA commented on HBASE-16349:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 3s 
{color} | {color:blue} The patch file was not named according to hbase's naming 
conventions. Please see 
https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for 
instructions. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
47s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
48s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
30m 29s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 26s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 129m 14s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hbase.client.TestFromClientSide |
|   | org.apache.hadoop.hbase.client.TestScannerTimeout |
|   | 
org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas |
|   | org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient |
|   | org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828793/16349.v1.txt |
| JIRA Issue | HBASE-16349 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux be07317fd0ef 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / e19632a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/3570/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  

[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-09-15 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495347#comment-15495347
 ] 

Heng Chen commented on HBASE-16349:
---

+1.  Try it.

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: 16349.branch-1.v1.txt, 16349.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-09-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15495322#comment-15495322
 ] 

Ted Yu commented on HBASE-16349:


In this occurrence 
https://builds.apache.org/job/HBase-1.4/jdk=JDK_1_8,label=yahoo-not-h2/416/console
 :
{code}
"Thread-3" #26 prio=5 os_prio=0 tid=0x7fe90c01c000 nid=0xdbd in 
Object.wait() [0x7fe80d2d3000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
- locked <0x000717dea2e0> (a 
org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
at org.apache.hadoop.hbase.util.Threads.shutdown(Threads.java:111)
at org.apache.hadoop.hbase.util.Threads.shutdown(Threads.java:99)
at 
org.apache.hadoop.hbase.regionserver.ShutdownHook$ShutdownHookThread.run(ShutdownHook.java:115)
...
"main" #1 prio=5 os_prio=0 tid=0x7fe9ac009000 nid=0x5e8 in Object.wait() 
[0x7fe9b4de1000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1249)
- locked <0x000717dea2e0> (a 
org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
at java.lang.Thread.join(Thread.java:1323)
at 
org.apache.hadoop.hbase.regionserver.TestClusterId.tearDown(TestClusterId.java:65)
{code}
Looks like ShutdownHook was playing a role in the hang.

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: 16349.branch-1.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
> 

[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-08-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407914#comment-15407914
 ] 

Ted Yu commented on HBASE-16349:


TestClusterId got stuck in tearDown:
https://builds.apache.org/job/HBase-1.4/330/jdk=latest1.7,label=yahoo-not-h2/console

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: 16349.branch-1.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2085): ABORTING region server 
> cn012.l42scl.hortonworks.com,49371,1470249187586: Unrecoverable   
> exception while closing region hbase:meta,,1.1588230740, still finishing close
> org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append 
> sequenceId=8, requesting roll of WAL
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1902)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1754)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1676)
>   at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,603 FATAL [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegionServer(2093): RegionServer abort: loaded coprocessors 
> are: [org.apache.hadoop.hbase.coprocessor.   MultiRowMutationEndpoint]
> {code}
> This led to rst.join() hanging:
> {code}
> "RS:0;cn012:49371" #648 prio=5 os_prio=0 tid=0x7fdab24b5000 nid=0x621a 
> waiting on condition [0x7fd785fe]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.sleep(HRegionServer.java:1326)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:1312)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1082)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-08-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406734#comment-15406734
 ] 

Ted Yu commented on HBASE-16349:


Test failure was due to cluster unable to start:
{code}
org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: The 
region 3043e95376d0c15ccbd93f71fddb69fa was already closing. New CLOSE request 
is ignored.
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2867)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.closeRegion(RSRpcServices.java:1240)
at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22741)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2264)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:118)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:189)
at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:169)

at sun.reflect.GeneratedConstructorAccessor22.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:332)
at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.closeRegion(ProtobufUtil.java:1759)
at 
org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:837)
at 
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1890)
at 
org.apache.hadoop.hbase.master.AssignmentManager.forceRegionStateToOffline(AssignmentManager.java:2014)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1620)
at 
org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:48)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException):
 org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: The 
region 3043e95376d0c15ccbd93f71fddb69fa was already closing. New CLOSE request 
is ignored.
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2867)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.closeRegion(RSRpcServices.java:1240)
at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22741)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2264)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:118)
{code}

> TestClusterId may hang during cluster shutdown
> --
>
> Key: HBASE-16349
> URL: https://issues.apache.org/jira/browse/HBASE-16349
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
> Attachments: 16349.branch-1.v1.txt
>
>
> I was running TestClusterId on branch-1 where I observed the test hang during 
> test tearDown().
> {code}
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1415): Closing hbase:meta,,1.1588230740: disabling 
> compactions & flushes
> 2016-08-03 11:36:39,600 DEBUG [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(1442): Updates disabled for region 
> hbase:meta,,1.1588230740
> 2016-08-03 11:36:39,600 INFO  [RS_CLOSE_META-cn012:49371-0] 
> regionserver.HRegion(2253): Flushing 1/1 column families, memstore=232 B
> 2016-08-03 11:36:39,601 WARN  [RS_OPEN_META-cn012:49371-0.append-pool17-t1] 
> wal.FSHLog$RingBufferEventHandler(1900): Append sequenceId=8, requesting roll 
> of WAL
> java.io.IOException: All datanodes 
> DatanodeInfoWithStorage[127.0.0.1:37765,DS-9870993e-fb98-45fc-b151-708f72aa02d2,DISK]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1113)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
> 2016-08-03 11:36:39,602 FATAL [RS_CLOSE_META-cn012:49371-0] 
> 

[jira] [Commented] (HBASE-16349) TestClusterId may hang during cluster shutdown

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406725#comment-15406725
 ] 

Hadoop QA commented on HBASE-16349:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 2s 
{color} | {color:blue} The patch file was not named according to hbase's naming 
conventions. Please see 
https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for 
instructions. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
4s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} branch-1 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
12s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
59s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s 
{color} | {color:green} branch-1 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
17m 22s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 52s {color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 122m 54s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestClusterId |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821914/16349.branch-1.v1.txt 
|
| JIRA Issue | HBASE-16349 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |