[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky

2018-10-12 Thread Vinayakumar B (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-13945:
---------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.3.0
                   3.1.2
                   3.2.0
           Status: Resolved  (was: Patch Available)

Committed to trunk, branch-3.2, and branch-3.1.

> TestDataNodeVolumeFailure is Flaky
> ----------------------------------
>
>                 Key: HDFS-13945
>                 URL: https://issues.apache.org/jira/browse/HDFS-13945
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>             Fix For: 3.2.0, 3.1.2, 3.3.0
>
>         Attachments: HDFS-13945-01.patch, HDFS-13945-02.patch, HDFS-13945-03.patch
>
> The test has been failing in trunk for a long time.
>
> Reference -
> [https://builds.apache.org/job/PreCommit-HDFS-Build/25140/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/25135/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/25133/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
> [https://builds.apache.org/job/PreCommit-HDFS-Build/25104/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
>
> Stack Trace -
>
> Timed out waiting for condition. Thread diagnostics: Timestamp: 2018-09-26 
> 03:32:07,162 "IPC Server handler 2 on 33471" daemon prio=5 tid=2931 
> timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) 
> at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at 
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) "IPC Server 
> handler 3 on 34285" daemon prio=5 tid=2646 timed_waiting 
> java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) 
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) 
> at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at 
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) 
> "org.apache.hadoop.util.JvmPauseMonitor$Monitor@1d2ee4cd" daemon prio=5 
> tid=2633 timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> java.lang.Thread.sleep(Native Method) at 
> org.apache.hadoop.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:192) 
> at java.lang.Thread.run(Thread.java:748) "IPC Server Responder" daemon prio=5 
> tid=2766 runnable java.lang.Thread.State: RUNNABLE at 
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at 
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at 
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at 
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at 
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at 
> org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1334) at 
> org.apache.hadoop.ipc.Server$Responder.run(Server.java:1317) 
> "org.eclipse.jetty.server.session.HashSessionManager@1287fc65Timer" daemon 
> prio=5 tid=2492 timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
>  at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) 
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748) "qtp548667392-2533" daemon prio=5 
> tid=2533 timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> sun.misc.Unsafe.park(Native Method) at 
> 

[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky

2018-09-28 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13945:

Attachment: HDFS-13945-03.patch

> TestDataNodeVolumeFailure is Flaky
> ----------------------------------
>
>                 Key: HDFS-13945
>                 URL: https://issues.apache.org/jira/browse/HDFS-13945
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-13945-01.patch, HDFS-13945-02.patch, HDFS-13945-03.patch
>

[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky

2018-09-27 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13945:

Attachment: HDFS-13945-02.patch

> TestDataNodeVolumeFailure is Flaky
> ----------------------------------
>
>                 Key: HDFS-13945
>                 URL: https://issues.apache.org/jira/browse/HDFS-13945
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-13945-01.patch, HDFS-13945-02.patch
>

[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky

2018-09-27 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13945:

Description: 
The test has been failing in trunk for a long time.

Reference -

[https://builds.apache.org/job/PreCommit-HDFS-Build/25140/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
[https://builds.apache.org/job/PreCommit-HDFS-Build/25135/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
[https://builds.apache.org/job/PreCommit-HDFS-Build/25133/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
[https://builds.apache.org/job/PreCommit-HDFS-Build/25104/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]

Stack Trace -

Timed out waiting for condition. Thread diagnostics:
Timestamp: 2018-09-26 03:32:07,162

"IPC Server handler 2 on 33471" daemon prio=5 tid=2931 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668)

"IPC Server handler 3 on 34285" daemon prio=5 tid=2646 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668)

"org.apache.hadoop.util.JvmPauseMonitor$Monitor@1d2ee4cd" daemon prio=5 tid=2633 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:192)
        at java.lang.Thread.run(Thread.java:748)

"IPC Server Responder" daemon prio=5 tid=2766 runnable
java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1334)
        at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1317)

"org.eclipse.jetty.server.session.HashSessionManager@1287fc65Timer" daemon prio=5 tid=2492 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

"qtp548667392-2533" daemon prio=5 tid=2533 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
        at java.lang.Thread.run(Thread.java:748)

"org.apache.hadoop.hdfs.server.namenode.FSNamesystem$NameNodeResourceMonitor@63818b03" daemon prio=5 tid=2521 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at java.lang.Thread.sleep(Native Method)
        at

[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky

2018-09-27 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13945:

Attachment: HDFS-13945-01.patch

> TestDataNodeVolumeFailure is Flaky
> ----------------------------------
>
>                 Key: HDFS-13945
>                 URL: https://issues.apache.org/jira/browse/HDFS-13945
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-13945-01.patch
>
> The test has been failing in trunk for a long time.
>
> Reference -
> [https://builds.apache.org/job/PreCommit-HDFS-Build/25140/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/]
>
> Stack Trace -
>
> Timed out waiting for condition. Thread diagnostics: Timestamp: 2018-09-26 
> 03:32:07,162 "IPC Server handler 2 on 33471" daemon prio=5 tid=2931 
> timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) 
> at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at 
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) "IPC Server 
> handler 3 on 34285" daemon prio=5 tid=2646 timed_waiting 
> java.lang.Thread.State: TIMED_WAITING at sun.misc.Unsafe.park(Native Method) 
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) 
> at org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:288) at 
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2668) 
> "org.apache.hadoop.util.JvmPauseMonitor$Monitor@1d2ee4cd" daemon prio=5 
> tid=2633 timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> java.lang.Thread.sleep(Native Method) at 
> org.apache.hadoop.util.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:192) 
> at java.lang.Thread.run(Thread.java:748) "IPC Server Responder" daemon prio=5 
> tid=2766 runnable java.lang.Thread.State: RUNNABLE at 
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at 
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at 
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at 
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at 
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at 
> org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1334) at 
> org.apache.hadoop.ipc.Server$Responder.run(Server.java:1317) 
> "org.eclipse.jetty.server.session.HashSessionManager@1287fc65Timer" daemon 
> prio=5 tid=2492 timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
>  at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) 
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748) "qtp548667392-2533" daemon prio=5 
> tid=2533 timed_waiting java.lang.Thread.State: TIMED_WAITING at 
> sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>  at 
> org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392) 
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)
>  at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)
>  at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  at java.lang.Thread.run(Thread.java:748) 
> "org.apache.hadoop.hdfs.server.namenode.FSNamesystem$NameNodeResourceMonitor@63818b03"
>  daemon prio=5 tid=2521 timed_waiting java.lang.Thread.State: TIMED_WAITING 
> at java.lang.Thread.sleep(Native Method) at 
> 

[jira] [Updated] (HDFS-13945) TestDataNodeVolumeFailure is Flaky

2018-09-27 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13945:

Status: Patch Available  (was: Open)

> TestDataNodeVolumeFailure is Flaky
> ----------------------------------
>
>                 Key: HDFS-13945
>                 URL: https://issues.apache.org/jira/browse/HDFS-13945
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>         Attachments: HDFS-13945-01.patch
>