[ 
https://issues.apache.org/jira/browse/HADOOP-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762166#comment-13762166
 ] 

Anuj Ojha commented on HADOOP-5707:
-----------------------------------

This looks to be really old bug but I don't see any resolution for this. Has 
this bug been fixed or does anyone know solution to this? I am facing the same 
issue where the MR child is hung.
                
> Hung MR child when closing file systems
> ---------------------------------------
>
>                 Key: HADOOP-5707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5707
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Ben Maurer
>
> I recently found a number of MR processes that had been launched days ago and 
> were stuck in the following situation:
> {quote}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (11.0-b16 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x00000000559ef400 nid=0x2d22 waiting on 
> condition [0x0000000000000000..0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "SIGTERM handler" daemon prio=10 tid=0x00000000563d8400 nid=0x5651 waiting 
> for monitor entry [0x00000000410a5000..0x00000000410a5d10]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at java.lang.Shutdown.exit(Shutdown.java:178)
>       - waiting to lock <0x00002aaaae3a8ed0> (a java.lang.Class for 
> java.lang.Shutdown)
>       at java.lang.Terminator$1.handle(Terminator.java:35)
>       at sun.misc.Signal$1.run(Signal.java:195)
>       at java.lang.Thread.run(Thread.java:619)
> "Thread-2" daemon prio=10 tid=0x00000000563d5000 nid=0x53e5 
> sleeping[0x00000000419db000..0x00000000419dbd90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>       at java.lang.Thread.sleep(Native Method)
>       at org.apache.hadoop.ipc.Client.stop(Client.java:668)
>       at org.apache.hadoop.ipc.RPC$ClientCache.stopClient(RPC.java:189)
>       at org.apache.hadoop.ipc.RPC$ClientCache.access$400(RPC.java:138)
>       at org.apache.hadoop.ipc.RPC$Invoker.close(RPC.java:229)
>       - locked <0x00002aaab9b0eee0> (a org.apache.hadoop.ipc.RPC$Invoker)
>       at org.apache.hadoop.ipc.RPC$Invoker.access$500(RPC.java:196)
>       at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:382)
>       at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:212)
>       - locked <0x00002aaab9b0ee10> (a org.apache.hadoop.hdfs.DFSClient)
>       at 
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:264)
>       at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
>       - locked <0x00002aaab9aa9100> (a org.apache.hadoop.fs.FileSystem$Cache)
>       at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
>       at 
> org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
>       - locked <0x00002aaab9a90698> (a 
> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
> "DestroyJavaVM" prio=10 tid=0x000000005589f400 nid=0x4c40 in Object.wait() 
> [0x0000000041cc9000..0x0000000041cc9d40]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00002aaab9a90698> (a 
> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>       at java.lang.Thread.join(Thread.java:1143)
>       - locked <0x00002aaab9a90698> (a 
> org.apache.hadoop.fs.FileSystem$ClientFinalizer)
>       at java.lang.Thread.join(Thread.java:1196)
>       at 
> java.lang.ApplicationShutdownHooks.run(ApplicationShutdownHooks.java:79)
>       at java.lang.Shutdown.runHooks(Shutdown.java:89)
>       at java.lang.Shutdown.sequence(Shutdown.java:133)
>       at java.lang.Shutdown.shutdown(Shutdown.java:200)
>       - locked <0x00002aaaae3a8ed0> (a java.lang.Class for java.lang.Shutdown)
> "SpillThread" daemon prio=10 tid=0x00002aaac440dc00 nid=0x4c81 waiting on 
> condition [0x0000000041adc000..0x0000000041adcb90]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00002aaab9aaf860> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:882)
> "Comm thread for attempt_200904081633_0288_m_000430_2" daemon prio=10 
> tid=0x0000000055e12000 nid=0x4c80 waiting for monitor entry 
> [0x0000000042a40000..0x0000000042a40a10]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at java.lang.Shutdown.exit(Shutdown.java:178)
>       - waiting to lock <0x00002aaaae3a8ed0> (a java.lang.Class for 
> java.lang.Shutdown)
>       at java.lang.Runtime.exit(Runtime.java:90)
>       at java.lang.System.exit(System.java:906)
>       at org.apache.hadoop.mapred.Task$1.run(Task.java:441)
>       at java.lang.Thread.run(Thread.java:619)
> "CompilerThread1" daemon prio=10 tid=0x000000005593a400 nid=0x4c4e waiting on 
> condition [0x0000000000000000..0x0000000040fa3450]
>    java.lang.Thread.State: RUNNABLE
> "CompilerThread0" daemon prio=10 tid=0x0000000055936400 nid=0x4c4d waiting on 
> condition [0x0000000000000000..0x0000000040ea24d0]
>    java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" daemon prio=10 tid=0x0000000055934800 nid=0x4c4c runnable 
> [0x0000000000000000..0x0000000040da2a60]
>    java.lang.Thread.State: RUNNABLE
> "Finalizer" daemon prio=10 tid=0x0000000055911800 nid=0x4c4b in Object.wait() 
> [0x000000004283e000..0x000000004283eb90]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00002aaab9a72d40> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
>       - locked <0x00002aaab9a72d40> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
>       at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
> "Reference Handler" daemon prio=10 tid=0x000000005590fc00 nid=0x4c4a in 
> Object.wait() [0x000000004273d000..0x000000004273da10]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00002aaab9a78de0> (a java.lang.ref.Reference$Lock)
>       at java.lang.Object.wait(Object.java:485)
>       at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>       - locked <0x00002aaab9a78de0> (a java.lang.ref.Reference$Lock)
> "VM Thread" prio=10 tid=0x000000005590a400 nid=0x4c49 runnable 
> "GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000558a9c00 nid=0x4c41 
> runnable 
> "GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000558ab800 nid=0x4c42 
> runnable 
> "GC task thread#2 (ParallelGC)" prio=10 tid=0x00000000558ad000 nid=0x4c43 
> runnable 
> "GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000558ae800 nid=0x4c44 
> runnable 
> "GC task thread#4 (ParallelGC)" prio=10 tid=0x00000000558b0400 nid=0x4c45 
> runnable 
> "GC task thread#5 (ParallelGC)" prio=10 tid=0x00000000558b1c00 nid=0x4c46 
> runnable 
> "GC task thread#6 (ParallelGC)" prio=10 tid=0x00000000558b3400 nid=0x4c47 
> runnable 
> "GC task thread#7 (ParallelGC)" prio=10 tid=0x00000000558b5000 nid=0x4c48 
> runnable 
> "VM Periodic Task Thread" prio=10 tid=0x000000005593f400 nid=0x4c50 waiting 
> on condition 
> JNI global references: 1012
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to