Ryan Wu created HDFS-15487:
------------------------------

             Summary: ScriptBasedMapping leads to 100% CPU utilization
                 Key: HDFS-15487
                 URL: https://issues.apache.org/jira/browse/HDFS-15487
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Ryan Wu


We found that NameNode CPU utilization sometimes reaches 90%, which causes the 
NameNode to hang. The reason is that Flink apps on k8s access HDFS at the same 
time, but their IPs and host names are not fixed, so the topology script is run 
for many of them at the same time. The jstack output also shows that several 
hundred python processes were started.
{code:java}
// "process reaper" #36159 daemon prio=10 os_prio=0 tid=0x00007fa7a33fa7a0 nid=0xa3cb waiting on condition [0x00007fa7a61dc000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00007fb4094a0398> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}
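
To illustrate why clients with changing addresses are costly here, the following is a simplified model, not Hadoop's actual code. It assumes that topology lookups are cached per host name or IP and that every cache miss forks the topology script once. The class `TopologyCacheSketch`, the nested `CachingResolver`, the `pod-N` host names and the rack name `/default-rack` are all hypothetical and exist only for this sketch; the script call is replaced by a counter.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified model of a cached, script-backed rack lookup (not Hadoop code). */
public class TopologyCacheSketch {

    static class CachingResolver {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final AtomicInteger scriptInvocations = new AtomicInteger();

        /** Stand-in for forking the topology script; it only counts the call. */
        private String runScript(String host) {
            scriptInvocations.incrementAndGet();
            return "/default-rack"; // hypothetical rack name
        }

        /** Returns the cached rack, running the "script" only on a cache miss. */
        String resolve(String host) {
            return cache.computeIfAbsent(host, this::runScript);
        }

        int scriptInvocations() {
            return scriptInvocations.get();
        }
    }

    public static void main(String[] args) {
        // A client with a fixed address: one miss, then cache hits.
        CachingResolver fixed = new CachingResolver();
        for (int i = 0; i < 500; i++) {
            fixed.resolve("10.0.0.1");
        }

        // Clients whose host names keep changing: every lookup is a miss.
        CachingResolver churning = new CachingResolver();
        for (int i = 0; i < 500; i++) {
            churning.resolve("pod-" + i);
        }

        System.out.println("fixed address:     " + fixed.scriptInvocations());
        System.out.println("changing hostname: " + churning.scriptInvocations());
    }
}
```

In this model the fixed address triggers a single script run, while 500 distinct host names trigger 500, which matches the pattern of several hundred python processes described above.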



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
