[ 
https://issues.apache.org/jira/browse/HADOOP-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036717#comment-17036717
 ] 

Akira Ajisaka commented on HADOOP-16850:
----------------------------------------

The test results on my virtual machine (CentOS 7) are as follows:
{noformat}
#Threads=100, ThreadMXBean=346175 ns, ThreadGroup=77695 ns, ratio: 4
#Threads=200, ThreadMXBean=1018005 ns, ThreadGroup=76144 ns, ratio: 13
#Threads=500, ThreadMXBean=4421633 ns, ThreadGroup=185276 ns, ratio: 23
#Threads=1000, ThreadMXBean=15505098 ns, ThreadGroup=221509 ns, ratio: 69
#Threads=2000, ThreadMXBean=63139330 ns, ThreadGroup=399617 ns, ratio: 157
#Threads=3000, ThreadMXBean=161553471 ns, ThreadGroup=559847 ns, ratio: 288
{noformat}

> Support getting thread info from thread group for JvmMetrics to improve the 
> performance
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16850
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16850
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 2.8.6, 2.9.3, 3.1.4, 3.2.2, 2.10.1, 3.3.1
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>         Attachments: HADOOP-16850.001.patch, HADOOP-16850.002.patch
>
>
> Recently we found that a JMX request took 5 s or more to complete when there 
> were 10,000+ threads in a stressed DataNode process. Meanwhile, other HTTP 
> requests were blocked and some disk operations were affected (we saw many 
> "Slow manageWriterOsCache" messages in the DN log, and these messages were 
> rarely seen again after we stopped sending JMX requests).
> The excessive time is spent getting thread info via ThreadMXBean, which 
> calls the native method ThreadImpl#getThreadInfo. According to 
> [JDK-8185005|https://bugs.openjdk.java.net/browse/JDK-8185005], the time 
> complexity of ThreadImpl#getThreadInfo is O(n*n), and it holds a global 
> thread lock that prevents the creation or termination of threads.
> To improve this, I propose getting thread info from the thread group by 
> default, which is much faster, while still supporting the original approach 
> when "-Dhadoop.metrics.jvm.use-thread-mxbean=true" is set in the startup 
> command.
> An example of performance tests comparing the two approaches is as follows:
> {noformat}
> #Threads=100, ThreadMXBean=382372 ns, ThreadGroup=72046 ns, ratio: 5
> #Threads=200, ThreadMXBean=776619 ns, ThreadGroup=83875 ns, ratio: 9
> #Threads=500, ThreadMXBean=3392954 ns, ThreadGroup=216269 ns, ratio: 15
> #Threads=1000, ThreadMXBean=9475768 ns, ThreadGroup=220447 ns, ratio: 42
> #Threads=2000, ThreadMXBean=53833729 ns, ThreadGroup=579608 ns, ratio: 92
> #Threads=3000, ThreadMXBean=196829971 ns, ThreadGroup=1157670 ns, ratio: 170
> {noformat}
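
For readers unfamiliar with the two approaches compared above, the following is a minimal sketch of how thread states could be counted either way. It is not the code from the attached patches: the class name ThreadStateCounts and its method names are hypothetical, and reading the "hadoop.metrics.jvm.use-thread-mxbean" property with Boolean.getBoolean is an assumption about how such a switch might be wired. Only standard JDK calls are used.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration class; not taken from the HADOOP-16850 patches.
final class ThreadStateCounts {

  // Counts thread states through ThreadMXBean (the original approach).
  static Map<Thread.State, Integer> viaThreadMXBean() {
    ThreadMXBean mxBean = ManagementFactory.getThreadMXBean();
    Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
    // maxDepth 0 requests no stack frames; an entry is null if its thread has exited.
    for (ThreadInfo info : mxBean.getThreadInfo(mxBean.getAllThreadIds(), 0)) {
      if (info != null) {
        counts.merge(info.getThreadState(), 1, Integer::sum);
      }
    }
    return counts;
  }

  // Counts thread states by enumerating the root thread group (the proposed approach).
  static Map<Thread.State, Integer> viaThreadGroup() {
    ThreadGroup root = Thread.currentThread().getThreadGroup();
    while (root.getParent() != null) {
      root = root.getParent();
    }
    // activeCount() is only an estimate, so grow the array until enumerate() no longer fills it.
    Thread[] threads = new Thread[root.activeCount() * 2 + 16];
    int n;
    while ((n = root.enumerate(threads, true)) >= threads.length) {
      threads = new Thread[threads.length * 2];
    }
    Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
    for (int i = 0; i < n; i++) {
      counts.merge(threads[i].getState(), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Start some idle daemon threads so there is something to count.
    for (int i = 0; i < 200; i++) {
      Thread t = new Thread(() -> {
        try {
          Thread.sleep(60_000);
        } catch (InterruptedException e) {
          // Interrupted: just let the thread exit.
        }
      });
      t.setDaemon(true);
      t.start();
    }

    // Assumption: the switch from the issue description is read as a boolean system property.
    boolean useMxBean = Boolean.getBoolean("hadoop.metrics.jvm.use-thread-mxbean");

    long start = System.nanoTime();
    Map<Thread.State, Integer> counts = useMxBean ? viaThreadMXBean() : viaThreadGroup();
    long elapsed = System.nanoTime() - start;

    System.out.println((useMxBean ? "ThreadMXBean" : "ThreadGroup")
        + "=" + elapsed + " ns, states=" + counts);
  }
}
```

Running the class once with and once without -Dhadoop.metrics.jvm.use-thread-mxbean=true gives a rough single-shot timing for each path. A single run is noisy and includes JIT warm-up, so its numbers are not comparable to the tables above.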



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
