[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793297#comment-13793297 ]
Binglin Chang commented on HDFS-5276:
-------------------------------------

I ran a micro-benchmark on FileSystem.Statistics alone. The results look great: the thread-local approach appears to carry very little performance penalty.

Without patch:
Thread 1, Time: 1107
Thread 2, Time: 11391
Thread 3, Time: 23813
Thread 4, Time: 37780

With patch:
Thread 1, Time: 901
Thread 2, Time: 1056
Thread 3, Time: 2473
Thread 4, Time: 2525
Thread 5, Time: 2689
Thread 6, Time: 2634
Thread 7, Time: 2938
Thread 8, Time: 3499
Thread 9, Time: 3551

My test environment (i7, 4 cores, 8 threads with hyper-threading) should scale linearly up to 4 threads, so I don't know why we still see a 2x slowdown at 3 and 4 threads. I don't have a test environment with more cores; maybe [~chengxiang li] can provide more results? Test code is attached.

> FileSystem.Statistics got performance issue on multi-thread read/write.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-5276
>                 URL: https://issues.apache.org/jira/browse/HDFS-5276
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.4-alpha
>            Reporter: Chengxiang Li
>            Assignee: Colin Patrick McCabe
>         Attachments: DisableFSReadWriteBytesStat.patch, HDFS-5276.001.patch,
> HDFS-5276.002.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG,
> ThreadLocalStat.patch
>
>
> FileSystem.Statistics is a singleton variable for each FS scheme, and each
> read/write on HDFS leads to an AtomicLong.getAndAdd(). AtomicLong does
> not perform well with many threads (say, more than 30), so it may
> cause a serious performance issue. While profiling our Spark test, with 32 threads
> reading data from HDFS, about 70% of CPU time was spent in
> FileSystem.Statistics.incrementBytesRead().
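
For readers without the attachments, the contention described in the issue can be illustrated with a minimal sketch. This is not the attached HDFSStatisticTest.java and not the code from any of the patches; the class and method names (StatsCounterSketch, SharedCounter, ThreadLocalCounter, timeMillis) are hypothetical. It assumes the baseline is a single shared AtomicLong updated by every thread, and that the thread-local variant gives each thread its own cell which readers sum on demand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the Hadoop implementation.
class StatsCounterSketch {

    interface Counter {
        void incrementBytesRead(long n);
        long getBytesRead();
    }

    /** Baseline: every thread calls getAndAdd() on one shared AtomicLong. */
    static final class SharedCounter implements Counter {
        private final AtomicLong bytesRead = new AtomicLong();

        public void incrementBytesRead(long n) {
            bytesRead.getAndAdd(n);
        }

        public long getBytesRead() {
            return bytesRead.get();
        }
    }

    /** Alternative: each thread writes only its own cell; readers sum all cells. */
    static final class ThreadLocalCounter implements Counter {
        private static final class Cell {
            volatile long value;
        }

        // Cells are kept here so their counts survive after a thread exits.
        // This sketch never removes them.
        private final ConcurrentLinkedQueue<Cell> cells = new ConcurrentLinkedQueue<>();

        private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
            Cell c = new Cell();
            cells.add(c);
            return c;
        });

        public void incrementBytesRead(long n) {
            Cell c = local.get();
            // Only the owning thread writes this cell, so no compare-and-swap is needed.
            c.value = c.value + n;
        }

        public long getBytesRead() {
            long sum = 0;
            for (Cell c : cells) {
                sum += c.value;
            }
            return sum;
        }
    }

    /** Runs `threads` workers, each doing `opsPerThread` increments; returns elapsed ms. */
    static long timeMillis(Counter counter, int threads, int opsPerThread) {
        List<Thread> workers = new ArrayList<>();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    counter.incrementBytesRead(1);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int ops = 10_000_000;
        for (int threads = 1; threads <= 4; threads++) {
            long shared = timeMillis(new SharedCounter(), threads, ops);
            long perThread = timeMillis(new ThreadLocalCounter(), threads, ops);
            System.out.println("Thread " + threads + ", shared: " + shared
                    + " ms, thread-local: " + perThread + " ms");
        }
    }
}
```

Timings from a loop like this depend heavily on the JVM, warm-up, and hardware, so they should not be compared directly with the figures above. The design trade-off is that increments become uncontended while reads become a sum over all cells, which suits statistics that are written far more often than they are read.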