[ 
https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793430#comment-13793430
 ] 

Haohui Mai commented on HDFS-5276:
----------------------------------

[~decster], that's quite impressive. It might be worthwhile to generalize the 
approach.

But before that, I'm wondering what the performance would look like if you 
simply marked the variable as volatile. The performance might not be as good 
as the thread-local approach, since the JVM specification enforces memory 
fences on accesses to volatile variables, but it is the simplest approach and 
it can be easily generalized across the code base.
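
To make the comparison concrete, here is a rough sketch of the three variants 
as I understand them. All class and method names below are made up for 
illustration; they are not taken from FileSystem.Statistics or from the 
attached patches. One caveat worth keeping in mind: `+=` on a volatile long is 
a separate read and write, so with several writer threads it can lose updates. 
The volatile variant is only exact with a single writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
public class StatCounters {

    /** Baseline: every thread does an atomic read-modify-write on one shared word. */
    static final class SharedAtomicCounter {
        private final AtomicLong bytesRead = new AtomicLong();
        void increment(long n) { bytesRead.getAndAdd(n); }
        long get() { return bytesRead.get(); }
    }

    /** Volatile field: simplest to write, but += is NOT atomic across threads. */
    static final class VolatileCounter {
        private volatile long bytesRead;
        void increment(long n) { bytesRead += n; } // exact only with one writer
        long get() { return bytesRead; }
    }

    /** Thread-local: each thread updates its own cell; readers sum all cells. */
    static final class PerThreadCounter {
        private static final class Cell { volatile long value; }
        // Cells are kept for the life of the counter so that counts from
        // finished threads are not lost (a simplification for this sketch).
        private final ConcurrentLinkedQueue<Cell> cells = new ConcurrentLinkedQueue<>();
        private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
            Cell c = new Cell();
            cells.add(c);
            return c;
        });
        void increment(long n) {
            Cell c = local.get();
            c.value = c.value + n; // safe: only the owning thread writes this cell
        }
        long get() {
            long sum = 0;
            for (Cell c : cells) sum += c.value;
            return sum;
        }
    }

    /** Runs op perThread times on each of the given number of threads; returns elapsed nanoseconds. */
    static long runThreads(int threads, int perThread, Runnable op) {
        List<Thread> workers = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < perThread; j++) op.run();
            });
            workers.add(t);
            t.start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int threads = 32, perThread = 1_000_000;
        SharedAtomicCounter atomic = new SharedAtomicCounter();
        PerThreadCounter perThreadCounter = new PerThreadCounter();
        long tAtomic = runThreads(threads, perThread, () -> atomic.increment(1));
        long tLocal = runThreads(threads, perThread, () -> perThreadCounter.increment(1));
        System.out.println("atomic:       " + atomic.get() + " in " + tAtomic / 1_000_000 + " ms");
        System.out.println("thread-local: " + perThreadCounter.get() + " in " + tLocal / 1_000_000 + " ms");
    }
}
```

This is not a proper benchmark (no warm-up, no JMH), so the timings are only 
a rough indication; the numbers that matter are the ones from your own test 
setup.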

I would appreciate it if you could test the volatile approach on the same 
setup and report the results. We'll then have a much better understanding of 
which approach we should use to write the statistics code.

Thanks!

> FileSystem.Statistics got performance issue on multi-thread read/write.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-5276
>                 URL: https://issues.apache.org/jira/browse/HDFS-5276
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.4-alpha
>            Reporter: Chengxiang Li
>            Assignee: Colin Patrick McCabe
>         Attachments: DisableFSReadWriteBytesStat.patch, HDFS-5276.001.patch, 
> HDFS-5276.002.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG, 
> TestFileSystemStatistics.java, ThreadLocalStat.patch
>
>
> FileSystem.Statistics is a singleton variable for each FS scheme, so each 
> read/write on HDFS leads to an AtomicLong.getAndAdd() call. AtomicLong does 
> not perform well under multi-threaded contention (say, more than 30 
> threads), so it may cause a serious performance issue. While profiling our 
> Spark test, in which 32 threads read data from HDFS, about 70% of CPU time 
> was spent in FileSystem.Statistics.incrementBytesRead().



--
This message was sent by Atlassian JIRA
(v6.1#6144)
