[ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794740#comment-13794740 ]
Colin Patrick McCabe commented on HDFS-5276: -------------------------------------------- thanks for the review, Andrew. bq. FileSystem#reset is now resetting all the statistics, not just bytes written and read. This makes sense to me. It looks like the current behavior is just an oversight from when the additional statistics were added in HADOOP-6859; maybe Suresh Srinivas as the original author can comment. Yeah, that's what I think too. Agree that it would be nice to get some comments on that. bq. Resetting by taking the current total, #negate-ing, then adding is not immediately obvious to the reader. Is there a reason we don't just have a #reset method and reset each StatisticData in turn? There is a reason. We can't modify the thread-local data... only the threads themselves can do that, or else there's a race condition. I added a comment explaining this better. bq. I'd also prefer just to not reuse the visitor / visitAll [in reset] too, since it's kind of weird semantically: #total() mutates the total StatisticsData member, and it depends on #total() being called exactly once at the end. I actually thought it was a nice example of code reuse. Iterating over all the {{StatisticsData}} objects is complex and I would not want to duplicate that code. I added a comment to {{visitAll}} that describes the fact that {{aggregate}} (previously called {{total}}) is called exactly once at the end-- hopefully that makes it more obvious. > FileSystem.Statistics got performance issue on multi-thread read/write. > ----------------------------------------------------------------------- > > Key: HDFS-5276 > URL: https://issues.apache.org/jira/browse/HDFS-5276 > Project: Hadoop HDFS > Issue Type: Bug > Affects Versions: 2.0.4-alpha > Reporter: Chengxiang Li > Assignee: Colin Patrick McCabe > Attachments: DisableFSReadWriteBytesStat.patch, HDFS-5276.001.patch, > HDFS-5276.002.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG, > TestFileSystemStatistics.java, ThreadLocalStat.patch > > > FileSystem.Statistics is a singleton variable for each FS scheme, each > read/write on HDFS would lead to a AutomicLong.getAndAdd(). AutomicLong does > not perform well in multi-threads(let's say more than 30 threads). so it may > cause serious performance issue. during our spark test profile, 32 threads > read data from HDFS, about 70% cpu time is spent on > FileSystem.Statistics.incrementBytesRead(). -- This message was sent by Atlassian JIRA (v6.1#6144)