[ https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523933#comment-16523933 ]
Todd Lipcon commented on HADOOP-15124:
--------------------------------------

Before seeing this JIRA I also happened to have spent some time on this same perf issue. My approach was just to micro-optimize the existing stats implementation:
- use a simple array to iterate over the FS stats instead of iterating over a HashSet (the latter involves a much more complex iterator)
- out-of-line the unlikely path for threadlocal (improve inlining)
- get rid of the visitor abstraction for visiting stats objects (it wasn't getting escape-analyzed out or inlined, and was also causing actual boxing of Longs)

In my teragen tests this also reduced the statistics to a small fraction of the profile. I didn't compare vs a LongAdder approach, though.

My patch is at: https://github.com/toddlipcon/hadoop-common/commit/e5bedddabbb9e8729b2f58165f0849c30e2be346

> Slow FileSystem.Statistics counters implementation
> --------------------------------------------------
>
>                 Key: HADOOP-15124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15124
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common
>   Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0
>            Reporter: Igor Dvorzhak
>            Assignee: Igor Dvorzhak
>            Priority: Major
>              Labels: common, filesystem, fs, statistics
>         Attachments: HADOOP-15124.001.patch
>
>
> While profiling 1TB TeraGen job on Hadoop 2.8.2 cluster (Google Dataproc, 2 workers, GCS connector) I saw that FileSystem.Statistics code paths Wall time is 5.58% and CPU time is 26.5% of total execution time.
> After switching FileSystem.Statistics implementation to LongAdder, consumed Wall time decreased to 0.006% and CPU time to 0.104% of total execution time.
> Total job runtime decreased from 66 mins to 61 mins.
> These results are not conclusive, because I didn't benchmark multiple times to average results, but regardless of performance gains switching to LongAdder simplifies code and reduces its complexity.
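
For readers unfamiliar with the LongAdder approach named in the issue description, the following is a minimal sketch of what a LongAdder-backed counter holder might look like. The class name `AdderStatistics` and its method names are hypothetical and chosen only for illustration; this is not the code from HADOOP-15124.001.patch, nor the micro-optimized version in the linked commit.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only: not the actual FileSystem.Statistics class
// and not the contents of HADOOP-15124.001.patch.
final class AdderStatistics {
    // LongAdder spreads concurrent updates across internal cells,
    // so writers on different threads rarely contend with each other.
    private final LongAdder bytesRead = new LongAdder();
    private final LongAdder bytesWritten = new LongAdder();

    // Hot path: called on every read.
    void incrementBytesRead(long count) {
        bytesRead.add(count);
    }

    // Hot path: called on every write.
    void incrementBytesWritten(long count) {
        bytesWritten.add(count);
    }

    // Cold path: sum() folds all cells together. The result is not an
    // atomic snapshot if other threads are updating concurrently.
    long getBytesRead() {
        return bytesRead.sum();
    }

    long getBytesWritten() {
        return bytesWritten.sum();
    }
}
```

The trade-off in this sketch is that updates are cheap while reads are comparatively expensive and only approximately consistent under concurrent writes, which tends to suit counters that are incremented often and read rarely.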