[
https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299613#comment-16299613
]
Igor Dvorzhak commented on HADOOP-15124:
----------------------------------------
Thank you for the feedback.
I would like to migrate FileSystem.Statistics to the new StorageStatistics
backend. I will familiarize myself with the StorageStatistics code and see
where it is best to start.
Meanwhile, I have reverted the changes to the public interface in my PR; it
now uses both ThreadLocal and LongAdder.
With this change there is no improvement in statistics write performance, and
there is even a small penalty, but it should be negligible because
ThreadLocal.get is much more expensive than LongAdder.add. Still, this change
gets rid of all the complicated, synchronized logic for statistics reads,
which decreases the Wall time of the Statistics code from 6.49% to 1.06% in a
1TB TeraGen job (CPU time increased to 29.4%, though total runtime still
decreased from 66 to 62 minutes).
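To illustrate the combined approach, here is a minimal sketch (not the code
from the PR). The class and method names (SketchStatistics, ThreadData,
incrementBytesRead, and so on) are hypothetical stand-ins, and only one
counter is shown. Each write updates a plain per-thread counter through
ThreadLocal and a shared LongAdder, so an aggregate read is just
LongAdder.sum() with no synchronization:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for FileSystem.Statistics; names are illustrative only.
final class SketchStatistics {
    // Plain per-thread counter: only the owning thread ever touches it.
    static final class ThreadData {
        long bytesRead;
    }

    private final ThreadLocal<ThreadData> threadData =
        ThreadLocal.withInitial(ThreadData::new);

    // Aggregate counter shared by all threads.
    private final LongAdder totalBytesRead = new LongAdder();

    void incrementBytesRead(long n) {
        threadData.get().bytesRead += n; // per-thread view (ThreadLocal.get)
        totalBytesRead.add(n);           // aggregate view (LongAdder.add)
    }

    // Aggregate read: no locks and no iteration over per-thread data.
    long getBytesRead() {
        return totalBytesRead.sum();
    }

    // Per-thread read for the calling thread.
    long getThreadBytesRead() {
        return threadData.get().bytesRead;
    }
}
```

The write path pays for both ThreadLocal.get and LongAdder.add, which matches
the small write penalty mentioned above.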
I think it could make sense to submit this PR before the migration to
StorageStatistics, because it could be backported to the 3.0 and 3.1 branches
and provides some performance benefits.
Additionally, I think that while per-thread statistics are useful, they are
not used in regular production job runs (I assume they are more valuable for
performance tuning and bottleneck debugging). That is why we could improve
statistics write performance by introducing a property that disables
per-thread statistics. This would achieve the performance characteristics of
my initial PR while preserving all the functionality and backward
compatibility.
What do you think?
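A rough sketch of what such a switch could look like follows. The property
name "fs.statistics.per-thread.enabled" is made up for illustration and is not
an existing Hadoop key, java.util.Properties stands in for the Hadoop
Configuration class, and ToggleableStatistics is a hypothetical class:

```java
import java.util.Properties;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; not an existing Hadoop class.
final class ToggleableStatistics {
    // Assumed property name, invented for this example.
    static final String PER_THREAD_KEY = "fs.statistics.per-thread.enabled";

    private final boolean perThreadEnabled;
    private final ThreadLocal<long[]> threadBytesWritten =
        ThreadLocal.withInitial(() -> new long[1]);
    private final LongAdder totalBytesWritten = new LongAdder();

    ToggleableStatistics(boolean perThreadEnabled) {
        this.perThreadEnabled = perThreadEnabled;
    }

    // Per-thread statistics stay enabled unless the property says otherwise,
    // which keeps the default behavior backward compatible.
    static ToggleableStatistics fromProperties(Properties props) {
        String value = props.getProperty(PER_THREAD_KEY, "true");
        return new ToggleableStatistics(Boolean.parseBoolean(value));
    }

    void incrementBytesWritten(long n) {
        if (perThreadEnabled) {
            threadBytesWritten.get()[0] += n; // skipped when disabled
        }
        totalBytesWritten.add(n);
    }

    long getBytesWritten() {
        return totalBytesWritten.sum();
    }

    // Returns 0 when per-thread statistics are disabled.
    long getThreadBytesWritten() {
        return perThreadEnabled ? threadBytesWritten.get()[0] : 0L;
    }
}
```

With the property set to false, the write path reduces to a single
LongAdder.add call.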
Another idea is to extend the Thread class (HadoopThread?) and keep a
Statistics field in it instead of using ThreadLocal. This would allow much
faster per-thread statistics writes without the need to disable them with a
property, but it could be a more involved change that is harder to maintain.
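A minimal sketch of the extended-Thread idea, under the assumption that
threads not created as HadoopThread fall back to a ThreadLocal. HadoopThread
is only the name floated above; ThreadStats and ThreadFieldStatistics are
hypothetical helpers:

```java
// Hypothetical per-thread counters holder.
final class ThreadStats {
    long bytesRead;
}

// Hypothetical Thread subclass that carries its own statistics as a field.
class HadoopThread extends Thread {
    final ThreadStats stats = new ThreadStats();

    HadoopThread(Runnable target) {
        super(target);
    }
}

final class ThreadFieldStatistics {
    // Fallback for threads that are not HadoopThread instances (an assumption
    // of this sketch; such threads keep paying the ThreadLocal lookup cost).
    private static final ThreadLocal<ThreadStats> FALLBACK =
        ThreadLocal.withInitial(ThreadStats::new);

    static ThreadStats current() {
        Thread t = Thread.currentThread();
        if (t instanceof HadoopThread) {
            return ((HadoopThread) t).stats; // direct field read, no map lookup
        }
        return FALLBACK.get();
    }

    static void incrementBytesRead(long n) {
        current().bytesRead += n;
    }
}
```

The maintenance cost is visible even here: every place that creates threads
(thread pools, third-party libraries) would need to produce HadoopThread
instances to get the fast path.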
Also, Netty has implemented the FastThreadLocal and FastThreadLocalThread
classes (
https://netty.io/4.1/api/io/netty/util/concurrent/FastThreadLocal.html ) to
address the issue of slow ThreadLocal access, which we can consider too, but I
prefer the idea of a dedicated Statistics field in an extended Thread class,
because it will have better performance than even the FastThreadLocal
implementation.
> Slow FileSystem.Statistics counters implementation
> --------------------------------------------------
>
> Key: HADOOP-15124
> URL: https://issues.apache.org/jira/browse/HADOOP-15124
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: common
> Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
> Reporter: Igor Dvorzhak
> Assignee: Igor Dvorzhak
> Labels: common, filesystem, statistics
>
> While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google
> Dataproc, 2 workers, GCS connector) I saw that the Wall time of
> FileSystem.Statistics code paths is 5.58% and the CPU time is 26.5% of total
> execution time.
> After switching the FileSystem.Statistics implementation to LongAdder,
> consumed Wall time decreased to 0.006% and CPU time to 0.104% of total
> execution time.
> Total job runtime decreased from 66 mins to 61 mins.
> These results are not conclusive, because I didn't benchmark multiple times
> to average the results, but regardless of the performance gains, switching
> to LongAdder simplifies the code and reduces its complexity.