[ 
https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299613#comment-16299613
 ] 

Igor Dvorzhak commented on HADOOP-15124:
----------------------------------------

Thank you for the feedback.

I would like to migrate FileSystem.Statistics to the new StorageStatistics 
backend. I will familiarize myself with the StorageStatistics code and see 
where it is best to start.

Meanwhile, I have reverted the changes to the public interface in my PR; it 
now uses both ThreadLocal and LongAdder.

After this change there is no improvement in statistics write performance, and 
there is even a small penalty, but it should be negligible because 
ThreadLocal.get is much more expensive than LongAdder.add. Still, this change 
removes all the complicated, synchronized logic for statistics reads, which 
decreases the Wall time of the Statistics code from 6.49% to 1.06% in a 1TB 
TeraGen job (CPU time increased to 29.4%, but total runtime still decreased 
from 66 to 62 minutes).
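
To illustrate the ThreadLocal plus LongAdder combination described above, here 
is a minimal sketch. It is not the code from the PR: the class DualCounter and 
its method names are hypothetical, and it tracks a single counter only. It 
shows why reads no longer need synchronization: the aggregate is read with 
LongAdder.sum(), and each thread only ever touches its own per-thread cell.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical class for illustration; not Hadoop's FileSystem.Statistics.
final class DualCounter {
    // Aggregate across all threads; sum() needs no locking.
    private final LongAdder total = new LongAdder();
    // Per-thread cell; only the owning thread reads or writes it.
    private final ThreadLocal<long[]> perThread =
        ThreadLocal.withInitial(() -> new long[1]);

    void add(long n) {
        perThread.get()[0] += n; // ThreadLocal lookup on every write
        total.add(n);            // extra LongAdder.add on top of it
    }

    // Value accumulated by the calling thread only.
    long threadValue() {
        return perThread.get()[0];
    }

    // Value accumulated by all threads.
    long totalValue() {
        return total.sum();
    }
}
```

Every write still pays for ThreadLocal.get, which is why this variant does not 
speed up writes; only the read path gets simpler.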

I think it could make sense to submit this PR before the migration to 
StorageStatistics, because it could be applied to the 3.0 and 3.1 branches and 
provides some performance benefit.

Additionally, I think that while per-thread statistics are useful, they are 
not used in regular production job runs (I assume they are more valuable for 
performance tuning and bottleneck debugging). We could therefore improve 
statistics write performance by introducing a property that disables 
per-thread statistics. This would achieve the performance characteristics of 
my initial PR while preserving all functionality and backward compatibility. 
What do you think?
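
A rough sketch of how such a switch could look, under stated assumptions: the 
class SwitchableCounter is hypothetical, and the boolean passed to its 
constructor stands in for a configuration property that does not exist yet. 
With the flag off, a write is a single LongAdder.add and the ThreadLocal 
lookup is skipped entirely.

```java
import java.util.OptionalLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical class for illustration; the flag would come from a
// configuration property that is not defined anywhere yet.
final class SwitchableCounter {
    private final boolean perThreadEnabled;
    private final LongAdder total = new LongAdder();
    private final ThreadLocal<long[]> perThread =
        ThreadLocal.withInitial(() -> new long[1]);

    SwitchableCounter(boolean perThreadEnabled) {
        this.perThreadEnabled = perThreadEnabled;
    }

    void add(long n) {
        total.add(n);
        if (perThreadEnabled) {
            // Only pay for the ThreadLocal lookup when it was asked for.
            perThread.get()[0] += n;
        }
    }

    long totalValue() {
        return total.sum();
    }

    // Empty when per-thread tracking is switched off.
    OptionalLong threadValue() {
        return perThreadEnabled
            ? OptionalLong.of(perThread.get()[0])
            : OptionalLong.empty();
    }
}
```

Returning an empty OptionalLong makes the disabled state explicit to callers 
instead of reporting a misleading zero.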

Another idea is to extend the Thread class (HadoopThread?) and keep a 
Statistics field in it instead of using ThreadLocal. This would allow much 
faster per-thread statistics writes without needing to disable them via a 
property, but it could be a more involved change that is harder to maintain.
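
A minimal sketch of the extended Thread idea, again only an illustration: the 
HadoopThread name comes from the suggestion above, while the bytesWritten 
field, the FieldBackedCounter class, and the ThreadLocal fallback for ordinary 
threads are assumptions made for this example.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical Thread subclass that carries its own statistics field.
class HadoopThread extends Thread {
    long bytesWritten; // written only by this thread while it runs

    HadoopThread(Runnable r) {
        super(r);
    }
}

// Hypothetical counter that prefers the field over a ThreadLocal.
final class FieldBackedCounter {
    private final LongAdder total = new LongAdder();
    // Fallback for threads that are not HadoopThread instances.
    private final ThreadLocal<long[]> fallback =
        ThreadLocal.withInitial(() -> new long[1]);

    void add(long n) {
        total.add(n);
        Thread t = Thread.currentThread();
        if (t instanceof HadoopThread) {
            ((HadoopThread) t).bytesWritten += n; // plain field write
        } else {
            fallback.get()[0] += n;               // ThreadLocal lookup
        }
    }

    // Value accumulated by the calling thread only.
    long threadValue() {
        Thread t = Thread.currentThread();
        return (t instanceof HadoopThread)
            ? ((HadoopThread) t).bytesWritten
            : fallback.get()[0];
    }

    long totalValue() {
        return total.sum();
    }
}
```

The fallback branch hints at the maintenance cost: the fast path only applies 
to threads that were created as HadoopThread, so thread pools and third-party 
code that create plain threads would still go through ThreadLocal.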

Also, Netty has implemented the FastThreadLocal and FastThreadLocalThread 
classes ( 
https://netty.io/4.1/api/io/netty/util/concurrent/FastThreadLocal.html ) to 
address the issue of slow ThreadLocal access, which we could consider too. 
However, I prefer the idea of a dedicated Statistics field in an extended 
Thread class, because it should perform better than even the FastThreadLocal 
implementation.

> Slow FileSystem.Statistics counters implementation
> --------------------------------------------------
>
>                 Key: HADOOP-15124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15124
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common
>    Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
>            Reporter: Igor Dvorzhak
>            Assignee: Igor Dvorzhak
>              Labels: common, filesystem, statistics
>
> While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 
> 2 workers, GCS connector) I saw that the FileSystem.Statistics code paths 
> take 5.58% of Wall time and 26.5% of CPU time of total execution time.
> After switching the FileSystem.Statistics implementation to LongAdder, 
> consumed Wall time decreased to 0.006% and CPU time to 0.104% of total 
> execution time.
> Total job runtime decreased from 66 mins to 61 mins.
> These results are not conclusive, because I didn't benchmark multiple times 
> to average the results, but regardless of the performance gains, switching 
> to LongAdder simplifies the code and reduces its complexity.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
