[jira] [Comment Edited] (HADOOP-15124) Slow FileSystem.Statistics counters implementation

Igor Dvorzhak (JIRA) Fri, 22 Dec 2017 11:49:22 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301879#comment-16301879
 ]


Igor Dvorzhak edited comment on HADOOP-15124 at 12/22/17 7:48 PM:
------------------------------------------------------------------

I did additional performance profiling with an AtomicLong backend instead of 
LongAdder and found that for GCS it carries only a very minor performance 
penalty, while for HDFS it actually improves performance (it looks like HDFS 
reads Statistics more frequently than GCS does, and LongAdder reads are slower 
than AtomicLong reads).

Taking the above into account, I have modified my PR to use AtomicLong, so we 
can now patch all Hadoop releases starting from 2.7.
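
For illustration only, here is a minimal sketch of the two counter backends being compared. The `Counter` interface, the class names and the `run` helper are hypothetical and are not taken from the PR or from `FileSystem.Statistics`; only `java.util.concurrent.atomic.AtomicLong` and `LongAdder` are real JDK classes. `LongAdder.sum()` adds up its internal cells on every read, while `AtomicLong.get()` is a single volatile read, which would be consistent with the read-heavy HDFS numbers. `LongAdder` is also only available since Java 8.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class CounterBackends {
    // Hypothetical interface for illustration; not the FileSystem.Statistics API.
    interface Counter {
        void add(long delta);
        long read();
    }

    // One shared value: cheap reads, but writers contend on a single memory location.
    static final class AtomicLongCounter implements Counter {
        private final AtomicLong value = new AtomicLong();
        public void add(long delta) { value.addAndGet(delta); }
        public long read() { return value.get(); }
    }

    // Striped cells: cheap contended writes, but each read sums all cells.
    static final class LongAdderCounter implements Counter {
        private final LongAdder value = new LongAdder();
        public void add(long delta) { value.add(delta); }
        public long read() { return value.sum(); }
    }

    // Increments the counter from several threads and returns the final value.
    static long run(Counter counter, int threads, int addsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < addsPerThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.read();
    }

    public static void main(String[] args) {
        System.out.println(run(new AtomicLongCounter(), 4, 100_000)); // prints 400000
        System.out.println(run(new LongAdderCounter(), 4, 100_000));  // prints 400000
    }
}
```

Both backends give the same totals; they differ only in how the cost is split between writes and reads, so which one wins depends on the read/write mix of the workload.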

Here are the results:
TeraGen 1TB:
|| ||GCS, LongAdder||GCS, AtomicLong||HDFS, LongAdder||HDFS, AtomicLong||
|CPU, ms|3.86|3.88|172.6|147.11|
|CPU, %|0.057|0.06|2.29|1.89|
|Wall, ms|3.79|4.31|242.83|185.1|
|Wall, %|0.002|0.003|0.146|0.111|
Legend:
* "LongAdder" - FS.Statistics with the LongAdder backend and with per-thread 
statistics disabled
* "AtomicLong" - FS.Statistics with the AtomicLong backend and with per-thread 
statistics disabled
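
To make the columns easier to compare, the relative change from LongAdder to AtomicLong can be derived from the absolute timings. The sketch below uses a hypothetical helper, `relativeChangePercent`, with the input values copied from the table above. It only restates those single-run measurements and adds no new data.

```java
import java.util.Locale;

class RelativeChange {
    // Percentage change going from `before` to `after`; negative means faster.
    static double relativeChangePercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        String[] labels = {"GCS CPU, ms", "GCS Wall, ms", "HDFS CPU, ms", "HDFS Wall, ms"};
        // {LongAdder, AtomicLong} pairs taken from the results table.
        double[][] pairs = {
            {3.86, 3.88},
            {3.79, 4.31},
            {172.6, 147.11},
            {242.83, 185.1},
        };
        for (int i = 0; i < labels.length; i++) {
            double change = relativeChangePercent(pairs[i][0], pairs[i][1]);
            System.out.println(String.format(Locale.ROOT, "%-14s %+.1f%%", labels[i], change));
        }
        // Values printed: +0.5%, +13.7%, -14.8%, -23.8%
    }
}
```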



> Slow FileSystem.Statistics counters implementation
> --------------------------------------------------
>
>                 Key: HADOOP-15124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15124
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common
>    Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
>            Reporter: Igor Dvorzhak
>            Assignee: Igor Dvorzhak
>              Labels: common, filesystem, statistics
>
> While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 2 
> workers, GCS connector), I saw that FileSystem.Statistics code paths account 
> for 5.58% of Wall time and 26.5% of CPU time of total execution time.
> After switching the FileSystem.Statistics implementation to LongAdder, consumed 
> Wall time decreased to 0.006% and CPU time to 0.104% of total execution time.
> Total job runtime decreased from 66 mins to 61 mins.
> These results are not conclusive because I didn't benchmark multiple times 
> to average the results, but regardless of the performance gains, switching to 
> LongAdder simplifies the code and reduces its complexity.



