[ https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299879#comment-16299879 ]

Steve Loughran commented on HADOOP-15124:
-----------------------------------------

Making per-thread stats optional seems like a good idea. For multitenant 
programs, per-thread numbers are valuable: you can say "this query read X 
bytes, wrote Y bytes and experienced Z throttle events from your storage 
infra". For single-tenant programs, you just want to keep an eye on the 
aggregate values. But this is about more than troubleshooting: properly 
aggregated, these numbers let you assess the work performed across a cluster.

We don't want special Hadoop threads for this, as then you couldn't do the 
stat collection in downstream apps (Spark, Flink) which have their own thread 
pools.
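
For illustration, here is a minimal sketch of what optional per-thread 
collection might look like. The class and field names (PerThreadStats, 
bytesRead) are hypothetical, not Hadoop's actual FileSystem.Statistics API:

    import java.util.concurrent.atomic.LongAdder;

    // Sketch only: the shared LongAdder carries the aggregate, an optional
    // ThreadLocal carries the per-thread view for multitenant accounting.
    public class PerThreadStats {
        // Aggregate counter; cheap to increment even under contention.
        private final LongAdder totalBytesRead = new LongAdder();

        // Optional per-thread view; can be disabled for single-tenant use.
        private final boolean perThreadEnabled;
        private final ThreadLocal<long[]> threadBytesRead =
            ThreadLocal.withInitial(() -> new long[1]);

        public PerThreadStats(boolean perThreadEnabled) {
            this.perThreadEnabled = perThreadEnabled;
        }

        public void incrementBytesRead(long n) {
            totalBytesRead.add(n);             // always maintain the aggregate
            if (perThreadEnabled) {
                threadBytesRead.get()[0] += n; // per-thread number for this query
            }
        }

        public long getTotalBytesRead()  { return totalBytesRead.sum(); }
        public long getThreadBytesRead() { return threadBytesRead.get()[0]; }
    }

Because ThreadLocal attaches state to whichever thread happens to call in, 
this style works inside downstream apps' own thread pools; no special Hadoop 
threads are needed.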


> Slow FileSystem.Statistics counters implementation
> --------------------------------------------------
>
>                 Key: HADOOP-15124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15124
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common
>    Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
>            Reporter: Igor Dvorzhak
>            Assignee: Igor Dvorzhak
>              Labels: common, filesystem, statistics
>
> While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 
> 2 workers, GCS connector) I saw that FileSystem.Statistics code paths 
> accounted for 5.58% of wall time and 26.5% of CPU time of the total execution 
> time.
> After switching the FileSystem.Statistics implementation to LongAdder, 
> consumed wall time decreased to 0.006% and CPU time to 0.104% of total 
> execution time.
> Total job runtime decreased from 66 mins to 61 mins.
> These results are not conclusive, because I didn't benchmark multiple times 
> and average the results, but regardless of the performance gains, switching 
> to LongAdder simplifies the code.
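
As a rough illustration of the swap the description refers to (the names here 
are made up, and the pre-patch Statistics code is more involved than a bare 
AtomicLong): LongAdder stripes increments across per-thread cells, so writers 
rarely contend, and sum() folds the cells together at read time.

    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.atomic.LongAdder;

    // Illustrative before/after of the counter implementation.
    public class BytesReadCounter {
        // Before: every writer CASes the same word, so hot I/O paths contend.
        private final AtomicLong bytesReadAtomic = new AtomicLong();

        // After: increments are striped across cells; writers rarely contend.
        private final LongAdder bytesReadAdder = new LongAdder();

        public void onBytesRead(long n) {
            bytesReadAtomic.addAndGet(n); // expensive under many threads
            bytesReadAdder.add(n);        // cheap even under contention
        }

        public long snapshot() {
            // sum() is not atomic with respect to concurrent add() calls,
            // which is an acceptable trade-off for statistics counters.
            return bytesReadAdder.sum();
        }
    }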


