[
https://issues.apache.org/jira/browse/TEZ-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899932#comment-15899932
]
Jonathan Eagles commented on TEZ-3650:
--------------------------------------
[~rajesh.balamohan], [~sseth], this small performance jira removes a good
amount of overhead from printing the download rate on each fetch complete and
saves about 50-300 millis per 25,000 fetch completes. Considering that some
jobs perform 10,000,000+ fetches, this can be a significant saving per reducer
task. In addition, the FastNumberFormat utility is foundational for the DAG
scaling that was part of the test of TEZ-1526. There was one other significant
change worth explaining. InputAttemptIdentifier#toString keeps showing up in
the profiles. It was used only for error-related diagnostic printing, so I
changed the ShuffleUtils API to delay the toString call until absolutely
necessary.
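
To illustrate the toString change, here is a minimal sketch, not the code
from the patch. AttemptId, checkEager and checkLazy are hypothetical names
standing in for InputAttemptIdentifier and the ShuffleUtils call sites. The
idea is to pass the object itself and build its string form only on the
error path.

```java
public class LazyToStringSketch {
    /** Hypothetical stand-in for InputAttemptIdentifier; counts toString calls. */
    static final class AttemptId {
        static int toStringCalls = 0;
        final int input;
        final int attempt;

        AttemptId(int input, int attempt) {
            this.input = input;
            this.attempt = attempt;
        }

        @Override
        public String toString() {
            toStringCalls++;
            return "AttemptId[input=" + input + ", attempt=" + attempt + "]";
        }
    }

    /** Eager variant: the caller has to build idText before every call. */
    static String checkEager(long expected, long actual, String idText) {
        if (expected != actual) {
            return "Size mismatch for " + idText
                + ": expected " + expected + ", got " + actual;
        }
        return null;
    }

    /** Lazy variant: takes the object and stringifies it only on failure. */
    static String checkLazy(long expected, long actual, Object id) {
        if (expected != actual) {
            return "Size mismatch for " + id
                + ": expected " + expected + ", got " + actual;
        }
        return null;
    }

    public static void main(String[] args) {
        AttemptId id = new AttemptId(7, 0);
        // Successful fetches never reach the error path, so toString is not called.
        for (int i = 0; i < 25000; i++) {
            checkLazy(100, 100, id);
        }
        System.out.println("toString calls after successful fetches: "
            + AttemptId.toStringCalls);
        // A failing fetch pays the toString cost exactly once.
        System.out.println(checkLazy(100, 90, id));
    }
}
```

With checkEager the caller would have paid for id.toString() on every one of
the successful fetches as well, which is the cost that showed up in the
profiles.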
> Improve performance of FetchStatsLogger#logIndividualFetchComplete
> ------------------------------------------------------------------
>
> Key: TEZ-3650
> URL: https://issues.apache.org/jira/browse/TEZ-3650
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-3650.1.patch, TEZ-3650.2.patch
>
>
> The cost of logging the fetch completed statement is dominated by two main
> factors: 1) formatting the download rate and 2) minor String concatenation
> that isn't getting optimized.
> In this jira I propose a new Formatter that is optimized by implementing the
> StringBuilder#append(long) algorithm, but allows for formatting and reuse of
> StringBuilder.
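
The formatter proposed in the description above could be sketched roughly as
follows. This is an illustration under assumptions, not the FastNumberFormat
from the patch: the class name PaddedLongFormatSketch, the minDigits option
and the rate helper are all hypothetical. It writes the digits of a long
directly into a caller-supplied StringBuilder, zero-padding to a minimum
width, so no intermediate String is created.

```java
public class PaddedLongFormatSketch {
    final int minDigits;

    PaddedLongFormatSketch(int minDigits) {
        this.minDigits = minDigits;
    }

    /** Appends value to sb, zero-padded to minDigits, and returns sb. */
    StringBuilder format(long value, StringBuilder sb) {
        if (value < 0) {
            sb.append('-');
        }
        // Count the decimal digits of the value (at least one).
        int digits = 1;
        for (long v = value / 10; v != 0; v /= 10) {
            digits++;
        }
        for (int i = digits; i < minDigits; i++) {
            sb.append('0');
        }
        // Reserve room, then fill digits from least to most significant.
        int start = sb.length();
        sb.setLength(start + digits);
        long v = value;
        for (int pos = start + digits - 1; pos >= start; pos--) {
            // Taking abs of the remainder (not of the value) is safe for Long.MIN_VALUE.
            int d = Math.abs((int) (v % 10));
            sb.setCharAt(pos, (char) ('0' + d));
            v /= 10;
        }
        return sb;
    }

    /** Hypothetical helper: renders a non-negative rate given in thousandths. */
    static String rate(long thousandths, StringBuilder sb, PaddedLongFormatSketch frac) {
        sb.setLength(0); // reuse the same builder for every log line
        sb.append(thousandths / 1000).append('.');
        frac.format(thousandths % 1000, sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        PaddedLongFormatSketch frac = new PaddedLongFormatSketch(3);
        System.out.println(rate(12034, sb, frac));
        System.out.println(rate(500, sb, frac));
    }
}
```

Resetting the builder with setLength(0) keeps its backing array, so repeated
log lines avoid reallocating the buffer.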