[ 
https://issues.apache.org/jira/browse/TEZ-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899932#comment-15899932
 ] 

Jonathan Eagles commented on TEZ-3650:
--------------------------------------

[~rajesh.balamohan], [~sseth], this small performance jira removes a good
amount of overhead from printing the download rate on each fetch complete and
saves about 50-300 millis per 25000 fetch completes. Considering that some jobs
perform 10,000,000+ fetches, this can be a significant savings per reducer task.
In addition, the FastNumberFormat utility is foundational for DAG scaling that
was part of the test of TEZ-1526. There was one other significant change worth
explaining. InputAttemptIdentifier#toString keeps showing up in the profiles.
It was used only for error-related diagnostic printing, so I changed the
ShuffleUtils API to delay the toString call until absolutely necessary.
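
To illustrate the delayed toString idea, here is a minimal sketch. It is not the
actual ShuffleUtils code: AttemptId is a hypothetical stand-in for
InputAttemptIdentifier, and checkEager/checkLazy are made-up method names. The
point is only that passing the object, rather than a pre-built String, keeps
toString off the success path.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyToStringSketch {

    // Hypothetical stand-in for InputAttemptIdentifier; counts toString calls.
    static final class AttemptId {
        static final AtomicInteger TO_STRING_CALLS = new AtomicInteger();
        final int inputIndex;
        final int attemptNumber;

        AttemptId(int inputIndex, int attemptNumber) {
            this.inputIndex = inputIndex;
            this.attemptNumber = attemptNumber;
        }

        @Override
        public String toString() {
            TO_STRING_CALLS.incrementAndGet();
            return "AttemptId[input=" + inputIndex + ", attempt=" + attemptNumber + "]";
        }
    }

    // Eager style: the caller has to build the String before every call,
    // even though it is only used when the check fails.
    static void checkEager(long expected, long actual, String idString) {
        if (expected != actual) {
            throw new IllegalStateException("Size mismatch for " + idString);
        }
    }

    // Lazy style: the identifier is passed as an object and is converted
    // to a String only on the error path.
    static void checkLazy(long expected, long actual, AttemptId id) {
        if (expected != actual) {
            throw new IllegalStateException("Size mismatch for " + id);
        }
    }

    public static void main(String[] args) {
        AttemptId id = new AttemptId(7, 0);
        for (int i = 0; i < 25000; i++) {
            checkLazy(100L, 100L, id); // success path, no toString
        }
        // prints "toString calls on success path: 0"
        System.out.println("toString calls on success path: "
                + AttemptId.TO_STRING_CALLS.get());
    }
}
```

With the eager variant the same loop would call toString 25000 times just to
build an argument that is never used.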

> Improve performance of FetchStatsLogger#logIndividualFetchComplete
> ------------------------------------------------------------------
>
>                 Key: TEZ-3650
>                 URL: https://issues.apache.org/jira/browse/TEZ-3650
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3650.1.patch, TEZ-3650.2.patch
>
>
> The cost of logging the fetch completed statement is dominated by two main
> factors: 1) formatting the download rate and 2) minor String concatenation
> that isn't getting optimized.
> In this jira I propose a new Formatter that is optimized by implementing the
> StringBuilder#append(long) algorithm, but allows for formatting and reuse of
> the StringBuilder.
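
The formatter idea in the description above can be sketched roughly as follows.
This is not the FastNumberFormat class from the patch; the class name, the
setMinimumIntegerDigits option, and the format(long, StringBuilder) signature
are assumptions made for illustration. It writes digits straight into a
caller-supplied StringBuilder, so no intermediate String is allocated and the
same builder can be reused across log lines.

```java
public class ReusableNumberFormatSketch {

    // Hypothetical formatting option: left-pad with zeros to this many digits.
    private int minimumIntegerDigits = 1;

    public void setMinimumIntegerDigits(int digits) {
        this.minimumIntegerDigits = Math.max(1, digits);
    }

    // Appends the decimal form of value to sb and returns sb for chaining.
    public StringBuilder format(long value, StringBuilder sb) {
        if (value < 0) {
            sb.append('-');
        }
        // Work with a non-positive number so Long.MIN_VALUE does not overflow.
        long n = value < 0 ? value : -value;

        // Count the digits of the magnitude.
        int digits = 1;
        for (long t = n; t <= -10; t /= 10) {
            digits++;
        }

        // Zero padding up to the requested minimum width.
        for (int i = digits; i < minimumIntegerDigits; i++) {
            sb.append('0');
        }

        // Reserve space, then fill the digits from least to most significant.
        int start = sb.length();
        sb.setLength(start + digits);
        for (int pos = start + digits - 1; pos >= start; pos--) {
            sb.setCharAt(pos, (char) ('0' - (n % 10)));
            n /= 10;
        }
        return sb;
    }

    public static void main(String[] args) {
        ReusableNumberFormatSketch fmt = new ReusableNumberFormatSketch();
        fmt.setMinimumIntegerDigits(3);
        StringBuilder sb = new StringBuilder();

        sb.append("rate=");
        fmt.format(7, sb);
        System.out.println(sb); // prints "rate=007"

        sb.setLength(0); // reuse the same builder for the next message
        sb.append("rate=");
        fmt.format(1234, sb);
        System.out.println(sb); // prints "rate=1234"
    }
}
```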



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
