[
https://issues.apache.org/jira/browse/TEZ-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906742#comment-15906742
]
Rajesh Balamohan commented on TEZ-3650:
---------------------------------------
LGTM. +1. Thanks [~jeagles].
Though InputAttemptIdentifier#toString would internally be using StringBuilder
(the compiled byte code represents the concatenation as StringBuilder calls),
it could still show up in the profiler due to concatenation/toString. I agree
that this InputAttemptIdentifier#toString call could have been delayed, as is
done in the current patch.
For reporting fetch rate (MB/s), we may need to consider nanoTime instead of
millis. That can be a separate JIRA. Pasting an example log here.
{noformat}
2017-03-12 19:24:52,917 [INFO] [Fetcher_B {Reducer_16} #0]
|ShuffleManager.fetch|: Completed fetch for attempt: {0, 0,
attempt_1488231257387_2078_1_10_000000_0_10009} to MEMORY, csize=10884,
dsize=10867, EndTime=1489361092917, TimeTaken=1, Rate=10.37 MB/s
2017-03-12 19:25:09,833 [INFO] [Fetcher_B {Reducer_16} #0]
|ShuffleManager.fetch|: Completed fetch for attempt: {0, 0,
attempt_1488231257387_2078_1_10_000000_0_10009} to MEMORY, csize=10884,
dsize=10867, EndTime=1489361109833, TimeTaken=0, Rate=0.00 MB/s
{noformat}
Though the second fetch completed a lot faster than the first (both have
csize=10884), it is reported as 0.00 MB/s because the elapsed time is measured
in milliseconds and comes out as TimeTaken=0. Had it been nanoTime, we would
get better accuracy.
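To illustrate the point about resolution, here is a minimal sketch comparing a rate computed from a millisecond duration with one computed from a nanosecond duration. The class and method names (FetchRateSketch, rateFromMillis, rateFromNanos), the 0.4 ms duration, and the exact rate formula are assumptions made for illustration; this is not the Tez code.

```java
import java.util.concurrent.TimeUnit;

public class FetchRateSketch {
    static final double BYTES_PER_MB = 1024.0 * 1024.0;

    // Rate in MB/s from a millisecond duration. A duration of 0 ms cannot be
    // divided by, so it is reported as 0.0 (assumed to mirror the log above).
    static double rateFromMillis(long bytes, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (bytes / BYTES_PER_MB) / (elapsedMillis / 1000.0);
    }

    // Same calculation with a nanosecond duration, e.g. the difference of two
    // System.nanoTime() readings.
    static double rateFromNanos(long bytes, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return (bytes / BYTES_PER_MB) / (elapsedNanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        long csize = 10884;              // bytes, taken from the log above
        long elapsedNanos = 400_000L;    // hypothetical 0.4 ms fetch
        // Converting to milliseconds truncates the sub-millisecond duration to 0.
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(elapsedNanos);

        System.out.printf("millis: TimeTaken=%d, Rate=%.2f MB/s%n",
            elapsedMillis, rateFromMillis(csize, elapsedMillis));
        System.out.printf("nanos:  TimeTaken=%d, Rate=%.2f MB/s%n",
            elapsedNanos, rateFromNanos(csize, elapsedNanos));
    }
}
```

With the millisecond clock any fetch shorter than 1 ms collapses to the same 0 duration, while the nanosecond variant still yields a usable rate.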
> Improve performance of FetchStatsLogger#logIndividualFetchComplete
> ------------------------------------------------------------------
>
> Key: TEZ-3650
> URL: https://issues.apache.org/jira/browse/TEZ-3650
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Attachments: TEZ-3650.1.patch, TEZ-3650.2.patch
>
>
> The cost of logging the fetch completed statement is dominated by two main
> factors 1) Formatting the download rate and 2) Minor String concatenation
> that isn't getting optimized.
> In this jira I propose a new Formatter that is optimized by implementing the
> StringBuilder#append(long) algorithm, but allows for formatting and reuse of
> StringBuilder.
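As a rough illustration of the idea in the description above, the following sketch appends a rate with two decimal places to a reused StringBuilder without going through String.format. It is not the Formatter from the attached patches: the names (RateAppendSketch, appendTwoDecimals) are hypothetical, and it simply delegates to StringBuilder#append(long) instead of reimplementing the digit-writing algorithm.

```java
public class RateAppendSketch {

    // Appends value rounded to two decimal places (e.g. 10.37) to sb.
    // Works on a long scaled to hundredths, so no intermediate String or
    // java.util.Formatter is created.
    static StringBuilder appendTwoDecimals(StringBuilder sb, double value) {
        long scaled = Math.round(value * 100.0); // value in hundredths
        if (scaled < 0) {
            sb.append('-');
            scaled = -scaled;
        }
        sb.append(scaled / 100).append('.');
        long frac = scaled % 100;
        if (frac < 10) {
            sb.append('0'); // keep two digits after the decimal point
        }
        return sb.append(frac);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(64);
        for (double rate : new double[] {10.37, 0.0}) {
            sb.setLength(0); // reuse the same buffer for every log line
            sb.append("Rate=");
            appendTwoDecimals(sb, rate).append(" MB/s");
            System.out.println(sb);
        }
    }
}
```

Resetting the builder with setLength(0) keeps its backing array, so repeated log lines do not allocate a new buffer each time.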