[
https://issues.apache.org/jira/browse/TEZ-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597182#comment-14597182
]
Rajesh Balamohan commented on TEZ-2565:
---------------------------------------
- clear()/setupIOStats() can be removed and the tests modified accordingly.
- If we discard the old object, couldn't we hit a case where the VM asks for
statistics and, while that call is still executing, some failure causes the
stats cache to be recreated as part of TaskRescheduledTransition? In such
cases, wouldn't the VM see an inconsistent result? synchronized was added to
eliminate this issue. Or would this only be a corner case?
- This came up as an expensive method when creating the patch for providing
partition-level statistics (providing partition stats and merging them adds
additional information to mergeFrom(), which can be expensive when it is
invoked too many times). Without partition stats, IOStatistics currently has
only 2 setters, which wouldn't show up as high CPU in the existing code.
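
To make the last two points concrete, here is a rough sketch of the idea, not
the actual patch. All names (VertexStatsSketch, IOStats, report, taskFinished,
taskRescheduled) are hypothetical stand-ins, not Tez's real classes or
signatures. It assumes stats of finished tasks are merged once into a cache,
getStatistics() merges only the unfinished tasks on top of that cache, and
every method is synchronized so a reader never observes a half-rebuilt cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Tez code base.
final class VertexStatsSketch {

    // Stand-in for IOStatistics: two counters plus a merge operation.
    static final class IOStats {
        long dataSize;
        long itemsProcessed;

        IOStats() { }

        IOStats(long dataSize, long itemsProcessed) {
            this.dataSize = dataSize;
            this.itemsProcessed = itemsProcessed;
        }

        void mergeFrom(IOStats other) {
            dataSize += other.dataSize;
            itemsProcessed += other.itemsProcessed;
        }
    }

    private final Map<Integer, IOStats> unfinished = new HashMap<>();
    private final Map<Integer, IOStats> finished = new HashMap<>();
    private IOStats finishedCache = new IOStats();

    // Latest stats reported by a task that is still running.
    synchronized void report(int taskId, IOStats stats) {
        unfinished.put(taskId, stats);
    }

    // Merge a finished task into the cache exactly once.
    synchronized void taskFinished(int taskId) {
        IOStats stats = unfinished.remove(taskId);
        if (stats != null) {
            finished.put(taskId, stats);
            finishedCache.mergeFrom(stats);
        }
    }

    // A rescheduled task invalidates the cache, so rebuild it from the
    // remaining finished tasks while holding the same lock as readers.
    synchronized void taskRescheduled(int taskId) {
        finished.remove(taskId);
        IOStats rebuilt = new IOStats();
        for (IOStats stats : finished.values()) {
            rebuilt.mergeFrom(stats);
        }
        finishedCache = rebuilt;
    }

    // Only unfinished tasks are scanned; finished ones come from the cache.
    synchronized IOStats getStatistics() {
        IOStats result = new IOStats();
        result.mergeFrom(finishedCache);
        for (IOStats stats : unfinished.values()) {
            result.mergeFrom(stats);
        }
        return result;
    }

    public static void main(String[] args) {
        VertexStatsSketch vertex = new VertexStatsSketch();
        vertex.report(1, new IOStats(100, 10));
        vertex.report(2, new IOStats(50, 5));
        vertex.taskFinished(1);
        IOStats total = vertex.getStatistics();
        System.out.println(total.dataSize + " " + total.itemsProcessed);
    }
}
```

Because taskRescheduled() and getStatistics() share one lock, a caller gets
the cache either before or after the rebuild, never a mix. The cost per call
then depends on the number of unfinished tasks rather than on all tasks.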
> Consider scanning unfinished tasks in VertexImpl::constructStatistics to
> reduce merge overhead
> ----------------------------------------------------------------------------------------------
>
> Key: TEZ-2565
> URL: https://issues.apache.org/jira/browse/TEZ-2565
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2565.1.patch, TEZ-2565.2.patch
>
>
> constructStatistics() can be an overhead (scanning all tasks and merging
> stats) depending on the number of invocations to Vertex::getStatistics().
> Consider scanning only unfinished tasks.