[ 
https://issues.apache.org/jira/browse/TEZ-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597182#comment-14597182
 ] 

Rajesh Balamohan commented on TEZ-2565:
---------------------------------------

- clear()/setupIOStats() can be removed and tests can be modified.
- If we discard the old object, couldn't the following happen: the VM asks 
for statistics and, while that call is still executing, some failure causes 
the stats cache to be recreated as part of TaskRescheduledTransition? In such 
cases, wouldn't the VM see an inconsistent result? synchronized was added to 
eliminate this issue.  Wouldn't this be a corner case?
- This showed up as an expensive method while creating the patch for 
partition-level statistics (providing partition stats and merging them adds 
more work to mergeFrom(), which can be expensive when it is invoked many 
times).  Without partition stats, IOStatistics currently has only 2 setters, 
which would not show up as high CPU in the existing code.
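
To make the points above concrete, here is a rough sketch of the idea, not the 
code in the attached patches. IoStats and VertexStatsCache are hypothetical 
stand-ins (they are not Tez classes), and the two counters are assumed for 
illustration. Finished tasks are merged into a running total once, readers 
merge only the unfinished tasks on top of a copy, and a reschedule rebuilds 
the total under the same lock so a concurrent reader never sees a 
half-rebuilt cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for per-task I/O statistics; not the Tez class. */
final class IoStats {
    long dataSize;
    long itemsProcessed;

    IoStats(long dataSize, long itemsProcessed) {
        this.dataSize = dataSize;
        this.itemsProcessed = itemsProcessed;
    }

    /** Adds the other object's counters to this one. */
    void mergeFrom(IoStats other) {
        dataSize += other.dataSize;
        itemsProcessed += other.itemsProcessed;
    }

    IoStats copy() {
        return new IoStats(dataSize, itemsProcessed);
    }
}

/** Hypothetical cache of merged stats for finished tasks of one vertex. */
final class VertexStatsCache {
    private final Map<Integer, IoStats> finished = new HashMap<>();
    private IoStats finishedTotal = new IoStats(0, 0);

    /** Merges a finished task into the running total exactly once. */
    synchronized void onTaskFinished(int taskId, IoStats stats) {
        IoStats previous = finished.put(taskId, stats.copy());
        if (previous == null) {
            finishedTotal.mergeFrom(stats);
        } else {
            rebuildTotal(); // same task reported twice: recompute to avoid double counting
        }
    }

    /** Drops a rescheduled task; runs under the same lock as readers. */
    synchronized void onTaskRescheduled(int taskId) {
        if (finished.remove(taskId) != null) {
            rebuildTotal();
        }
    }

    /** Returns cached finished stats plus the stats of unfinished tasks only. */
    synchronized IoStats getStatistics(Iterable<IoStats> unfinished) {
        IoStats result = finishedTotal.copy();
        for (IoStats s : unfinished) {
            result.mergeFrom(s);
        }
        return result;
    }

    private void rebuildTotal() {
        IoStats total = new IoStats(0, 0);
        for (IoStats s : finished.values()) {
            total.mergeFrom(s);
        }
        finishedTotal = total;
    }
}
```

With this shape, the cost of each getStatistics() call grows with the number 
of unfinished tasks rather than with all tasks; the full rescan happens only 
on a reschedule, which is expected to be rare.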

> Consider scanning unfinished tasks in VertexImpl::constructStatistics to 
> reduce merge overhead
> ----------------------------------------------------------------------------------------------
>
>                 Key: TEZ-2565
>                 URL: https://issues.apache.org/jira/browse/TEZ-2565
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2565.1.patch, TEZ-2565.2.patch
>
>
> constructStatistics() can be an overhead (scanning all tasks and merging 
> stats) depending on the number of invocations to Vertex::getStatistics().  
> Consider scanning only unfinished tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
