[
https://issues.apache.org/jira/browse/TEZ-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561413#comment-14561413
]
Jonathan Eagles edited comment on TEZ-2491 at 5/27/15 7:23 PM:
---------------------------------------------------------------
[~rohini], this analysis is consistent with the analysis done as part of
TEZ-2485.
[~sseth], agreed that we could push a dictionary of counter names earlier in
the hierarchy (perhaps at the vertex level, and/or perhaps Tez itself should
publish a master dictionary of its built-in counters) and omit zero-valued
counters. We could also consider compressing them, but that would remove the
human-readable aspect that is still maintained today. Finally, we could log a
counter only if it differs from its parent's value. This would prevent the
Task/Task Attempt duplication in the case where there is only a single task
attempt.
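
To make the proposal concrete, here is a minimal sketch of the three ideas
together: a shared name dictionary, omitting zero counters, and logging only
values that differ from the parent. It is illustrative only. The function names
(build_dictionary, encode_counters, decode_counters) and the convention that an
absent counter means "same as parent, or 0 if there is no parent value" are
assumptions, not existing Tez or ATS APIs.

```python
# Hypothetical sketch; none of these names are Tez or ATS APIs.

def build_dictionary(counter_names):
    """Map each counter name to a short integer id.

    The mapping would be published once, higher in the hierarchy
    (for example at the vertex level), instead of in every event.
    """
    return {name: idx for idx, name in enumerate(sorted(counter_names))}


def encode_counters(counters, dictionary, parent=None):
    """Return {id: value}, keeping only values that differ from the parent.

    Assumed convention: a missing parent value counts as 0, so with no
    parent this simply omits zero counters.
    """
    parent = parent or {}
    encoded = {}
    for name, value in counters.items():
        if value == parent.get(name, 0):
            continue  # zero, or identical to the parent: nothing to log
        encoded[dictionary[name]] = value
    return encoded


def decode_counters(encoded, dictionary, parent=None):
    """Rebuild named counters by overlaying logged values on the parent's."""
    names = {idx: name for name, idx in dictionary.items()}
    decoded = dict(parent or {})
    for idx, value in encoded.items():
        decoded[names[idx]] = value
    return decoded


dictionary = build_dictionary(
    ["FILE_BYTES_READ", "FILE_BYTES_WRITTEN", "SPILLED_RECORDS"]
)
task = {"FILE_BYTES_READ": 100, "FILE_BYTES_WRITTEN": 0, "SPILLED_RECORDS": 7}
attempt = dict(task)  # single task attempt: same values as the task

task_event = encode_counters(task, dictionary)
attempt_event = encode_counters(attempt, dictionary, parent=task)
```

With a single task attempt, the attempt event carries no counters at all, and
the result stays human readable as long as the dictionary is available
alongside it.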
> Optimize storage and exchange of Counters for better scaling
> ------------------------------------------------------------
>
> Key: TEZ-2491
> URL: https://issues.apache.org/jira/browse/TEZ-2491
> Project: Apache Tez
> Issue Type: Task
> Reporter: Rohini Palaniswamy
>
> Counters take up a lot of space in the task events generated and are a
> major bottleneck for scaling ATS. [~jlowe] found a lot of potential
> optimizations. Using this as an umbrella jira for documentation. Can create
> sub-tasks later after discussions.