[
https://issues.apache.org/jira/browse/TEZ-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595254#comment-14595254
]
Rajesh Balamohan commented on TEZ-2565:
---------------------------------------
>> Nit: "final Set<TezTaskID> taskSet;" - this could be a bitset
- Fixed
>> "ioStats.putIfAbsent(name, new IOStatisticsImpl());" - Not a concurrentMap.
>> A put should be sufficient ? There's other issues if we expect concurrent
>> invocation of this method.
- Fixed. put() is sufficient
>> if (currentStatistics.containsTask(t.getTaskId())) { break;}. Should be a
>> continue so that it moves forward and aggregates statistics for newly
>> completed tasks.
- Not an issue as it would break out of the switch case. But changed this logic
in the recent patch.
>> If we're only going to aggregate statistics when a task completes, this
>> should be called out on the API. In theory, it's possible to report stats as
>> a task makes progress. It would make sense to do this if progress was also
>> available. Since we're not doing that at the moment - calling it out on the
>> API javadocs will be useful.
- In the current patch, it would address both running & completed stats.
Accumulating completed task stats in completedTasksStatsCache. This will ensure
that the completed task stats are not recomputed often. For running task stats,
method specific stats is created which gets merged with completed stats (if
present). This ensures that higher level users would be able to access both
running & completed task stats without incurring much overhead. In case a task
gets rescheduled from SUCCEEDED --> SCHEDULED (in case of failures), completed
stats is cleared to provide consistent view.
>> The change isn't thread safe if getStatistics can be called concurrently -
>> since it's modifying stats under a read lock. I don't think that's possible
>> right now since it's invoked from the VertexManager or when the Vertex
>> completes under a write lock. This would need it's own lock while
>> constructing stats if concurrent calls are possible - for example in the
>> future if the Edge is allowed to access stats, or stats are pushed to
>> consumers. This should ideally be documented.
- Fixed. This can be called in thread safe manner.
[~bikassaha]'s comments on review board (consolidating here)
>> We are missing stats for running tasks. The previous impl would include
>> stats for all tasks including running stats. Currently, stats updates are
>> coming after completion but that may not be the case always. Perhaps we can
>> introduce a filter enum, that allows asking for stats for completed/all
>> tasks. That will allow frequent calls for completed tasks but infrequent
>> calls for all tasks.
- Fixed. Using RUNNING/SUCCEEDED task statistics with EnumSet internal to
VertexImpl. However, not exposing taskState (internal) to
VertexManagerPluginContext which is public facing.
- Also, clearing the completedTaskStats based on TaskRescheduledTransition.
> Consider scanning unfinished tasks in VertexImpl::constructStatistics to
> reduce merge overhead
> ----------------------------------------------------------------------------------------------
>
> Key: TEZ-2565
> URL: https://issues.apache.org/jira/browse/TEZ-2565
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2565.1.patch, TEZ-2565.2.patch
>
>
> constructStatistics() can be an overhead (scanning all tasks and merging
> stats) depending on the number of invocations to Vertex::getStatistics().
> Consider scanning only unfinished tasks.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)