[ https://issues.apache.org/jira/browse/TEZ-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935848#comment-13935848 ]
Siddharth Seth commented on TEZ-933:
------------------------------------
I don't think setting this early affects anything. A parallelism of -1 is
specified by users, and will just be set earlier instead of during vertex
initialization. Any component that sets parallelism will still see it as -1,
like it does today.
[~hitesh] pointed out a failed unit test. Attaching an addendum patch for that.
The unit test happened to pass earlier because it ended up validating against
a vertex from a different graph (for which parallelism was 0, since it used to
get set only during the init transition). The patch updates the unit test to
verify against the correct vertex.
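
To illustrate the timing point, here is a minimal sketch. It is not Tez code:
LateVertex, EarlyVertex and their methods are hypothetical names made up for
this example. It contrasts copying the user-supplied parallelism only in an
init step with storing it at construction time. In the first case a reader
that arrives before init sees the field default of 0, which is the value the
old unit test ended up comparing against.

```java
// Hypothetical sketch, not the Tez implementation.
class LateVertex {
    private final int requested; // user-supplied value; -1 means "decide at runtime"
    private int numTasks;        // stays at the field default (0) until init() runs

    LateVertex(int requested) { this.requested = requested; }

    void init() { numTasks = requested; }

    int getTotalTasks() { return numTasks; }
}

class EarlyVertex {
    private final int numTasks;  // stored as soon as the vertex is created

    EarlyVertex(int requested) { this.numTasks = requested; }

    void init() { /* nothing left to copy */ }

    int getTotalTasks() { return numTasks; }
}

class ParallelismTimingDemo {
    public static void main(String[] args) {
        LateVertex late = new LateVertex(-1);
        EarlyVertex early = new EarlyVertex(-1);
        System.out.println("before init: late=" + late.getTotalTasks()
                + ", early=" + early.getTotalTasks());
        late.init();
        early.init();
        System.out.println("after init:  late=" + late.getTotalTasks()
                + ", early=" + early.getTotalTasks());
    }
}
```

After init both variants report the same value, so under these assumptions
the only observable difference is what a caller sees before initialization.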
> Race in getting source / destination numTasks on an Edge
> --------------------------------------------------------
>
> Key: TEZ-933
> URL: https://issues.apache.org/jira/browse/TEZ-933
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Fix For: 0.4.0
>
> Attachments: TEZ-933.1.txt, TEZ-933.addendum.txt
>
>
> Edges rely on getting properties (specifically numTasks in this case) from
> the source or destination vertex.
> This can end up with an incorrect value being used depending on the state of
> the vertex - whether the vertex has been initialized, whether the parallelism
> has been changed, etc.
> As an example:
> {code}
> edgeManager.getNumSourceTaskPhysicalOutputs(destinationVertex.getTotalTasks(),
>     sourceTaskIndex)
> {code}
> destinationVertex.getTotalTasks() may be incorrect if the destinationVertex
> hasn't yet been initialized. Alternatively, this value can change based on
> setParallelism calls.
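
The ordering problem in the description can be sketched as follows. This is a
single-threaded illustration under stated assumptions, not Tez code:
SketchVertex, SketchEdgeManager and initialize() are hypothetical, and the
edge manager is assumed to produce one physical output per destination task.
The same call returns three different answers depending on when it reads the
destination vertex.

```java
// Hypothetical sketch of the stale-read problem; names only mirror the snippet above.
class SketchVertex {
    private int numTasks; // 0 until the vertex is initialized

    void initialize(int tasks) { numTasks = tasks; }

    void setParallelism(int tasks) { numTasks = tasks; }

    int getTotalTasks() { return numTasks; }
}

class SketchEdgeManager {
    // Assumption: one physical output per destination task (scatter-gather style).
    int getNumSourceTaskPhysicalOutputs(int numDestinationTasks, int sourceTaskIndex) {
        return numDestinationTasks;
    }
}

class EdgeReadOrderDemo {
    public static void main(String[] args) {
        SketchVertex destinationVertex = new SketchVertex();
        SketchEdgeManager edgeManager = new SketchEdgeManager();
        int sourceTaskIndex = 0;

        // Read before the destination vertex has been initialized.
        int beforeInit = edgeManager.getNumSourceTaskPhysicalOutputs(
                destinationVertex.getTotalTasks(), sourceTaskIndex);

        destinationVertex.initialize(4);
        int afterInit = edgeManager.getNumSourceTaskPhysicalOutputs(
                destinationVertex.getTotalTasks(), sourceTaskIndex);

        // A later setParallelism call changes the answer again.
        destinationVertex.setParallelism(2);
        int afterSetParallelism = edgeManager.getNumSourceTaskPhysicalOutputs(
                destinationVertex.getTotalTasks(), sourceTaskIndex);

        System.out.println(beforeInit + " / " + afterInit + " / " + afterSetParallelism);
    }
}
```

Any caller that caches one of these values has to know which state the vertex
was in when the value was read.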