[
https://issues.apache.org/jira/browse/TEZ-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16535461#comment-16535461
]
Eric Wohlstadter commented on TEZ-3958:
---------------------------------------
[~jmarhuen]
We should avoid introducing a new {{abstract}} method into {{DAGScheduler}}.
This class might be extended by outside applications which are using Tez, and
we don't want to break them. I would give {{getPriorityLowLimit}} a default
implementation, something like {{throw new UnsupportedOperationException()}}, and
then skip generation of priorities in the .dot file if this exception is thrown.
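A minimal sketch of what I mean, not the actual Tez code: {{SchedulerBase}}, {{LegacyScheduler}}, {{FixedPriorityScheduler}}, {{DotLabel}}, the simplified int parameters and the label format are all hypothetical stand-ins, chosen only to show the default implementation and the skip-on-exception path.

```java
// Hypothetical stand-in for DAGScheduler; only the new method is shown.
abstract class SchedulerBase {
  // Deliberately NOT abstract: subclasses outside the Tez source tree that
  // predate this method keep compiling and simply inherit this default.
  public int getPriorityLowLimit(int distanceFromRoot, int vertexId, int totalVertices) {
    throw new UnsupportedOperationException(
        getClass().getName() + " does not expose vertex priorities");
  }
}

// Simulates an external subclass written before the method existed.
class LegacyScheduler extends SchedulerBase {
}

// Simulates a scheduler that does report a priority (value is illustrative).
class FixedPriorityScheduler extends SchedulerBase {
  private final int priority;

  FixedPriorityScheduler(int priority) {
    this.priority = priority;
  }

  @Override
  public int getPriorityLowLimit(int distanceFromRoot, int vertexId, int totalVertices) {
    return priority;
  }
}

// Hypothetical .dot label builder; the label format here is an assumption.
final class DotLabel {
  private DotLabel() {
  }

  static String vertexLabel(SchedulerBase scheduler, String vertexName,
      int distanceFromRoot, int vertexId, int totalVertices) {
    try {
      int priority =
          scheduler.getPriorityLowLimit(distanceFromRoot, vertexId, totalVertices);
      return vertexName + "[priority=" + priority + "]";
    } catch (UnsupportedOperationException e) {
      // The scheduler does not support priorities: leave them out of the label.
      return vertexName;
    }
  }
}
```

With this shape, a scheduler that never overrides the method still produces a valid label, just without the priority.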
Also this logic:
{code:java}
final int vertexDistanceFromRoot = vertex.getDistanceFromRoot();
return ((vertexDistanceFromRoot + 1) * dag.getTotalVertices() * 3)
    + (vertex.getVertexId().getId() * 3);
{code}
is repeated in two places in the patch, and also in both
{{DAGSchedulerNaturalOrder}} and {{DAGSchedulerNaturalOrderControlled}}. This
looks like it should be refactored into the base class, so that it appears
once instead of four times. Again, we need to be careful here about not breaking
subclasses of {{DAGScheduler}} which are not in the Tez project source code.
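Roughly, the refactoring could look like the sketch below. The names {{BaseScheduler}}, {{NaturalOrderSketch}}, {{NaturalOrderControlledSketch}} and {{priorityLowLimit}} are hypothetical, and plain ints stand in for the {{dag}} and {{vertex}} objects; only the arithmetic is taken from the patch.

```java
// Hypothetical stand-in for DAGScheduler holding the shared computation.
abstract class BaseScheduler {
  // Concrete and non-final: existing external subclasses are unaffected unless
  // they already declare a conflicting method with this name and signature.
  protected int priorityLowLimit(int distanceFromRoot, int vertexId, int totalVertices) {
    return ((distanceFromRoot + 1) * totalVertices * 3) + (vertexId * 3);
  }
}

// Stand-in for DAGSchedulerNaturalOrder: calls the shared helper.
class NaturalOrderSketch extends BaseScheduler {
  public int priorityFor(int distanceFromRoot, int vertexId, int totalVertices) {
    return priorityLowLimit(distanceFromRoot, vertexId, totalVertices);
  }
}

// Stand-in for DAGSchedulerNaturalOrderControlled: same helper, no copy of the formula.
class NaturalOrderControlledSketch extends BaseScheduler {
  public int priorityFor(int distanceFromRoot, int vertexId, int totalVertices) {
    return priorityLowLimit(distanceFromRoot, vertexId, totalVertices);
  }
}
```

For example, in a DAG of 4 vertices, a vertex with id 2 at distance 1 from the root gets (1 + 1) * 4 * 3 + 2 * 3 = 30 from either scheduler.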
> Add internal vertex priority information into the tez dag.dot debug
> information
> -------------------------------------------------------------------------------
>
> Key: TEZ-3958
> URL: https://issues.apache.org/jira/browse/TEZ-3958
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Gopal V
> Assignee: Jaume M
> Priority: Major
> Attachments: TEZ-3958.1.patch, TEZ-3958.2.patch
>
>
> Adding the actual vertex priority as computed by Tez into the debug dag.dot
> file would allow the debugging of task pre-emption issues when the DAG is no
> longer a tree.
> There are pre-emption issues with isomerization of Tez DAGs, where an
> R-isomer dag with mirror rotation runs at a different speed than the L-isomer
> dag, due to priorities at the same level changing due to the vertex-id order.
> Since the problem is hard to debug through, it would be good to record the
> computed priority in the DAG .dot file in the logging directories.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)