GitHub user beltran opened a pull request:
https://github.com/apache/tez/pull/23
TEZ-3958: Add internal vertex priority information into the tez dag.dot
debug information
This PR does the following:
* Moves `generateDAGVizFile` from `DAGAppMaster` to `Utils` so it can be
called from `DAGImpl`. The priorities are initialized after
`DAGImpl.initializeDAG`, which is why the call is made there.
* There is still a call to `generateDAGVizFile` from `DAGAppMaster`, which
renders the file without the priorities. That file is then overwritten
by the call in `DAGImpl`.
* Creates methods `getPriorityLowLimit` and `getPriorityHighLimit` in
`DAGScheduler`.
* Creates `getDAGScheduler` in `DAG`.
A sample `.dot` file looks like:
```
digraph DAG_Iteration_0 {
graph [ label="DAG_Iteration_0", fontsize=24, fontname=Helvetica];
node [fontsize=12, fontname=Helvetica];
edge [fontsize=9, fontcolor=blue, fontname=Arial];
"DAG_Iteration_0.Sorter_Output" [ label = "Sorter[Output]", shape = "box" , color= "black"];
"DAG_Iteration_0.Tokenizer" [ label = "Tokenizer[WordCount$TokenProcessor,\n priority=8,\n ]" , color= "black" ];
"DAG_Iteration_0.Tokenizer" -> "DAG_Iteration_0.Summation" [ label = "[input=OrderedPartitionedKVOutput,\n output=OrderedGroupedKVInput,\n dataMovement=SCATTER_GATHER,\n schedulingType=SEQUENTIAL]" ];
"DAG_Iteration_0.Tokenizer_Input" [ label = "Tokenizer[Input]", shape = "box" , color= "black"];
"DAG_Iteration_0.Tokenizer_Input" -> "DAG_Iteration_0.Tokenizer" [ label = "Input [inputClass=MRInput,\n initializer=MRInputAMSplitGenerator]" ];
"DAG_Iteration_0.Summation" [ label = "Summation[OrderedWordCount$SumProcessor,\n priority=11,\n ]" , color= "black" ];
"DAG_Iteration_0.Summation" -> "DAG_Iteration_0.Sorter" [ label = "[input=OrderedPartitionedKVOutput,\n output=OrderedGroupedKVInput,\n dataMovement=SCATTER_GATHER,\n schedulingType=SEQUENTIAL]" ];
"DAG_Iteration_0.Sorter" [ label = "Sorter[OrderedWordCount$NoOpSorter,\n priority=14,\n ]" , color= "black" ];
"DAG_Iteration_0.Sorter" -> "DAG_Iteration_0.Sorter_Output" [ label = "Output [outputClass=MROutput,\n committer=MROutputCommitter]" ];
}
```
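For readers who want to see how a vertex entry with a priority field is shaped, here is a minimal, self-contained sketch that reproduces the `Tokenizer` node statement from the sample. The class `VertexLabelSketch` and the method `nodeStatement` are hypothetical names made up for illustration; this is not the code in this PR and not the Tez `Utils` implementation.

```java
// Hypothetical sketch: names here are illustrative and not taken from the Tez codebase.
public class VertexLabelSketch {

    /**
     * Builds one DOT node statement in the same shape as the sample output:
     * "<dag>.<vertex>" [ label = "<vertex>[<processor>,\n priority=<p>,\n ]" , color= "black" ];
     */
    static String nodeStatement(String dagName, String vertexName,
                                String processor, int priority) {
        // "\\n" emits a literal backslash-n, which Graphviz renders as a line break in the label.
        String label = vertexName + "[" + processor
                + ",\\n priority=" + priority + ",\\n ]";
        return "\"" + dagName + "." + vertexName + "\""
                + " [ label = \"" + label + "\" , color= \"black\" ];";
    }

    public static void main(String[] args) {
        // Values copied from the sample .dot file above.
        System.out.println(
                nodeStatement("DAG_Iteration_0", "Tokenizer", "WordCount$TokenProcessor", 8));
    }
}
```

Running `main` prints a single node statement that can be pasted inside a `digraph { ... }` block and rendered with Graphviz.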
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/beltran/tez TEZ-3958
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tez/pull/23.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23
----
commit daabde158fc58837a7bed059a79afaabc50e736c
Author: Jaume Marhuenda <jaumemarhuenda@...>
Date: 2018-06-27T20:55:38Z
TEZ-3958: Add internal vertex priority information into the tez dag.dot
debug information
----