[ 
https://issues.apache.org/jira/browse/TEZ-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-2544:
----------------------------
    Description: 
Expected TaskSpec
{noformat}
DAGName : OrderedWordCount, VertexName: Summation, VertexParallelism: 1, 
TaskAttemptID:attempt_1433850314856_0019_1_01_000000_0, 
processorName=org.apache.tez.examples.OrderedWordCount$SumProcessor, 
inputSpecListSize=1, 
outputSpecListSize=1, inputSpecList=[{{ sourceVertexName=Tokenizer, 
physicalEdgeCount=2, 
inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }}, 
], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1, 
outputClassName=org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput
 }}
{noformat}

The actual TaskSpec
{noformat}
DAGName : OrderedWordCount, VertexName: Summation, VertexParallelism: 1, 
TaskAttemptID:attempt_1433850314856_0019_1_01_000000_0, 
processorName=org.apache.tez.examples.OrderedWordCount$SumProcessor, 
inputSpecListSize=1, 
outputSpecListSize=1, inputSpecList=[{{ sourceVertexName=Tokenizer, 
physicalEdgeCount=1, 
inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }}, 
], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1, 
outputClassName=org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput
 }}
{noformat}

The expected physicalEdgeCount is 2, but the actual value is 1. This happens 
when dynamic parallelism estimation is enabled. 
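
For anyone comparing such dumps by hand, the following is a minimal sketch of how the mismatch could be spotted automatically. It is not part of Tez: the helper names (physical_edge_counts, diff_edge_counts) are hypothetical, and the regular expression assumes the dump layout shown above, where each sourceVertexName or destinationVertexName is immediately followed by its physicalEdgeCount. The two sample strings are excerpts of the dumps in this report.

```python
import re

# Assumption: each vertex name is directly followed by ", physicalEdgeCount=N",
# as in the TaskSpec dumps quoted in this report.
_EDGE_PATTERN = re.compile(
    r"(sourceVertexName|destinationVertexName)=(\w+),\s*physicalEdgeCount=(\d+)"
)


def physical_edge_counts(task_spec_dump):
    """Map (kind, vertex name) to the physicalEdgeCount found in a dump."""
    # Collapse line wraps so a name and its count split across lines still match.
    flat = " ".join(task_spec_dump.split())
    return {
        (kind, name): int(count)
        for kind, name, count in _EDGE_PATTERN.findall(flat)
    }


def diff_edge_counts(expected_dump, actual_dump):
    """Return {(kind, vertex): (expected, actual)} for every differing count."""
    expected = physical_edge_counts(expected_dump)
    actual = physical_edge_counts(actual_dump)
    return {
        key: (expected[key], actual.get(key))
        for key in expected
        if expected[key] != actual.get(key)
    }


# Excerpts of the expected and actual dumps from this report.
EXPECTED = """inputSpecList=[{{ sourceVertexName=Tokenizer,
physicalEdgeCount=2,
inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }},
], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1,"""

ACTUAL = """inputSpecList=[{{ sourceVertexName=Tokenizer,
physicalEdgeCount=1,
inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }},
], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1,"""

if __name__ == "__main__":
    for (kind, vertex), (want, got) in diff_edge_counts(EXPECTED, ACTUAL).items():
        print(f"{kind}={vertex}: expected {want}, actual {got}")
```

With these excerpts, only the input edge from Tokenizer is reported as differing; the output edge to Sorter has the same count in both dumps.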


  was:
Expected TaskSpec
{noformat}
DAGName : OrderedWordCount, VertexName: Summation, VertexParallelism: 1, 
TaskAttemptID:attempt_1433850314856_0019_1_01_000000_0, 
processorName=org.apache.tez.examples.OrderedWordCount$SumProcessor, 
inputSpecListSize=1, outputSpecListSize=1, inputSpecList=[{{ 
sourceVertexName=Tokenizer, physicalEdgeCount=2, 
inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }}, 
], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1, 
outputClassName=org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput
 }}
{noformat}

The actual TaskSpec
{noformat}
DAGName : OrderedWordCount, VertexName: Summation, VertexParallelism: 1, 
TaskAttemptID:attempt_1433850314856_0019_1_01_000000_0, 
processorName=org.apache.tez.examples.OrderedWordCount$SumProcessor, 
inputSpecListSize=1, outputSpecListSize=1, inputSpecList=[{{ 
sourceVertexName=Tokenizer, physicalEdgeCount=1, 
inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }}, 
], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1, 
outputClassName=org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput
 }}
{noformat}

The expected physicalEdgeCount is 2, but the actual value is 1. This happens 
when dynamic parallelism estimation is enabled. 



> Incorrect dag result due to wrong TaskSpec in recovering
> --------------------------------------------------------
>
>                 Key: TEZ-2544
>                 URL: https://issues.apache.org/jira/browse/TEZ-2544
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>              Labels: Recovery
>
> Expected TaskSpec
> {noformat}
> DAGName : OrderedWordCount, VertexName: Summation, VertexParallelism: 1, 
> TaskAttemptID:attempt_1433850314856_0019_1_01_000000_0, 
> processorName=org.apache.tez.examples.OrderedWordCount$SumProcessor, 
> inputSpecListSize=1, 
> outputSpecListSize=1, inputSpecList=[{{ sourceVertexName=Tokenizer, 
> physicalEdgeCount=2, 
> inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }}, 
> ], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1, 
> outputClassName=org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput
>  }}
> {noformat}
> The actual TaskSpec
> {noformat}
> DAGName : OrderedWordCount, VertexName: Summation, VertexParallelism: 1, 
> TaskAttemptID:attempt_1433850314856_0019_1_01_000000_0, 
> processorName=org.apache.tez.examples.OrderedWordCount$SumProcessor, 
> inputSpecListSize=1, 
> outputSpecListSize=1, inputSpecList=[{{ sourceVertexName=Tokenizer, 
> physicalEdgeCount=1, 
> inputClassName=org.apache.tez.runtime.library.input.OrderedGroupedKVInput }}, 
> ], outputSpecList=[{{ destinationVertexName=Sorter, physicalEdgeCount=1, 
> outputClassName=org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput
>  }}
> {noformat}
> The expected physicalEdgeCount is 2, but the actual value is 1. This happens 
> when dynamic parallelism estimation is enabled. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
