[ 
https://issues.apache.org/jira/browse/PIG-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507768#comment-15507768
 ] 

Travis Woodruff commented on PIG-5033:
--------------------------------------

Here's the DAG plan. I only started testing Tez today, so I'm not very
familiar with how these should look, but the fact that both joins use
scope-61 seems a bit suspicious.

{code}
Tez vertex scope-55     ->      Tez vertex scope-57,
Tez vertex scope-56     ->      Tez vertex scope-57,
Tez vertex scope-61     ->      Tez vertex scope-57,
Tez vertex scope-57

Tez vertex scope-55
# Plan on vertex
POValueOutputTez - scope-59     ->       [scope-57]
|
|---a: New For Each(false,false)[bag] - scope-7
    |   |
    |   Cast[int] - scope-2
    |   |
    |   |---Project[bytearray][0] - scope-1
    |   |
    |   Cast[int] - scope-5
    |   |
    |   |---Project[bytearray][1] - scope-4
    |
    |---a: Load(file:///tmp/input1:org.apache.pig.builtin.PigStorage) - scope-0
Tez vertex scope-56
# Plan on vertex
POValueOutputTez - scope-60     ->       [scope-57]
|
|---b: New For Each(false,false)[bag] - scope-15
    |   |
    |   Cast[int] - scope-10
    |   |
    |   |---Project[bytearray][0] - scope-9
    |   |
    |   Cast[int] - scope-13
    |   |
    |   |---Project[bytearray][1] - scope-12
    |
    |---b: Load(file:///tmp/input2:org.apache.pig.builtin.PigStorage) - scope-8
Tez vertex scope-61
# Plan on vertex
c: Split - scope-67
|   |
|   Local Rearrange[tuple]{int}(false) - scope-37       ->       scope-57
|   |   |
|   |   Project[int][0] - scope-33
|   |
|   |---e: Filter[bag] - scope-28
|       |   |
|       |   Greater Than[boolean] - scope-31
|       |   |
|       |   |---Project[int][1] - scope-29
|       |   |
|       |   |---Constant(3) - scope-30
|   |
|   Local Rearrange[tuple]{int}(false) - scope-51       ->       scope-57
|   |   |
|   |   Project[int][0] - scope-47
|   |
|   |---f: Filter[bag] - scope-42
|       |   |
|       |   Less Than[boolean] - scope-45
|       |   |
|       |   |---Project[int][1] - scope-43
|       |   |
|       |   |---Constant(2) - scope-44
|
|---c: New For Each(false,false)[bag] - scope-24
    |   |
    |   Cast[int] - scope-19
    |   |
    |   |---Project[bytearray][0] - scope-18
    |   |
    |   Cast[int] - scope-22
    |   |
    |   |---Project[bytearray][1] - scope-21
    |
    |---c: Load(file:///tmp/input3:org.apache.pig.builtin.PigStorage) - scope-17
Tez vertex scope-57
# Plan on vertex
h: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-54
|
|---h: FRJoin[tuple] - scope-48 <-       scope-61
    |   |
    |   Project[int][0] - scope-46
    |   |
    |   Project[int][0] - scope-47
    |
    |---g: FRJoin[tuple] - scope-34     <-       scope-61
        |   |
        |   Project[int][0] - scope-32
        |   |
        |   Project[int][0] - scope-33
        |
        |---POShuffledValueInputTez - scope-58  <-       [scope-55, scope-56]
{code}
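
The observation about scope-61 can be checked mechanically. The following is a minimal Python sketch, not part of Pig or Tez: the input list is copied by hand from the `<-` annotations in the scope-57 section of the plan, and the names `INPUTS_OF_SCOPE_57` and `shared_producers` are hypothetical. It only reports which producer vertex is read by more than one operator in the consuming vertex. It does not establish that the shared vertex is the cause of the incorrect results.

```python
from collections import defaultdict

# Inputs read by operators inside Tez vertex scope-57, copied by hand from the
# "<-" annotations in the plan as (consumer operator, producer vertex) pairs.
# This list is an assumption of the sketch, not output generated by Pig.
INPUTS_OF_SCOPE_57 = [
    ("FRJoin scope-48 (h)", "scope-61"),
    ("FRJoin scope-34 (g)", "scope-61"),
    ("POShuffledValueInputTez scope-58", "scope-55"),
    ("POShuffledValueInputTez scope-58", "scope-56"),
]


def shared_producers(inputs):
    """Return {producer vertex: [consumers]} for producers read by more than one operator."""
    consumers = defaultdict(list)
    for consumer, producer in inputs:
        consumers[producer].append(consumer)
    return {p: ops for p, ops in consumers.items() if len(ops) > 1}


if __name__ == "__main__":
    for producer, ops in shared_producers(INPUTS_OF_SCOPE_57).items():
        print(f"{producer} feeds {len(ops)} operators: {', '.join(ops)}")
```

For this plan the only shared producer is scope-61, which feeds both replicated joins (scope-34 for `g` and scope-48 for `h`), while scope-55 and scope-56 each feed only the union input scope-58.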

> MultiQueryOptimizerTez creates bad plan with union, split and FRJoin
> --------------------------------------------------------------------
>
>                 Key: PIG-5033
>                 URL: https://issues.apache.org/jira/browse/PIG-5033
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>    Affects Versions: 0.16.0
>            Reporter: Travis Woodruff
>
> This script produces incorrect results:
> {code}
> a = load 'file:///tmp/input1' as (x:int, y:int);
> b = load 'file:///tmp/input2' as (x:int, y:int);
> u = union a,b;
> c = load 'file:///tmp/input3' as (x:int, y:int);
> e = filter c by y > 3;
> f = filter c by y < 2;
> g = join u by x left, e by x using 'replicated';
> h = join g by u::x left, f by x using 'replicated';
> store h into 'file:///tmp/pigoutput';
> {code}
> Without the union, or with opt.multiquery=false, or with non-replicated 
> joins, it works as expected.



