[ https://issues.apache.org/jira/browse/PIG-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507776#comment-15507776 ]

Travis Woodruff commented on PIG-5033:
--------------------------------------

And here's the plan when I remove the union (which gives correct results). The 
difference seems to be that this plan leaves one of the joins' right-side 
inputs (f) in a separate vertex (scope-51).

{code}
Tez vertex scope-47     ->      Tez vertex scope-46,Tez vertex scope-51,
Tez vertex scope-51     ->      Tez vertex scope-46,
Tez vertex scope-46

Tez vertex scope-47
# Plan on vertex
c: Split - scope-53
|   |
|   Local Rearrange[tuple]{int}(false) - scope-28       ->       scope-46
|   |   |
|   |   Project[int][0] - scope-24
|   |
|   |---e: Filter[bag] - scope-19
|       |   |
|       |   Greater Than[boolean] - scope-22
|       |   |
|       |   |---Project[int][1] - scope-20
|       |   |
|       |   |---Constant(3) - scope-21
|   |
|   POValueOutputTez - scope-48 ->       [scope-51]
|
|---c: New For Each(false,false)[bag] - scope-15
    |   |
    |   Cast[int] - scope-10
    |   |
    |   |---Project[bytearray][0] - scope-9
    |   |
    |   Cast[int] - scope-13
    |   |
    |   |---Project[bytearray][1] - scope-12
    |
    |---c: Load(file:///tmp/input3:org.apache.pig.builtin.PigStorage) - scope-8
Tez vertex scope-51
# Plan on vertex
Local Rearrange[tuple]{int}(false) - scope-42   ->       scope-46
|   |
|   Project[int][0] - scope-38
|
|---f: Filter[bag] - scope-33
    |   |
    |   Less Than[boolean] - scope-36
    |   |
    |   |---Project[int][1] - scope-34
    |   |
    |   |---Constant(2) - scope-35
    |
    |---POValueInputTez - scope-52      <-       scope-47
Tez vertex scope-46
# Plan on vertex
h: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-45
|
|---h: FRJoin[tuple] - scope-39 <-       scope-51
    |   |
    |   Project[int][0] - scope-37
    |   |
    |   Project[int][0] - scope-38
    |
    |---g: FRJoin[tuple] - scope-25     <-       scope-47
        |   |
        |   Project[int][0] - scope-23
        |   |
        |   Project[int][0] - scope-24
        |
        |---a: New For Each(false,false)[bag] - scope-7
            |   |
            |   Cast[int] - scope-2
            |   |
            |   |---Project[bytearray][0] - scope-1
            |   |
            |   Cast[int] - scope-5
            |   |
            |   |---Project[bytearray][1] - scope-4
            |
            |---a: Load(file:///tmp/input1:org.apache.pig.builtin.PigStorage) - scope-0
{code}
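
For reference, the union-free variant that gives the plan above would look roughly like the following. This is a sketch reconstructed from the script in the issue description by dropping b and u and joining on a directly; it is an assumption, not necessarily the exact script that was run (the plan above stores to 'fakefile', while this keeps the store path from the description).

```
-- Assumed reconstruction: the issue's script with the union removed.
a = load 'file:///tmp/input1' as (x:int, y:int);
c = load 'file:///tmp/input3' as (x:int, y:int);
-- Both replicated-join inputs are filters over the same relation c,
-- which is why c shows up as a Split in vertex scope-47.
e = filter c by y > 3;
f = filter c by y < 2;
g = join a by x left, e by x using 'replicated';
h = join g by a::x left, f by x using 'replicated';
store h into 'file:///tmp/pigoutput';
```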

> MultiQueryOptimizerTez creates bad plan with union, split and FRJoin
> --------------------------------------------------------------------
>
>                 Key: PIG-5033
>                 URL: https://issues.apache.org/jira/browse/PIG-5033
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>    Affects Versions: 0.16.0
>            Reporter: Travis Woodruff
>
> This script produces incorrect results:
> {code}
> a = load 'file:///tmp/input1' as (x:int, y:int);
> b = load 'file:///tmp/input2' as (x:int, y:int);
> u = union a,b;
> c = load 'file:///tmp/input3' as (x:int, y:int);
> e = filter c by y > 3;
> f = filter c by y < 2;
> g = join u by x left, e by x using 'replicated';
> h = join g by u::x left, f by x using 'replicated';
> store h into 'file:///tmp/pigoutput';
> {code}
> Without the union, or with opt.multiquery=false, or with non-replicated 
> joins, it works as expected.



