[ 
https://issues.apache.org/jira/browse/PIG-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393142#comment-14393142
 ] 

Rohini Palaniswamy commented on PIG-4495:
-----------------------------------------

[~daijy],
    Would it be OK to do this in MultiQueryOptimizer itself, by checking 
whether the union optimizer is turned on and the successor vertex is a union, 
or should we write another optimizer that runs after UnionOptimizer? Doing it 
in MultiQueryOptimizer would be easier and less error-prone. 
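
A minimal sketch of the check described above, in Java. It is illustrative 
only: the names SplitEdgeCheck, Kind, Node and needsDummyVertex are 
hypothetical and are not Pig's actual classes or the real MultiQueryOptimizer 
code. It assumes each successor of a Split can be tagged as either a union 
vertex (which would later become a vertex group when the union optimizer is 
on) or a plain vertex.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Pig's real plan classes.
final class SplitEdgeCheck {
    // UNION: a union vertex that the union optimizer would turn into a vertex group.
    // PLAIN: any other vertex (for example the join vertex of a self join).
    enum Kind { PLAIN, UNION }

    static final class Node {
        final String name;
        final Kind kind;

        Node(String name, Kind kind) {
            this.name = name;
            this.kind = kind;
        }
    }

    /**
     * Takes one entry per edge leaving the Split. Returns true if some
     * successor receives more than one edge and is not a union that will
     * become a vertex group, i.e. the dummy vertex is still required.
     */
    static boolean needsDummyVertex(List<Node> successors, boolean unionOptimizerOn) {
        // Count edges from the Split per successor name.
        Map<String, Integer> edgeCount = new HashMap<>();
        for (Node s : successors) {
            edgeCount.merge(s.name, 1, Integer::sum);
        }
        for (Node s : successors) {
            boolean multipleEdges = edgeCount.get(s.name) > 1;
            boolean endsInVertexGroup = unionOptimizerOn && s.kind == Kind.UNION;
            if (multipleEdges && !endsInVertexGroup) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, two edges into a union with the union optimizer on 
need no dummy vertex, while two edges into a plain vertex (the self join case) 
still do.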

> Better multi-query planning in case of union and multiple edges
> ---------------------------------------------------------------
>
>                 Key: PIG-4495
>                 URL: https://issues.apache.org/jira/browse/PIG-4495
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>             Fix For: 0.15.0
>
>
> Details in 
> https://issues.apache.org/jira/browse/TEZ-1190?focusedCommentId=14393033&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14393033
> A common pattern is to split the data, apply some foreach transformations or 
> filters to each branch, union the branches, and then run an operation such as 
> group by or join with other data. This creates multiple edges from the same 
> Split, so we do not merge them; instead we write the data out to an extra 
> dummy vertex to avoid the multiple edges, which adds overhead and hurts 
> performance. Vertex groups accept multiple edges from the same vertex. So if 
> the multiple edges end up in a vertex group (and not in a vertex, as is the 
> case in a self join), we can avoid the dummy vertex.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
