[ 
https://issues.apache.org/jira/browse/PIG-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3748:
------------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

 https://reviews.apache.org/r/17824/

When multiquery is off, POSplit is removed from the vertex. The tuple 
generated before POSplit is written once to each output, and the plan after 
POSplit is executed in the subsequent vertex instead of being executed as a 
sub-plan of POSplit in the same vertex (the multiquery-on scenario). 
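
The difference between the two modes can be illustrated with a small, 
self-contained Java sketch. It is not Pig or Tez code: the class SplitSketch, 
its methods, and the use of strings as "tuples" and functions as "plans" are 
all hypothetical stand-ins chosen for illustration. It only shows that with 
multiquery on the sub-plans run inline in one vertex, while with multiquery 
off the upstream vertex writes each tuple once per output and each plan runs 
in its own downstream vertex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in Pig or Tez.
final class SplitSketch {

    // Multiquery on: every sub-plan of the split runs inline, in the same vertex.
    static List<List<String>> multiQueryOn(List<String> tuples,
                                           List<Function<String, String>> subPlans) {
        List<List<String>> results = new ArrayList<>();
        for (int i = 0; i < subPlans.size(); i++) {
            results.add(new ArrayList<>());
        }
        for (String tuple : tuples) {
            for (int i = 0; i < subPlans.size(); i++) {
                results.get(i).add(subPlans.get(i).apply(tuple));
            }
        }
        return results;
    }

    // Multiquery off, upstream vertex: no split operator, each tuple is
    // written unchanged once to every output.
    static List<List<String>> writeToOutputs(List<String> tuples, int numOutputs) {
        List<List<String>> outputs = new ArrayList<>();
        for (int i = 0; i < numOutputs; i++) {
            outputs.add(new ArrayList<>(tuples)); // one full copy per output
        }
        return outputs;
    }

    // Multiquery off, downstream vertex: runs the plan that followed the split.
    static List<String> runDownstreamVertex(List<String> input,
                                            Function<String, String> plan) {
        List<String> result = new ArrayList<>();
        for (String tuple : input) {
            result.add(plan.apply(tuple));
        }
        return result;
    }

    // Multiquery off, end to end: fan out first, then one vertex per plan.
    static List<List<String>> multiQueryOff(List<String> tuples,
                                            List<Function<String, String>> plans) {
        List<List<String>> outputs = writeToOutputs(tuples, plans.size());
        List<List<String>> results = new ArrayList<>();
        for (int i = 0; i < plans.size(); i++) {
            results.add(runDownstreamVertex(outputs.get(i), plans.get(i)));
        }
        return results;
    }
}
```

Both paths produce the same results; what changes is where the plans run and 
how many times the data is written. The per-output copy in writeToOutputs is 
the cost that the shared-edge optimization listed in the TODO below is meant 
to remove.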

Optimizations TODO:
1) Once Tez supports shared edges, write once in POValueOutTez and let the 
data be available to all the downstream vertices consuming it.
2) POValueOutTez currently writes key/value pairs with an empty key. Write an 
Input/Output in Tez that supports only values, to avoid writing empty keys.

Manually tested. Will fix older unit tests or add new ones later in PIG-3750.

Also updated the TezC7.gld file, which was failing because of some earlier 
JIRA changes.

Got +1 from Daniel on Review Board.

Committed to tez branch. Thanks Daniel for the review.

> Support for multiquery off in Tez
> ---------------------------------
>
>                 Key: PIG-3748
>                 URL: https://issues.apache.org/jira/browse/PIG-3748
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: tez-branch
>
>         Attachments: PIG-3748-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
