[ 
https://issues.apache.org/jira/browse/PIG-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827657#comment-15827657
 ] 

Daniel Dai commented on PIG-5083:
---------------------------------

I see. CombinerPackager.attachInput prevents materializing the bag on the
combiner. TezReadOnceBag prevents materializing the bag on the vertex input if
the vertex has combine or order by. It seems that streaming the last input in
the join case will use TezReadOnceBag as well in the follow-up fix.
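
To illustrate the distinction discussed above, here is a minimal,
self-contained sketch of materializing a bag versus streaming over a
read-once wrapper. It is not Pig's actual code: ReadOnceSketch, ReadOnceBag,
materialize and sum are hypothetical names chosen for illustration, and the
real CombinerPackager and TezReadOnceBag may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadOnceSketch {

    // Hypothetical stand-in for a read-once bag: it wraps the input iterator
    // and hands it out a single time, without copying any values.
    static final class ReadOnceBag<T> implements Iterable<T> {
        private final Iterator<T> source;
        private boolean consumed = false;

        ReadOnceBag(Iterator<T> source) {
            this.source = source;
        }

        @Override
        public Iterator<T> iterator() {
            if (consumed) {
                throw new IllegalStateException("bag can only be read once");
            }
            consumed = true;
            return source;
        }
    }

    // Materializing: every value is copied into memory before any consumer
    // sees it. For large groups this is what can cause spills and OOMs.
    static <T> List<T> materialize(Iterator<T> input) {
        List<T> bag = new ArrayList<>();
        while (input.hasNext()) {
            bag.add(input.next());
        }
        return bag;
    }

    // A streaming consumer (think of an algebraic aggregate such as SUM)
    // that only needs a single pass over the values.
    static long sum(Iterable<Long> bag) {
        long total = 0;
        for (long v : bag) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(1L, 2L, 3L);

        // Same result either way; only the memory behaviour differs.
        long viaCopy = sum(materialize(values.iterator()));
        long viaStream = sum(new ReadOnceBag<>(values.iterator()));
        System.out.println(viaCopy + " " + viaStream);
    }
}
```

The read-once variant holds one value at a time, which is sufficient when the
downstream operator makes a single pass over the input, as a combiner does.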

+1 for commit.

> CombinerPackager and LitePackager should not materialize bags
> -------------------------------------------------------------
>
>                 Key: PIG-5083
>                 URL: https://issues.apache.org/jira/browse/PIG-5083
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.17.0
>
>         Attachments: PIG-5083-1.patch
>
>
> Before PIG-3591 and creation of CombinerPackager, POCombinerPackage directly 
> read from the combiner/reducer input instead of materializing the bag.
> https://github.com/apache/pig/blob/branch-0.12/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java#L140-L161
> The unnecessary materialization leads to a lot of spills and OOMs in some cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
