[ 
https://issues.apache.org/jira/browse/PIG-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1875:
--------------------------------

    Fix Version/s: 0.10

> Keep tuples serialized to limit spilling and speed it when it happens
> ---------------------------------------------------------------------
>
>                 Key: PIG-1875
>                 URL: https://issues.apache.org/jira/browse/PIG-1875
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Alan Gates
>            Priority: Minor
>             Fix For: 0.10
>
>         Attachments: mrtuple.patch
>
>
> Currently Pig reads records off the reduce iterator and immediately 
> deserializes them into Java objects.  The deserialized objects take up much 
> more memory than the serialized bytes, so Pig spills sooner than it would if 
> it kept the records in serialized form.  Also, when it does have to spill, it 
> has to serialize the records again, and then deserialize them once more after 
> reading them back from the spill file.
> We should explore keeping records serialized in memory as they are read off 
> the reduce iterator.
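
The idea in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the approach taken in mrtuple.patch: the class SerializedBag, its methods, and the (String, long) record layout are all hypothetical, and plain java.io streams stand in for Pig's own tuple serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder that keeps records as raw bytes until a consumer reads them. */
class SerializedBag {
    private final List<byte[]> records = new ArrayList<>();

    /** Serialize a (String, long) record; a stand-in for a real tuple serializer. */
    static byte[] serialize(String key, long value) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(key);
            out.writeLong(value);
        }
        return buf.toByteArray();
    }

    /** Store the bytes as they arrive; no Java objects are built per field. */
    void add(byte[] raw) {
        records.add(raw);
    }

    /** Number of records currently held in memory. */
    int size() {
        return records.size();
    }

    /** Spilling is a length-prefixed byte copy; nothing is re-serialized. */
    void spill(OutputStream sink) throws IOException {
        DataOutputStream out = new DataOutputStream(sink);
        for (byte[] raw : records) {
            out.writeInt(raw.length);
            out.write(raw);
        }
        out.flush();
        records.clear();
    }

    /** Deserialize only the key, and only when it is actually needed. */
    static String keyOf(byte[] raw) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return in.readUTF();
        }
    }

    /** Deserialize the value by skipping past the key first. */
    static long valueOf(byte[] raw) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            in.readUTF();
            return in.readLong();
        }
    }
}
```

In this sketch a caller would add() each record's bytes as they come off the iterator, call spill() when memory runs low, and use keyOf() or valueOf() only for the records it actually consumes. The trade-off is that every read pays a deserialization cost, so whether this wins depends on how often records are read versus spilled.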

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
