[ 
https://issues.apache.org/jira/browse/PIG-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-686.
--------------------------------

    Resolution: Won't Fix

We experimented with this change, and the performance gains (at most 5-7%) 
are not sufficient to justify the complexity it would add to the code. 
Hopefully, once we integrate with Avro, we will get the improvement.

> PERFORMANCE: improve how data is stored between M-R jobs and between Map and 
> Reduce
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-686
>                 URL: https://issues.apache.org/jira/browse/PIG-686
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: Olga Natkovich
>
> Currently, there is quite a bit of overhead in how the data is serialized in 
> both cases because type information is stored with each field.
> However, most of the time the data has a known and consistent schema, in which 
> case it is sufficient to store the schema once. 
> This change could really decrease the amount of intermediate data generated.
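
To make the trade-off in the description concrete, here is a minimal sketch
comparing per-field type tags with a schema written once. It is not Pig's
actual serialization code: the one-byte tags, the function names, and the
two-column (int, string) rows are all assumptions made for illustration, and
the input is assumed to be non-empty with every row matching the first row's
types.

```python
import struct

# Hypothetical one-byte type tags; these are NOT Pig's real encoding.
TAG_INT, TAG_STR = 0, 1

def tag_of(value):
    """Return the illustrative type tag for a field value."""
    return TAG_INT if isinstance(value, int) else TAG_STR

def encode_field(value):
    """Encode only the payload of one field, with no type tag."""
    if isinstance(value, int):
        return struct.pack(">i", value)           # 4-byte big-endian int
    data = value.encode("utf-8")
    return struct.pack(">H", len(data)) + data    # 2-byte length + bytes

def encode_tagged(rows):
    """Per-field tagging: each field is preceded by its type tag."""
    out = bytearray()
    for row in rows:
        for value in row:
            out.append(tag_of(value))
            out += encode_field(value)
    return bytes(out)

def encode_with_schema(rows):
    """Schema once: write the column tags up front, then payloads only."""
    schema = [tag_of(v) for v in rows[0]]
    out = bytearray([len(schema)] + schema)       # header: count + tags
    for row in rows:
        for value in row:
            out += encode_field(value)
    return bytes(out)

rows = [(i, "user%d" % i) for i in range(1000)]
tagged = encode_tagged(rows)
compact = encode_with_schema(rows)
print("tagged bytes:", len(tagged))
print("schema-once bytes:", len(compact))
print("bytes saved:", len(tagged) - len(compact))
```

In this toy encoding the saving is one byte per field minus a small fixed
header, so its relative size depends on how large the field payloads are.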

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
