[ 
https://issues.apache.org/jira/browse/PIG-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved PIG-10.
---------------------------

    Resolution: Invalid

There is no requirement in Pig that each tuple in a relation share the same 
schema, so it will not always be an option to store the schema once up front in 
intermediate results.  Even in the cases where the schema is known, complex 
data types with no guaranteed schemas (such as maps) could be in the tuples and 
would still require per-value markers.  We could optimize for the case where 
all tuples share the same schema and contain only atomic data, but it's not 
clear how we would know that to be the case.
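
As a rough illustration of the point above, here is a toy sketch in Python. 
The marker names, the layout, and the helpers (encode_value, encode_tuple, 
can_drop_markers) are all hypothetical and are not Pig's actual intermediate 
format.  Tuples are modeled as Python tuples, bags as lists of tuples, and maps 
as dicts.

```python
# Toy illustration only: marker names and layout are assumptions,
# not Pig's real intermediate encoding.
ATOM, BAG, MAP = "A", "B", "M"


def marker_of(value):
    """Return the type marker a value would be written with."""
    if isinstance(value, dict):
        return MAP
    if isinstance(value, list):
        return BAG
    return ATOM


def encode_value(value):
    """Encode one value, prefixed by its marker (self-describing)."""
    if isinstance(value, dict):
        out = [MAP, len(value)]
        for key, val in value.items():
            # Map values have no guaranteed type, so each keeps its marker.
            out.append(key)
            out += encode_value(val)
        return out
    if isinstance(value, list):
        out = [BAG, len(value)]
        for tup in value:
            out += encode_tuple(tup)
        return out
    return [ATOM, value]


def encode_tuple(tup):
    """Encode a tuple as its field count followed by each marked field."""
    out = [len(tup)]
    for field in tup:
        out += encode_value(field)
    return out


def can_drop_markers(tuples):
    """True only if every tuple has the same shape and holds only atoms.

    Answering this requires looking at every tuple first.
    """
    shapes = {tuple(marker_of(f) for f in t) for t in tuples}
    return len(shapes) == 1 and all(m == ATOM for m in next(iter(shapes)))


print(encode_tuple((1, "a")))
print(can_drop_markers([(1, "a"), (2, "b")]))
print(can_drop_markers([(1, "a"), (2,)]))
print(can_drop_markers([(1, {"k": 2})]))
```

In this sketch the markers can be dropped only when can_drop_markers holds, 
and that check itself needs a full pass over the data before anything is 
written.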

> reduce encoding of intermediate results
> ---------------------------------------
>
>                 Key: PIG-10
>                 URL: https://issues.apache.org/jira/browse/PIG-10
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Olga Natkovich
>
> Currently, in intermediate results, the data is written with a marker for 
> every column in every row.  For instance if
> we are writing a row that has a schema of bag, atom, we'll write:
> BAGMARKER BAGDATA ATOMMARKER ATOMDATA
> There's no reason to write the markers for every row.  It should be 
> sufficient to write them once at the beginning of the
> file and then remember them for subsequent rows.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
