[ 
https://issues.apache.org/jira/browse/PARQUET-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276761#comment-14276761
 ] 

Harsh J commented on PARQUET-110:
---------------------------------

This should probably be marked resolved if PIG-4219 resolves it.

> Some schemas without column projection cause Pig failures
> ---------------------------------------------------------
>
>                 Key: PARQUET-110
>                 URL: https://issues.apache.org/jira/browse/PARQUET-110
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: Ryan Blue
>
> Parquet stores and loads the Pig schema in the Configuration. Along the way, 
> Pig changes that Schema:
> {code:java}
> // This schema is converted from Parquet and written in Configuration
> String schemaStr = "my_list: {array: (array_element: (num1: int,num2: int))}";
> // Reparsed using org.apache.pig.impl.util.Utils
> Schema schema = Utils.getSchemaFromString(schemaStr);
> // But no longer matches the original structure
> schema.toString();
> // => {my_list: {array_element: (num1: int,num2: int)}}
> {code}
> Note that the intermediate bag, named either "bag" or "array", is removed 
> when Pig reparses the Schema. I can work around this to an extent in the 
> Parquet code, but the Pig behavior gets stranger: if there are two of 
> these intermediate bags, the second is preserved but renamed to "bag_0". 
> Something odd is going on there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
