[ https://issues.apache.org/jira/browse/PARQUET-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276761#comment-14276761 ]
Harsh J commented on PARQUET-110:
---------------------------------

This should probably be marked resolved if PIG-4219 resolves it.

> Some schemas without column projection cause Pig failures
> ---------------------------------------------------------
>
>          Key: PARQUET-110
>          URL: https://issues.apache.org/jira/browse/PARQUET-110
>      Project: Parquet
>   Issue Type: Bug
>   Components: parquet-mr
>     Reporter: Ryan Blue
>
> Parquet stores and loads the Pig schema in the Configuration. Along the way,
> Pig changes that Schema:
> {code:java}
> // This schema is converted from Parquet and written in Configuration
> String schemaStr = "my_list: {array: (array_element: (num1: int,num2: int))}";
> // Reparsed using org.apache.pig.impl.util.Utils
> Schema schema = Utils.getSchemaFromString(schemaStr);
> // But no longer matches the original structure
> schema.toString();
> // => {my_list: {array_element: (num1: int,num2: int)}}
> {code}
> Note that the intermediate bag, named either "bag" or "array", is removed
> when Pig reparses the Schema. I can work around this to an extent in the
> Parquet code, but the Pig behavior gets more strange. If there are two of
> these, the second is preserved but renamed to "bag_0". Something funny is
> going on there.
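
For anyone trying to confirm whether PIG-4219 covers this, here is a rough sketch of one way to spot the dropped wrapper by comparing the two schema strings from the issue description. It is plain Java with no Pig dependency, and it only compares field names textually; it does not reparse anything. The class name `SchemaFieldDiff` and the helpers `fieldNames` and `droppedFields` are hypothetical names made up for this illustration and are not part of Pig or parquet-mr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Pig or parquet-mr.
public class SchemaFieldDiff {

    // An identifier immediately followed by ':' is treated as a field name.
    private static final Pattern FIELD =
            Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)\\s*:");

    // Collects field names from a Pig schema string, in order of appearance.
    static List<String> fieldNames(String schema) {
        List<String> names = new ArrayList<>();
        Matcher m = FIELD.matcher(schema);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // Returns the field names present in the original schema string
    // that no longer appear in the reparsed one.
    static List<String> droppedFields(String original, String reparsed) {
        List<String> dropped = new ArrayList<>(fieldNames(original));
        for (String name : fieldNames(reparsed)) {
            dropped.remove(name); // removes one matching occurrence, if any
        }
        return dropped;
    }

    public static void main(String[] args) {
        // Both strings are copied from the issue description.
        String original = "my_list: {array: (array_element: (num1: int,num2: int))}";
        String reparsed = "{my_list: {array_element: (num1: int,num2: int)}}";
        System.out.println(droppedFields(original, reparsed)); // prints [array]
    }
}
```

In a real check, the `reparsed` string would come from calling `toString()` on the result of `Utils.getSchemaFromString`, as in the quoted snippet. An empty result would suggest the intermediate bag survives the round trip, though a name-only comparison like this would not catch a rename such as "bag_0" as anything other than a dropped name.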