[ 
https://issues.apache.org/jira/browse/DRILL-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Guzenko updated DRILL-7268:
--------------------------------
    Description: 
When Hive stores array data in Parquet format, it writes a nested group schema 
for each array column. For example, the column 
 arr_n_0 ARRAY<INT>
is stored as:
{code:java}
optional group arr_n_0 (LIST) {
  repeated group bag {
    optional int32 array_element;
  }
}
{code}
A sample result before the changes was:
{code:java}
{"bag":[{"array_element":1},{"array_element":2}]}
{code}
After the changes, Drill returns only the array element values, without the 
wrapper keys "bag" and "array_element": 
{code}
[1,2]
{code}
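
To make the shape change concrete, here is a minimal Python sketch that collapses the old wrapped result into the new flat one. It only illustrates the before/after structure on the JSON output; it is not how Drill's native Parquet reader is implemented, and the function name and its parameters are hypothetical.

```python
import json


def unwrap_hive_list(value, wrapper_key="bag", element_key="array_element"):
    """Collapse the Hive LIST wrapper structure into a plain list.

    Hypothetical helper for illustration only: the key names default to the
    ones shown in the schema above.
    """
    if value is None:
        # The outer group is optional, so the whole array may be null.
        return None
    # Each entry of the repeated group holds one (optional) element.
    return [entry.get(element_key) for entry in value[wrapper_key]]


before = json.loads('{"bag":[{"array_element":1},{"array_element":2}]}')
after = unwrap_hive_list(before)
print(json.dumps(after, separators=(",", ":")))  # prints [1,2]
```

A null element inside the array is kept as a null entry, since array_element is declared optional in the schema.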

 

Please read the design doc linked to the parent task for more details. 

  was:
When Hive stores array data in parquet format, it creates schema for such 
columns, like: 
arr_n_0 ARRAY<INT>

{code}
 optional group arr_n_0 (LIST) {
 repeated group bag {
 optional int32 array_element;
 }
 }
{code}

Sample result before the changes was:

{code}\{"bag":[{"array_element":1},\{"array_element":2}]} \{code}

After the changes Drill reads only array elements data without additional keys 
like "bag" or "array_element":

{code} [1,2] \{code} . 

 

Please read Design Doc linked to parent task for more details. 


> Read Hive array with parquet native reader
> ------------------------------------------
>
>                 Key: DRILL-7268
>                 URL: https://issues.apache.org/jira/browse/DRILL-7268
>             Project: Apache Drill
>          Issue Type: Sub-task
>            Reporter: Igor Guzenko
>            Assignee: Igor Guzenko
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.17.0
>
>
> When Hive stores array data in Parquet format, it writes a nested group 
> schema for each array column. For example, the column 
>  arr_n_0 ARRAY<INT>
> is stored as:
> {code:java}
> optional group arr_n_0 (LIST) {
>   repeated group bag {
>     optional int32 array_element;
>   }
> }
> {code}
> A sample result before the changes was:
> {code:java}
> {"bag":[{"array_element":1},{"array_element":2}]}
> {code}
> After the changes, Drill returns only the array element values, without the 
> wrapper keys "bag" and "array_element": 
> {code}
> [1,2]
> {code}
>  
> Please read the design doc linked to the parent task for more details. 



