[ 
https://issues.apache.org/jira/browse/DRILL-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014615#comment-17014615
 ] 

ASF GitHub Bot commented on DRILL-7509:
---------------------------------------

KazydubB commented on pull request #1954: DRILL-7509: Incorrect TupleSchema is 
created for DICT column when querying Parquet files
URL: https://github.com/apache/drill/pull/1954
 
 
   # [DRILL-7509](https://issues.apache.org/jira/browse/DRILL-7509): Incorrect 
TupleSchema is created for DICT column when querying Parquet files
   
   ## Description
   
   Removed the nested `MapColumnMetadata` from `DictColumnMetadata`'s `schema` 
when querying a `MAP` type from Parquet files (this nested column comes from 
the Parquet MAP structure). As a result, `DictColumnMetadata#schema` now 
contains the `key` and `value` fields directly.
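
The flattening described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `Col` and `DictSchemaSketch` are hypothetical stand-ins for Drill's column metadata classes, and the rule used to recognise the inner group (a single child holding exactly `key` and `value`) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a column in a schema tree; not Drill's ColumnMetadata.
final class Col {
  final String name;
  final List<Col> children;

  Col(String name, Col... children) {
    this.name = name;
    this.children = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(children)));
  }
}

final class DictSchemaSketch {

  // If the DICT column has exactly one child group holding `key` and `value`
  // (the inner group from Parquet's MAP layout), lift those two fields so
  // that they become direct children of the DICT column.
  static Col flatten(Col dict) {
    if (dict.children.size() == 1) {
      Col inner = dict.children.get(0);
      if (inner.children.size() == 2
          && "key".equals(inner.children.get(0).name)
          && "value".equals(inner.children.get(1).name)) {
        return new Col(dict.name, inner.children.toArray(new Col[0]));
      }
    }
    return dict; // already flat, or not shaped like a Parquet MAP
  }

  public static void main(String[] args) {
    // Shape as read from Parquet: my_dict -> map -> (key, value)
    Col fromParquet = new Col("my_dict", new Col("map", new Col("key"), new Col("value")));
    Col flat = flatten(fromParquet);
    for (Col child : flat.children) {
      System.out.println(flat.name + "." + child.name);
    }
  }
}
```

With the intermediate group removed, code that looks up `key` or `value` in the DICT's schema does not need to know the name of Parquet's inner group.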
   
   Added Parquet's `Type.Repetition` to the `Metadata_V4` cache file to retain 
the actual `DataMode` of leaf primitives. Previously, this mode was computed 
from the max `repetition` and max `definition` levels; see the javadoc for 
`ParquetTableMetadataUtils#getDataMode(Type.Repetition)`.
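
To illustrate why storing the repetition matters, here is a small sketch. The enums and `DataModeSketch` are hypothetical stand-ins for Parquet's `Type.Repetition` and Drill's `DataMode`; the one-to-one mapping and the level-based fallback are assumptions for the example and are not copied from `ParquetTableMetadataUtils`.

```java
// Hypothetical stand-ins for Parquet's Type.Repetition and Drill's DataMode.
enum Repetition { REQUIRED, OPTIONAL, REPEATED }

enum DataMode { REQUIRED, OPTIONAL, REPEATED }

final class DataModeSketch {

  // Direct mapping from the leaf's own repetition (assumed to be one-to-one).
  static DataMode fromRepetition(Repetition repetition) {
    switch (repetition) {
      case REQUIRED: return DataMode.REQUIRED;
      case OPTIONAL: return DataMode.OPTIONAL;
      case REPEATED: return DataMode.REPEATED;
      default: throw new IllegalArgumentException("Unknown repetition: " + repetition);
    }
  }

  // Level-based guess, assumed to resemble the earlier approach: the levels
  // accumulate over all ancestors, so they cannot always tell what the leaf
  // itself was declared as.
  static DataMode fromLevels(int maxRepetitionLevel, int maxDefinitionLevel) {
    if (maxRepetitionLevel > 0) {
      return DataMode.REPEATED;
    }
    return maxDefinitionLevel > 0 ? DataMode.OPTIONAL : DataMode.REQUIRED;
  }

  public static void main(String[] args) {
    // A required leaf inside one optional group has max repetition level 0
    // and max definition level 1, so the level-based guess reports OPTIONAL.
    System.out.println("from levels:     " + fromLevels(0, 1));
    System.out.println("from repetition: " + fromRepetition(Repetition.REQUIRED));
  }
}
```

The two calls disagree for this leaf, which is the kind of mismatch that keeping the declared repetition in the metadata cache avoids.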
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Incorrect TupleSchema is created for DICT column when querying Parquet files
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-7509
>                 URL: https://issues.apache.org/jira/browse/DRILL-7509
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.16.0
>            Reporter: Bohdan Kazydub
>            Assignee: Bohdan Kazydub
>            Priority: Major
>             Fix For: 1.18.0
>
>
> When a {{DICT}} column is queried from a Parquet file, its {{TupleSchema}} 
> contains a nested element, e.g. `map`, which itself contains the `key` and 
> `value` fields, rather than the {{DICT}}'s {{TupleSchema}} containing the 
> `key` and `value` fields directly. The nested element, `map`, comes from the 
> inner structure of Parquet's {{MAP}} representation (which corresponds to 
> Drill's {{DICT}}).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
