[ 
https://issues.apache.org/jira/browse/DRILL-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3209:
-----------------------------------
    Description: All reads against Hive are currently done through the Hive 
SerDe interface. While this provides the most flexibility, the API is not 
optimized for maximum performance when reading data into Drill's native 
data structures. For Parquet and text file backed tables, we can plan these 
reads as Drill native reads. Currently, native reads of these file types 
provide untyped data. Although Parquet has metadata in the file, we do not 
currently make use of the type information while planning. For text files, 
we read each record as a list of varchars. In both cases, casts will need to 
be injected to provide the same data types as reads through the SerDe 
interface.

> [Umbrella] Plan reads of Hive tables as native Drill reads when a native 
> reader for the underlying table format exists
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3209
>                 URL: https://issues.apache.org/jira/browse/DRILL-3209
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization, Storage - Hive
>            Reporter: Jason Altekruse
>            Assignee: Jason Altekruse
>
> All reads against Hive are currently done through the Hive SerDe interface. 
> While this provides the most flexibility, the API is not optimized for 
> maximum performance when reading data into Drill's native data structures. 
> For Parquet and text file backed tables, we can plan these reads as Drill 
> native reads. Currently, native reads of these file types provide untyped 
> data. Although Parquet has metadata in the file, we do not currently make 
> use of the type information while planning. For text files, we read each 
> record as a list of varchars. In both cases, casts will need to be injected 
> to provide the same data types as reads through the SerDe interface.
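
As a rough illustration of the cast injection described above (a sketch only, not the planner change proposed in this issue), the snippet below builds a SELECT list that casts the positional varchar fields of a text read to the types declared in the Hive table. The function name, the type mapping, and the sample columns are hypothetical; the only Drill behavior assumed is that text reads expose each record as the `columns` array of varchars.

```python
# Illustrative mapping from Hive type names to SQL cast targets.
# This is an assumption for the sketch and is not exhaustive.
HIVE_TO_DRILL = {
    "int": "INT",
    "bigint": "BIGINT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
    "string": "VARCHAR",
}


def text_cast_projection(hive_columns):
    """Build a SELECT list casting positional text fields to Hive-declared types.

    hive_columns is a list of (column_name, hive_type) pairs in table order.
    """
    items = []
    for index, (name, hive_type) in enumerate(hive_columns):
        drill_type = HIVE_TO_DRILL[hive_type.lower()]
        # Each text field arrives as a varchar at columns[index].
        items.append(f"CAST(columns[{index}] AS {drill_type}) AS `{name}`")
    return ", ".join(items)


if __name__ == "__main__":
    # Hypothetical Hive table with two columns.
    print(text_cast_projection([("id", "int"), ("name", "string")]))
```

A real implementation would do this inside the planner using the Hive metastore schema rather than by generating SQL text. The string form is used here only to make the injected casts visible.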



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
