[ https://issues.apache.org/jira/browse/DRILL-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacques Nadeau updated DRILL-1858:
----------------------------------
Fix Version/s: Future
> Parquet reader should only explicitly fill in data for a column requested but
> not in the file if there are no valid columns found
> -----------------------------------------------------------------------------
>
> Key: DRILL-1858
> URL: https://issues.apache.org/jira/browse/DRILL-1858
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Jason Altekruse
> Fix For: Future
>
>
> If a query requests columns that do not appear in a particular Parquet
> file (users may have a directory full of files that share some columns but
> not others), then in most cases we do not need to create a vector to
> represent these columns. They can be materialized (as a vector filled with
> nulls) later, when they are referenced in other parts of the query, such as
> a filter or join condition. The current behavior of the reader is to always
> fill vectors for these columns, which just creates extra payload to ship
> around until the vectors are actually referenced.
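
The proposed behavior could be sketched roughly as follows. This is an illustrative model in Python, not Drill's reader code: `Batch`, `read_batch`, and the dict-of-lists file representation are all hypothetical names invented for the example. The choice to fill only the first requested column when no valid column is found is also an assumption, based on the issue title.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Batch:
    """Hypothetical record batch: column name -> list of values."""
    row_count: int
    vectors: Dict[str, List[Optional[int]]] = field(default_factory=dict)

    def column(self, name: str) -> List[Optional[int]]:
        """Return the vector for `name`, creating an all-null vector
        the first time a column missing from the file is referenced."""
        if name not in self.vectors:
            self.vectors[name] = [None] * self.row_count
        return self.vectors[name]


def read_batch(file_columns: Dict[str, List[int]],
               requested: List[str],
               row_count: int) -> Batch:
    """Read only the requested columns that actually exist in the file."""
    present = [c for c in requested if c in file_columns]
    batch = Batch(row_count,
                  {c: list(file_columns[c][:row_count]) for c in present})
    # Assumption: if no requested column exists in the file, fill one
    # null vector explicitly so the batch still carries a column.
    if not present and requested:
        batch.column(requested[0])
    return batch
```

With a file holding columns `a` and `b`, requesting `a` and `c` would produce a batch carrying only `a`. The null vector for `c` is created only when a downstream operator (a filter or join in the issue's terms) calls `batch.column("c")`.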