[
https://issues.apache.org/jira/browse/DRILL-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402147#comment-16402147
]
Arina Ielchiieva edited comment on DRILL-6259 at 3/16/18 4:32 PM:
------------------------------------------------------------------
Support is for complex types, but not for scalar complex types.
Example of supported types in a parquet schema:
{noformat}
message complex_users {
  required group user {
    required int32 id;
    optional int32 age;
    repeated int32 hobby_ids;
    optional boolean active;
  }
}
{noformat}
This is a simple example; groups can be nested as well.
> Support parquet filter push down for complex types
> --------------------------------------------------
>
> Key: DRILL-6259
> URL: https://issues.apache.org/jira/browse/DRILL-6259
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.13.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Priority: Major
> Fix For: 1.14.0
>
>
> Currently parquet filter push down does not work for complex types
> (including arrays).
> This Jira aims to implement filter push down for complex types whose
> underlying type is among the simple types already supported for filter push
> down. For instance, Drill currently does not support filter push down for
> varchars, decimals, etc. Once Drill starts supporting them, that support
> will apply to complex types automatically.
> Complex fields will be pushed down the same way regular fields are, except
> for one case with arrays.
> A query with the predicate {{where users.hobbies_ids[2] is null}} cannot be
> pushed down because we are not able to determine the exact number of nulls in
> array fields.
> Consider {{[1, 2, 3]}} vs {{[1, 2]}} when these arrays are in different files.
> Statistics for the second file won't show any nulls, but when querying both
> files, the third value of the second array is, in terms of data, null.
>
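The array case from the description can be illustrated with a minimal Python sketch. This is not Drill's implementation: {{FileStats}}, {{collect_stats}}, {{element_at}} and {{naive_can_prune_is_null}} are hypothetical names, and the pruning rule is an assumed simplification. It only shows why a null count of zero is not enough to skip a file for an {{is null}} predicate on an array index.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileStats:
    """Hypothetical per-file statistics for one array column."""
    null_count: int  # nulls physically stored as array elements


def element_at(array: List[Optional[int]], index: int) -> Optional[int]:
    """Out-of-range access yields null (None), as the issue describes."""
    return array[index] if index < len(array) else None


def collect_stats(rows: List[List[Optional[int]]]) -> FileStats:
    """Count only the nulls that are actually stored in the arrays."""
    return FileStats(null_count=sum(v is None for row in rows for v in row))


def naive_can_prune_is_null(stats: FileStats) -> bool:
    """Unsafe rule (assumed for illustration): skip the file for
    `col[i] is null` whenever the statistics record no nulls."""
    return stats.null_count == 0


# Two files, one row each, mirroring [1, 2, 3] vs [1, 2] from the issue.
file_a = [[1, 2, 3]]
file_b = [[1, 2]]

stats_b = collect_stats(file_b)
pruned_b = naive_can_prune_is_null(stats_b)

# Rows that really satisfy `hobbies_ids[2] is null` in each file.
matches_a = [row for row in file_a if element_at(row, 2) is None]
matches_b = [row for row in file_b if element_at(row, 2) is None]
```

Under these assumptions the naive rule would skip the second file even though its only row matches the predicate, which is why this one case is excluded from push down.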