[jira] [Assigned] (SPARK-31064) New Parquet Predicate Filter APIs with multi-part Identifier Support

DB Tsai (Jira) Fri, 06 Mar 2020 19:09:13 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-31064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


DB Tsai reassigned SPARK-31064:
-------------------------------

    Assignee: DB Tsai

> New Parquet Predicate Filter APIs with multi-part Identifier Support
> --------------------------------------------------------------------
>
>                 Key: SPARK-31064
>                 URL: https://issues.apache.org/jira/browse/SPARK-31064
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.4.5
>            Reporter: DB Tsai
>            Assignee: DB Tsai
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Parquet's *org.apache.parquet.filter2.predicate.FilterApi* uses *dots* as 
> separators to split a column name into the parts of a nested field path. The 
> drawback is that this breaks when a field name itself contains a *dot*.
> The new APIs will take an array of strings directly for the parts of a 
> nested field path, so there is no ambiguity from using *dot* as a separator.
> The intent is to move this code back to the Parquet community. See [PARQUET-1809]
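
A minimal sketch of the ambiguity described above, in plain Java with no Parquet dependency. The class `ColumnPathSketch` and the methods `splitOnDots` and `fromParts` are hypothetical names used only for illustration; they are not the Parquet or Spark APIs, and the real signatures may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not the actual Parquet/Spark API.
public class ColumnPathSketch {

    // Dot-separated style: the path is recovered by splitting on '.'.
    static List<String> splitOnDots(String dotted) {
        return Arrays.asList(dotted.split("\\."));
    }

    // Multi-part style: the caller passes each path part explicitly,
    // so a dot inside a part is just an ordinary character.
    static List<String> fromParts(String... parts) {
        return List.of(parts);
    }

    public static void main(String[] args) {
        // A top-level column literally named "user.name" and a nested
        // field "name" inside "user" collapse to the same path here.
        System.out.println(splitOnDots("user.name"));       // [user, name]

        // With explicit parts the two cases stay distinct.
        System.out.println(fromParts("user.name"));         // [user.name]
        System.out.println(fromParts("user", "name"));      // [user, name]
    }
}
```

With the dot-separated form there is no way to refer to the single column named `user.name`; with explicit parts the one-element and two-element paths are different values.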



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

