[
https://issues.apache.org/jira/browse/SPARK-24130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645565#comment-16645565
]
Xiao Li commented on SPARK-24130:
---------------------------------
Any data source migration work is being blocked by
https://github.com/apache/spark/pull/22547
> Data Source V2: Join Push Down
> ------------------------------
>
> Key: SPARK-24130
> URL: https://issues.apache.org/jira/browse/SPARK-24130
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Jia Li
> Priority: Major
> Attachments: Data Source V2 Join Push Down.pdf
>
>
> Spark applications often directly query external data sources such as
> relational databases or files. Spark provides Data Source APIs for
> accessing structured data through Spark SQL. Both the V1 and V2 Data Source
> APIs support optimizations such as filter push down and column pruning, which
> are a subset of the functionality that can be pushed down to some data sources.
> We’re proposing to extend the Data Source APIs with join push down (JPD). Join
> push down can significantly improve query performance by reducing the amount
> of data transferred and by exploiting capabilities of the data sources, such
> as index access.
> The join push down design document is available
> [here|https://docs.google.com/document/d/1k-kRadTcUbxVfUQwqBbIXs_yPZMxh18-e-cz77O_TaE/edit?usp=sharing].