[ https://issues.apache.org/jira/browse/ARROW-14293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427853#comment-17427853 ]
Weston Pace commented on ARROW-14293:
-------------------------------------
Wrapping the output of an ExecPlan in a Python iterator is already done in the
scanner. The tricky part is constructing an ExecPlan from the user's wishes.
I'm a little hesitant to turn the datasets API into a full relational frontend.
At some point, as queries get more complicated, would it make more sense to
expose the API via SQL -> IR -> ExecPlan, bypassing datasets?
Also, wrong David :) [~lidavidm]
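
To make the "iterator over the plan's output" idea concrete, here is a minimal pure-Python sketch of a join whose result is consumed as a stream of batches rather than as a materialized object. This is not PyArrow's API or how the ExecPlan is implemented: {{streaming_inner_join}}, the dict-based rows and the list-based batches are all hypothetical stand-ins chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]
Batch = List[Row]


def streaming_inner_join(
    left_batches: Iterable[Batch], right_rows: Iterable[Row], key: str
) -> Iterator[Batch]:
    """Hypothetical helper (not a PyArrow API): inner hash join yielding batches."""
    # Build side: materialize the right input into a hash table keyed on `key`.
    build: Dict[object, List[Row]] = {}
    for row in right_rows:
        build.setdefault(row[key], []).append(row)

    # Probe side: stream the left input one batch at a time.
    for batch in left_batches:
        out: Batch = []
        for row in batch:
            for match in build.get(row[key], []):
                # Merge the matching rows; left-side values win on name clashes.
                out.append({**match, **row})
        if out:
            yield out


# Assumed toy inputs: two left batches and a small right side.
left = [[{"id": 1, "x": "a"}, {"id": 2, "x": "b"}], [{"id": 3, "x": "c"}]]
right = [{"id": 1, "y": 10}, {"id": 3, "y": 30}]
result = list(streaming_inner_join(left, right, "id"))
```

In this sketch the caller gets an iterator it can drain batch by batch, so the joined result never has to correspond to anything stored on disk.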
> [Python] Basic Join functionality in PyArrow
> --------------------------------------------
>
> Key: ARROW-14293
> URL: https://issues.apache.org/jira/browse/ARROW-14293
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Alessandro Molina
> Priority: Major
> Fix For: 7.0.0
>
>
> We want to expose {{Table.join}} and {{Dataset.join}} in PyArrow, built on
> the join support already available in the ExecPlan.
> {{Table.join}} can simply return a new {{Table}}. What {{Dataset.join}}
> should return is less clear: returning a new {{Dataset}} probably doesn't
> make much sense, given that the result won't map to any files on disk.