[jira] [Commented] (FLINK-16485) Support vectorized Python UDF in the batch mode of old planner

Dian Fu (Jira) Mon, 09 Mar 2020 02:08:11 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054774#comment-17054774
 ]


Dian Fu commented on FLINK-16485:
---------------------------------

[~ykt836]  [~hongfanxo] Thanks for sharing your thoughts about this feature.

The aim of this JIRA is to add support for vectorized Python UDFs in the 
batch mode of the legacy planner; it will not touch the DataSet API. I agree 
with [~hongfanxo] that the legacy planner still has value for now: a Table 
can be converted from/to a DataSet, and a few features are still only 
available in the DataSet API, e.g. iterate, which is widely used by ML users. 
Vectorized Python UDFs may also be used by ML users, so I think it is 
worthwhile to support them in the batch mode of the legacy planner.

Replacing the legacy planner with the blink planner is still in progress and 
there is still a lot of work to do, e.g. improving the DataStream API by 
adding the missing features which are only available in the DataSet API. It is 
therefore important to consider the user experience and requirements of the 
existing users of the DataSet API/legacy planner during the transition. 
Although this may require some effort, I think it's worth it, so it makes 
sense to add the support described in this JIRA.

> Support vectorized Python UDF in the batch mode of old planner
> --------------------------------------------------------------
>
>                 Key: FLINK-16485
>                 URL: https://issues.apache.org/jira/browse/FLINK-16485
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / Python
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>            Priority: Major
>             Fix For: 1.11.0
>
>
> Currently, vectorized Python UDFs are only supported in the batch/stream 
> modes of the blink planner and the stream mode of the old planner. The aim 
> of this Jira is to add support in the batch mode of the old planner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
