[ https://issues.apache.org/jira/browse/FLINK-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054774#comment-17054774 ]
Dian Fu commented on FLINK-16485:
---------------------------------

[~ykt836] [~hongfanxo] Thanks for sharing your thoughts on this feature.

The aim of this JIRA is to add support for vectorized Python UDFs in the batch mode of the legacy planner; it will not touch the DataSet API.

I agree with [~hongfanxo] that the legacy planner still has value for now: a Table can be converted to and from a DataSet, and a few features are still only available in the DataSet API, e.g. iterate, which is widely used by ML users. Vectorized Python UDFs are also a feature that may be used by ML users, so I think it is worthwhile to support them in the batch mode of the legacy planner.

Replacing the legacy planner with the blink planner is still in progress and there is still a lot of work to do, e.g. improving the DataStream API by adding the features that are currently only available in the DataSet API. It is therefore important to consider the user experience and requirements of the existing users of the DataSet API/legacy planner during the transition. Although this may require some effort, I think it's worth it, so it makes sense to go ahead with this JIRA.

> Support vectorized Python UDF in the batch mode of old planner
> --------------------------------------------------------------
>
>                 Key: FLINK-16485
>                 URL: https://issues.apache.org/jira/browse/FLINK-16485
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / Python
>            Reporter: Dian Fu
>            Assignee: Dian Fu
>            Priority: Major
>             Fix For: 1.11.0
>
>
> Currently, vectorized Python UDF is only supported in the batch/stream mode
> for the blink planner and stream mode for the old planner. The aim of this
> Jira is to add support in the batch mode for the old planner.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)