[
https://issues.apache.org/jira/browse/FLINK-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071640#comment-17071640
]
Wei Zhong commented on FLINK-16666:
-----------------------------------
Hi [~aljoscha],
Yes, we don't really support executing Python UDFs in `flink-core`,
`flink-java` and `flink-streaming-java`. The code we added there is only used
to define and process the Python configurations.
First, many modules in our code base support Python, e.g. `SQL DDL`,
`flink-table-planner`, `flink-table-planner-blink`, `flink-client`,
`flink-sql-client`, etc. For simplicity, let's call the modules that support
Python "python-related modules". The number of python-related modules will
increase in the future, e.g. `flink-streaming-java` (for the PyFlink DataStream
API) and `flink-container` (for k8s support).
Every python-related module needs Python dependency management. To unify the
interface of Python dependency management and decouple the python-related
modules from it, we intend to store the Python dependency information as
configuration. These configurations will be stored in the `Configuration`
object of the `ExecutionEnvironment`/`StreamExecutionEnvironment`. Once
execution reaches the code of the `flink-python` module, these configurations
are used to build the Python environment.
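To illustrate the decoupling described above, here is a minimal sketch in
plain Java. It uses a `Map<String, String>` as a stand-in for Flink's
`Configuration`, and the class name, method names and the key strings are
assumptions made up for this example, not the actual Flink code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for Flink's Configuration: a plain string-to-string map.
// Class, method and key names are illustrative assumptions.
final class PythonDependencyConfigSketch {
    static final String PYTHON_FILES = "python.files";
    static final String PYTHON_REQUIREMENTS = "python.requirements";

    // A python-related module only writes configuration entries;
    // it does not need to know how the Python environment is built.
    static Map<String, String> describeDependencies(List<String> files, String requirements) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put(PYTHON_FILES, String.join(",", files));
        config.put(PYTHON_REQUIREMENTS, requirements);
        return config;
    }

    // The Python-side code later reads the same entries back
    // when it builds the Python environment.
    static List<String> readPythonFiles(Map<String, String> config) {
        String raw = config.getOrDefault(PYTHON_FILES, "");
        if (raw.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(raw.split(","));
    }

    public static void main(String[] args) {
        Map<String, String> config =
                describeDependencies(Arrays.asList("udf.py", "helpers.zip"), "requirements.txt");
        System.out.println(readPythonFiles(config));
    }
}
```

The point of the sketch is that the writer and the reader share only the
option keys, so a new python-related module does not need to depend on the
code that builds the Python environment.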
Because every python-related module needs to read the definition of the Python
ConfigOptions, we put that definition (i.e. the `PythonOptions` class) in
`flink-core`, just like other config options. The python-related modules also
need to process these configurations (i.e. register files to the distributed
cache). For code reuse, we process them in the `configure()` method of
`ExecutionEnvironment`/`StreamExecutionEnvironment`. We could also do this by
repeating the logic in each python-related module, or by putting the logic in
`flink-python` and calling it via reflection when needed, but neither option
seems very clean.
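As a rough sketch of the code-reuse argument, the following plain-Java example
processes the Python options in one shared `configure()` step. The
`EnvironmentSketch` class, the `python_file_N` cache names and the key string
are hypothetical stand-ins for illustration; the real
`ExecutionEnvironment`/`StreamExecutionEnvironment` and distributed cache APIs
are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an execution environment; not Flink's real class.
final class EnvironmentSketch {
    // Assumed option key, used here only for illustration.
    static final String PYTHON_FILES = "python.files";

    // Stand-in for the distributed cache: registered name -> file path.
    private final Map<String, String> cachedFiles = new LinkedHashMap<>();

    // Single shared place that turns Python options into cache registrations,
    // so individual python-related modules do not repeat this logic.
    void configure(Map<String, String> config) {
        String raw = config.getOrDefault(PYTHON_FILES, "");
        if (raw.isEmpty()) {
            return; // no Python files configured, nothing to register
        }
        int index = 0;
        for (String path : raw.split(",")) {
            cachedFiles.put("python_file_" + index, path.trim());
            index++;
        }
    }

    Map<String, String> getCachedFiles() {
        return cachedFiles;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put(PYTHON_FILES, "udf.py, helpers.zip");

        EnvironmentSketch env = new EnvironmentSketch();
        env.configure(config);
        System.out.println(env.getCachedFiles());
    }
}
```

With this shape, any module that hands a configuration to the environment gets
the file registration for free, which is the alternative to duplicating the
logic per module or calling into `flink-python` via reflection.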
> Support new Python dependency configuration options in flink-java,
> flink-streaming-java and flink-table
> -------------------------------------------------------------------------------------------------------
>
> Key: FLINK-16666
> URL: https://issues.apache.org/jira/browse/FLINK-16666
> Project: Flink
> Issue Type: Sub-task
> Components: API / Python
> Reporter: Wei Zhong
> Assignee: Wei Zhong
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>