[ 
https://issues.apache.org/jira/browse/FLINK-18738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175189#comment-17175189
 ] 

Xintong Song commented on FLINK-18738:
--------------------------------------

Hi all,

[~dianfu] and I had an offline discussion regarding the process model and 
resource management for python UDFs. Here are the outcomes and some open 
questions. We would like to collect some feedback on the general direction 
before diving into the design details.

h3. Process Model

Before discussing memory management, it would be better to first reach 
consensus on the long-term process model for python UDFs. There are several 
options from our offline discussion and the previous discussions in FLINK-17923.
# *One python process per python operator.* This is the current approach. The 
operator is responsible for launching and terminating the python processes.
# *One python process per slot.* TaskManager is responsible for launching the 
python processes. A python process will be launched when the slot is created 
(allocated), and terminated when the slot is destroyed (freed).
# *One python process per TaskManager.* The deployment framework is responsible 
for launching the python processes. Then the python operators (in the java 
process) deploy the workload to the python processes.

Among the 3 options above, *Dian and I are in favor of option 2).*

*Problems for option 1)*
Low efficiency. In case of multiple python operators in one slot, launching one 
python process per operator introduces significant overhead (framework, python 
VM, inter-process communication). In scenarios where the operators themselves 
do not consume many resources, the problem becomes more severe, because the 
overhead accounts for a larger proportion of the overall resource consumption.

*Problems for option 3)*
Dependency conflict. Python operators from different jobs might be deployed 
into the same TaskManager. These operators may need to load different 
dependencies. If they are executed in the same python process, there could be 
dependency conflicts.

_Open questions_
* According to Dian's input, python does not provide a mechanism for dependency 
isolation (like class loaders in java). We need to double check on this.
* How do we handle potential conflicts between the framework and user code 
dependencies?
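To make the isolation concern concrete, here is a small sketch (using a hypothetical module name `somelib`) of how CPython caches imports process-wide in `sys.modules`, so UDFs from different jobs sharing one interpreter would also share one version of each dependency:

```python
import sys
import types

# CPython caches every imported module process-wide in sys.modules, so two
# UDFs in the same interpreter cannot load different versions of one package.
# Simulate "job A" having already loaded somelib v1 (hypothetical name):
fake_v1 = types.ModuleType("somelib")
fake_v1.__version__ = "1.0"
sys.modules["somelib"] = fake_v1

# "Job B" importing somelib (even if it expects v2) gets the cached v1:
import somelib

assert somelib.__version__ == "1.0"
assert somelib is fake_v1  # one shared module object per process
```

This is the behavior behind option 3)'s cross-job dependency conflicts, and why per-slot processes (option 2) sidestep the issue without needing java-style class-loader isolation.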

*Benefits for options 2)*
* Operators in the same slot would be able to share the python process. This 
should help reduce the overheads.
* A slot cannot be shared by multiple jobs, thus no need to worry about 
cross-job dependency conflicts.
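A minimal sketch of option 2)'s lifecycle, with hypothetical class and method names (the real integration would live in the TaskManager's slot management): one worker process is launched when the slot is allocated, shared by all python operators in the slot, and terminated when the slot is freed:

```python
import subprocess
import sys

class Slot:
    """Hypothetical per-slot worker lifecycle (process model option 2)."""

    def __init__(self, slot_id):
        self.slot_id = slot_id
        # Launch the shared python worker when the slot is allocated.
        # A dummy process that idles until its stdin closes stands in for
        # a real UDF worker here.
        self.worker = subprocess.Popen(
            [sys.executable, "-c", "import sys; sys.stdin.read()"],
            stdin=subprocess.PIPE,
        )

    def deploy_operator(self, name):
        # Every python operator in this slot reuses self.worker instead of
        # spawning its own process (avoiding option 1's per-operator overhead).
        return (name, self.worker.pid)

    def free(self):
        # Terminate the worker when the slot is freed.
        self.worker.stdin.close()
        self.worker.wait()

slot = Slot(slot_id=0)
op_a = slot.deploy_operator("python_map")
op_b = slot.deploy_operator("python_flat_map")
assert op_a[1] == op_b[1]  # both operators share one worker process
slot.free()
```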

h3. Memory Management

The discussion here is based on the assumption that we choose option 2) for the 
process model, which is still open for discussion.

Since python processes are dynamically launched and terminated as slots are 
created and destroyed, we would need the TaskManager, rather than the 
deployment framework, to manage the resources of the python processes. Two 
potential approaches were discussed.

# *Make python processes use managed memory.* We would need a proper way to 
share managed memory between python processes and rocksdb state backend in 
streaming scenarios.
# *Introduce a new `python memory` component to the TaskManager memory model 
for python processes.* The new python memory would add to the overall 
pod/container memory, either aside from or as a part of the TaskManager's total 
process memory.
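Worked arithmetic for option 2), with illustrative numbers only (none of these are actual Flink configuration values): an explicitly sized python memory component either grows the container request or is carved out of the existing TaskManager budget:

```python
MiB = 1024 * 1024

tm_total_process_memory = 1728 * MiB  # existing TaskManager process budget
python_memory = 512 * MiB             # new, explicitly sized python component

# Variant (a): python memory sits aside from the TM process memory, so the
# pod/container request must cover both.
container_aside = tm_total_process_memory + python_memory

# Variant (b): python memory is a part of the TM process memory, so the
# container request is unchanged but less is left for the java side.
container_part = tm_total_process_memory
left_for_java = tm_total_process_memory - python_memory

assert container_aside == 2240 * MiB
assert left_for_java == 1216 * MiB
```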

*Dian and I prefer option 2),* for the following reasons.
* For option 1), it would be complicated to decide how to share managed memory 
between python and rocksdb. E.g., if a user wants to give more memory to 
rocksdb while not changing the memory for python, they would need to not only 
increase the managed memory size, but also carefully tune how the managed 
memory is shared (e.g., a fraction).
* According to Dian's input, it is preferable to configure an absolute size of 
memory for python UDFs, rather than a fraction of the total memory. Managed 
memory consumers (batch operators and rocksdb) share a common characteristic: 
they can, to some extent, adapt to the given memory. The more memory, the 
better the performance. On the other hand, the resource requirements of python 
UDFs are less flexible. The process fails if it needs more memory than the 
specified limit, and does not benefit from a larger-than-needed limit.
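The tuning pain of option 1) can be made concrete with illustrative numbers (exact rationals keep the arithmetic clean): keeping python's share constant while growing rocksdb's requires moving both the size and the fraction together:

```python
from fractions import Fraction

managed_size = 1000               # MiB of managed memory (illustrative)
python_fraction = Fraction(2, 5)  # share handed to python workers

python_mem = managed_size * python_fraction  # 400 MiB
rocksdb_mem = managed_size - python_mem      # 600 MiB

# Goal: 200 MiB more for rocksdb while python stays at 400 MiB.
# The user cannot just grow managed_size; the fraction must be re-derived
# at the same time, or python silently gets more memory too.
new_managed_size = managed_size + 200                # 1200 MiB
new_python_fraction = python_mem / new_managed_size  # drops to 1/3

assert new_managed_size * new_python_fraction == python_mem
assert new_managed_size * (1 - new_python_fraction) == rocksdb_mem + 200
```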

h3. Developing Plan

Assuming we decide to go with the proposed approaches
* process model option 2), and
* memory management option 2)

It would be good to separate these changes into two separate efforts. Trying to 
accomplish both efforts in 1.12 seems aggressive, and we would like to avoid 
such rushing. Among the two efforts, the memory management change is more 
user-facing. If we decide to change the memory configurations for python UDFs, 
we had better do that early. Therefore, a feasible plan could be to try to 
finish the memory management effort in 1.12 and postpone the process model 
changes to the next release.

_Open question_
* We are still looking for a plan to make the proposed new memory management 
option 2) work with the current process model option 1).


> Revisit resource management model for python processes.
> -------------------------------------------------------
>
>                 Key: FLINK-18738
>                 URL: https://issues.apache.org/jira/browse/FLINK-18738
>             Project: Flink
>          Issue Type: Task
>          Components: API / Python, Runtime / Coordination
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>             Fix For: 1.12.0
>
>
> This ticket is for tracking the effort towards a proper long-term resource 
> management model for python processes.
> In FLINK-17923, we ran into problems because python processes are not well 
> integrated with the task manager's resource management mechanism. A temporary 
> workaround has been merged for release-1.11.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
