[ 
https://issues.apache.org/jira/browse/FLINK-18738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176149#comment-17176149
 ] 

Dian Fu commented on FLINK-18738:
---------------------------------

Thanks a lot for the discussion.

Regarding "One Python process for multiple operators in the same slot": 
there will be only one connection between the Java operators and the Python 
process. A unique id will be generated for each Java operator. The 
Python process will forward the input data to the corresponding Python function 
according to this id, and the execution results from the Python function will 
be forwarded back to the corresponding Java operator in the same way.
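
To make the routing concrete, here is a minimal sketch in Python. It is only 
an illustration, not the actual PyFlink or Beam code: the class name 
`FunctionDispatcher`, the string ids such as "op-1", and the in-memory list 
standing in for the single Java/Python connection are all assumptions.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class FunctionDispatcher:
    """Hypothetical router: one instance per Python process (i.e. per slot)."""

    def __init__(self) -> None:
        # Maps the unique id of a Java operator to its Python function.
        self._functions: Dict[str, Callable[[object], object]] = {}

    def register(self, operator_id: str, func: Callable[[object], object]) -> None:
        if operator_id in self._functions:
            raise ValueError(f"operator id already registered: {operator_id}")
        self._functions[operator_id] = func

    def process(
        self, tagged_inputs: Iterable[Tuple[str, object]]
    ) -> List[Tuple[str, object]]:
        """Apply the matching function to each (operator_id, value) pair.

        Each result keeps its operator id, so the Java side can hand it
        back to the operator that produced the input.
        """
        results = []
        for operator_id, value in tagged_inputs:
            func = self._functions[operator_id]
            results.append((operator_id, func(value)))
        return results


# Two operators in the same slot share one dispatcher (one Python process).
dispatcher = FunctionDispatcher()
dispatcher.register("op-1", lambda x: x + 1)
dispatcher.register("op-2", str.upper)

# Inputs from both operators arrive interleaved over the single connection.
outputs = dispatcher.process([("op-1", 1), ("op-2", "a"), ("op-1", 41)])
```

Since all functions run in the same process, the records of the operators 
are handled one after another in this sketch.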

Regarding the performance concern, I agree that sharing one Python process 
between multiple Java operators has this problem. However, I think it is 
expected and acceptable: in this case there is only one Python process per 
slot, so at most one CPU core can be used because of the GIL. It is slower 
because it uses fewer resources. Users can increase the parallelism if 
performance becomes a problem.

I'm also in favor of `some sort of ExternalExecutionBackend` interface if this 
is feasible. Currently, launching the Python process depends on Java 
libraries provided by the Beam portability framework. Introducing such an 
interface would avoid introducing Beam dependencies into the Flink runtime. 
Besides, I think it would also reduce complexity and could easily be extended 
to support other languages or even other use cases.

> Revisit resource management model for python processes.
> -------------------------------------------------------
>
>                 Key: FLINK-18738
>                 URL: https://issues.apache.org/jira/browse/FLINK-18738
>             Project: Flink
>          Issue Type: Task
>          Components: API / Python, Runtime / Coordination
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Major
>             Fix For: 1.12.0
>
>
> This ticket is for tracking the effort towards a proper long-term resource 
> management model for python processes.
> In FLINK-17923, we ran into problems because python processes are not well 
> integrated with the task manager resource management mechanism. A temporary 
> workaround has been merged for release-1.11.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
