[
https://issues.apache.org/jira/browse/FLINK-18738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175553#comment-17175553
]
Till Rohrmann commented on FLINK-18738:
---------------------------------------
Thanks for starting the discussion about this problem [~xintongsong]. I think
it would be nice to solve this problem not only for Python but also for other
languages we might want to support in the future (if possible and if it does
not broaden the scope too massively).
I agree that the processing model is probably the most important aspect of the
discussion since the configuration somewhat follows this decision.
The first question which comes to my mind is whether the {{TaskExecutor}}
should be responsible for managing the component running the Python (or any
other language) code or not. If there is no good reason for the
{{TaskExecutor}} to manage this component, it could make the {{TaskExecutor}}
simpler by not aggregating too many responsibilities. If, on the other hand,
the language execution service benefitted from a tight integration with the
execution of a job (e.g. by allocating and releasing resources swiftly), then a
tighter integration could make sense.
If the {{TaskExecutor}} is not responsible for managing the language execution
component, then it could be the responsibility of the {{ActiveResourceManager}}
or of the user to start the language execution service (e.g. in the case of a
standalone deployment).
Whether a language supports code isolation should not necessarily determine
which process model to choose. For example, a separate language execution
service for Python (option 3.) could spawn individual Python processes for
different jobs, slots or tasks to achieve isolation. Hence, I believe this is
more of an implementation detail of the language execution service than a
strict argument for any of the proposed processing models.
Concerning the proposed memory management: Does option 2. entail that one
enables the cluster to run with Python at deployment time of the
{{TaskExecutor}}? Asked differently, will we reserve a fixed amount of memory
for the Python process independent of whether we actually run a Python
workload or not?
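To make the question concrete, here is a sketch of what such a deployment-time
reservation could look like in flink-conf.yaml, assuming option 2 maps the
Python memory onto the existing task off-heap accounting (the key below is the
existing Flink option used by the FLINK-17923 workaround; the sizing is purely
illustrative):

```yaml
# Illustrative sketch only, not a proposed configuration.
# If option 2 reserves Python memory statically, the TaskExecutor would
# carve it out of its memory budget at deployment time, e.g. via the
# task off-heap pool (the mechanism used by the FLINK-17923 workaround):
taskmanager.memory.task.off-heap.size: 512m
# The 512m reservation would then be held for the lifetime of the
# TaskExecutor, whether or not a Python workload is ever scheduled on it.
```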
> Revisit resource management model for python processes.
> -------------------------------------------------------
>
> Key: FLINK-18738
> URL: https://issues.apache.org/jira/browse/FLINK-18738
> Project: Flink
> Issue Type: Task
> Components: API / Python, Runtime / Coordination
> Reporter: Xintong Song
> Assignee: Xintong Song
> Priority: Major
> Fix For: 1.12.0
>
>
> This ticket is for tracking the effort towards a proper long-term resource
> management model for python processes.
> In FLINK-17923, we ran into problems because Python processes are not well
> integrated with the task manager resource management mechanism. A temporary
> workaround has been merged for release-1.11.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)